Phases of Optimization – I (Pre-Optimization Phase)


To achieve the goal of optimal performance, teams need to approach Optimization in a structured manner. There are three distinct phases in the Optimization process. They are:

a)    Pre-Optimization Phase: This phase is the foundation of optimization and is characterized by activities such as data collection and analysis.

b)    Optimization Phase: This phase is the implementation phase and is characterized by coding activities, reviews and design-related changes.

c)    Post-Optimization Phase: This phase is the benchmarking phase where the optimization goals are re-assessed and compared against industry standards and other products/versions.

In this post, we take a detailed look at the Pre-Optimization phase.

The Pre-Optimization phase lays the foundation for the Optimization process: data is first collected to baseline the current performance of the system, and this data then provides the basis for analysis to re-design the system and to set optimization goals.

Data Collection

Some of the key aspects to be considered while collecting data for optimization are:

a)    What are the bottlenecks for the system?

b)    What are the time-consuming components of the system?

c)    What is the memory consumption of the system at various stages?

d)    Is the system CPU-centric, File-centric or Network-centric?

e)    What is the performance of an existing prototype?

Data collection can be done manually or through tools. Profilers such as GlowCode and VTune are excellent tools for collecting optimization data.
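
When a profiler is not available, or its numbers need to be cross-checked, basic per-component timings and visit counts can be gathered with a small amount of hand-rolled instrumentation. Below is a minimal C++ sketch; the timed function and labels are hypothetical and purely illustrative.

// Minimal manual instrumentation sketch (illustrative only, not tied to any
// particular profiler): accumulates total time and visit count per label,
// mirroring the kind of per-function figures a profiler report provides.
#include <chrono>
#include <iostream>
#include <map>
#include <string>
#include <utility>

struct Stats { double totalMs = 0.0; long visits = 0; };
static std::map<std::string, Stats> g_stats;

class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        auto end = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(end - start_).count();
        Stats& s = g_stats[label_];
        s.totalMs += ms;   // total time across all invocations
        s.visits  += 1;    // number of times this scope was entered
    }
private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

void loadVolume() {                 // hypothetical component under measurement
    ScopedTimer t("loadVolume");
    // ... work to be measured ...
}

int main() {
    for (int i = 0; i < 3; ++i) loadVolume();
    for (const auto& entry : g_stats) {
        const Stats& s = entry.second;
        std::cout << entry.first << ": time=" << s.totalMs << " ms, visits="
                  << s.visits << ", avg=" << s.totalMs / s.visits << " ms\n";
    }
}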

Fig 1: Sample GlowCode Report

Some of the key aspects of the GlowCode report above are:

a)    Critical Path: The blocks marked in Red represent a critical path in the system that can then be picked up for optimization.

b)    Time: The time represents the total time taken by the component/function (including multiple invocations of the function) during the execution of the system/application.

c)    Visits: This represents the total number of times a function has been visited or executed during the life cycle of the system/application.

d)    Avg. Time: This represents the average time taken by the function/component per visit or execution, i.e. Time divided by Visits.

e)    Memory Statistics: This represents the gross and average memory allocated and de-allocated by the system during execution.

Note: As profilers are primarily performance tools and not memory-leak detectors, the memory report they generate should be used strictly for a general assessment of memory usage during system execution and not for identifying memory leaks.

The rule of thumb with such data is to identify the critical path first, then the most time-consuming components/functions, then the most visited ones, and finally the most memory-intensive ones.
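
As an illustration of this rule of thumb, the triage can be expressed as a simple sort over parsed report rows. The rows and numbers below are invented for the example, not taken from the report above.

// Illustrative triage of profiler report rows following the rule of thumb:
// rank components by total time, then by visit count, then by memory allocated.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct ReportRow {
    std::string name;
    double timeMs;      // total time across all visits
    long   visits;      // number of invocations
    double memoryMB;    // gross memory allocated
};

int main() {
    std::vector<ReportRow> rows = {
        {"LoadVolume",   5200.0,   12, 850.0},
        {"Interpolate",  3100.0, 4800,  40.0},
        {"WriteReport",   150.0,    3,   5.0},
    };
    std::sort(rows.begin(), rows.end(), [](const ReportRow& a, const ReportRow& b) {
        if (a.timeMs != b.timeMs) return a.timeMs > b.timeMs;   // most time first
        if (a.visits != b.visits) return a.visits > b.visits;   // then most visited
        return a.memoryMB > b.memoryMB;                         // then most memory
    });
    for (const auto& r : rows)
        std::cout << r.name << "  time=" << r.timeMs << " ms  visits=" << r.visits
                  << "  memory=" << r.memoryMB << " MB\n";
}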

Manual data collection is generally done to identify parameters that are not easily evident from an automated tool report. For example, though the report will mention that some component is memory intensive, it might not be able to relate that with an image volume size. Such information needs to be collected manually or by dumping it from the code. Data Flow Diagrams and Data Flow Charts are examples of useful data collected manually.
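
A minimal sketch of such a manual dump is shown below. The Volume type, its dimensions and the CSV format are assumptions made purely for illustration; the point is to record the context (here, the volume dimensions) alongside the memory figure so the two can be correlated during analysis.

// Hypothetical manual dump relating memory use to image volume size --
// the kind of context a profiler report alone cannot provide.
#include <cstddef>
#include <fstream>
#include <vector>

struct Volume {
    int width, height, depth;
    std::vector<short> voxels;            // raw voxel data
};

std::size_t sizeOfVolumeBytes(const Volume& v) {
    return v.voxels.size() * sizeof(short);
}

void dumpVolumeFootprint(const Volume& v, const char* stage) {
    // Append one line per processing stage so memory can later be
    // correlated with volume dimensions during data analysis.
    std::ofstream log("volume_footprint.csv", std::ios::app);
    log << stage << ',' << v.width << 'x' << v.height << 'x' << v.depth
        << ',' << sizeOfVolumeBytes(v) << '\n';
}

int main() {
    Volume v{256, 256, 100, std::vector<short>(256 * 256 * 100)};
    dumpVolumeFootprint(v, "after-load");
}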

Fig 2: Sample Data Flow Chart

Data Analysis

The next step after data collection is Data Analysis. In this step, the inputs collected are used to define the optimization strategy for the system. Some basic optimization questions to consider during analysis are:

a)    Do we need to re-design the architecture?

  1. Is threading required?
  2. Is the architecture too bulky and complex?
  3. Are activities CPU-Centric, File-Centric or Network-Centric?

b)    Do we need to re-design individual components?

  1. Is this component required?
  2. Can this component be streamlined?
  3. Is there a more optimized flow for this component?

c)    Do we need to re-design the data flows?

  1. Is the data flow logically sequenced?
  2. Is the data held at particular junctions really required, or can it be released?
  3. Is pre-loading of data required?
  4. Is the same kind of data being loaded at multiple places? (a sketch addressing this follows the list below)

d)    Can we eliminate unnecessary code?

  1. Are there any unused code/functions?
  2. Are there redundant code/functions?
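
One common way to address the data-flow questions above, in particular pre-loading and the same data being loaded at multiple places, is to put a shared cache in front of the loader so each data set is loaded exactly once. The sketch below is illustrative only; the DataSet type, the keys and the loader are assumptions.

// Minimal sketch of a cached loader: the same data set is loaded only once,
// no matter how many components request it.
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <vector>

using DataSet = std::vector<double>;

class DataCache {
public:
    std::shared_ptr<const DataSet> get(const std::string& key) {
        auto it = cache_.find(key);
        if (it != cache_.end())
            return it->second;                 // reuse already-loaded data
        auto data = std::make_shared<DataSet>(loadFromDisk(key));
        cache_[key] = data;
        return data;
    }
private:
    static DataSet loadFromDisk(const std::string& key) {
        std::cout << "loading " << key << " once\n";   // expensive load happens once
        return DataSet(1024, 0.0);
    }
    std::map<std::string, std::shared_ptr<const DataSet>> cache_;
};

int main() {
    DataCache cache;
    auto a = cache.get("patient-42/volume");   // loads from disk
    auto b = cache.get("patient-42/volume");   // served from cache
    std::cout << (a == b ? "same instance reused\n" : "duplicate load\n");
}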

Artifacts from Pre-Optimization Phase

Some sample artifacts from Pre-Optimization Phase are:

1)    Tool reports like the GlowCode report

2)    Data Flow Diagrams

3)    Performance Reports on the prototype.

Once these two activities, i.e. Data Collection and Data Analysis, are completed, we are ready to move on to the next phase: the Optimization phase.

Morals of the Story

1) Data Collection is required for baselining the current system performance and identifying areas of optimization.

2) Data Analysis is the interpretation of this data to prepare the optimization strategy for the system.

Cheers!

Ram
