02 Intel VTune Session 02
Transcript of 02 Intel VTune Session 02
-
7/29/2019 02 Intel VTune Session 02
1/24
Installing Windows XP Professional Using Attended Installation
Slide 1 of 24Ver. 1.0
Code Optimization and Performance Tuning Using Intel VTune
Objectives
In this session, you will learn to:
Identify time and space complexity
Optimize programming constructs
Identify performance libraries for optimization
Examining Algorithms
An algorithm is a finite set of steps that accomplishes a
specific task.
The efficiency of an algorithm is measured in terms of the
processor time and the memory space that the algorithm
uses.
It is necessary to tune the algorithm to make optimal use of
available resources, such as processor time and memory.
To analyze algorithmic complexity, it is important to
understand the time and space aspects of complexity.
Identifying Time Complexity
The time taken to compute the steps involved in an
algorithm is called the time complexity of that algorithm.
The time taken to execute all the steps for a worst-case
scenario is the time complexity of the entire algorithm.
The execution of steps in an algorithm depends on the
conditions specified in it. As a result, the time complexity of
an algorithm is measured with reference to the conditions
involved in the algorithm.
Identifying Time Complexity (Contd.)
Computing the nth number of the Fibonacci series:
Step 1: start
Step 2: input the value of n
Step 3: if (n <= 1) then go to step 14
Step 4: x = 0
Step 5: y = 1
Step 6: write (x + " " + y)
Step 7: for (i = 0 to n-1)
Step 8: {f = y + x
Step 9: x = y
Step 10: y = f
Step 11: i = i + 1
Step 12: write (f) }
Step 13: go to step 15
Step 14: write (n)
Step 15: stop
Identifying Time Complexity (Contd.)
Based on the value of n, two cases can exist in the
preceding algorithm.
If the value of n is less than or equal to 1, the time complexity
is constant and does not depend on the input n, as
shown in the following table.

| Statement | Frequency of execution |
|---|---|
| Step 2: input the value of n | 1 |
| Step 3: if (n <= 1) then go to step 14 | 1 |
| Step 14: write (n) | 1 |
| Total number of instructions executed | 3 |
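Identifying Time Complexity (Contd.)
If the value of n is greater than 1, the time complexity is 4n-2,
which grows linearly with n.
This is shown in the following table. The table uses a condensed
form of the algorithm, in which the loop runs from i = 2 to n-1
and f is written once after the loop, so its step numbers differ
from those in the earlier listing.

| Statement | Frequency of execution |
|---|---|
| Step 2: input the value of n | 1 |
| Step 3: if (n <= 1) then go to step 13 | 1 |
| Step 4: x = 0 | 1 |
| Step 5: y = 1 | 1 |
| Step 6: for i = 2 to n-1 repeat steps 7 to 10 | 1 |
| Step 7: f = y + x | n-2 |
| Step 8: x = y | n-2 |
| Step 9: y = f | n-2 |
| Step 10: i = i + 1 | n-2 |
| Step 11: write (f) | 1 |
| Total number of instructions executed | 6 + 4(n-2) = 4n-2 |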
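The 4n-2 count can be checked by instrumenting the tabulated steps. The following Python sketch is only an illustration (the slides do not prescribe a language); the function name and the initial value of f are assumptions added so that the code runs when the loop body is never entered.

```python
def fib_with_step_count(n):
    """Run the tabulated steps; return (value written, statements executed)."""
    steps = 2                # Step 2 (input n) and Step 3 (test n <= 1)
    if n <= 1:
        return n, steps + 1  # write (n): three statements in total
    x, y = 0, 1
    steps += 2               # Steps 4 and 5
    steps += 1               # Step 6: the loop statement, counted once as in the table
    f = y                    # assumption: gives f a value when n = 2 (zero iterations)
    for _ in range(2, n):    # i = 2 .. n-1, that is, n-2 iterations
        f = y + x
        x = y
        y = f
        steps += 4           # Steps 7 to 10, including i = i + 1
    steps += 1               # Step 11: write (f)
    return f, steps


for n in (2, 5, 10):
    value, steps = fib_with_step_count(n)
    print(n, value, steps, 4 * n - 2)
```

For every n greater than 1 the counted statements equal 4n-2, and for n less than or equal to 1 the count stays at 3.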
Just a minute
What do you mean by the time complexity of an algorithm?
Answer:
The time taken to compute the steps involved in an algorithm
is called the time complexity of that algorithm.
Identifying Space Complexity
Space complexity is the amount of memory that a program
requires to accomplish a task.
Space complexity is a criterion to measure the efficiency of
a program.
For example, consider the following expression:
Return [a + b + b * c + (a + b - c) / (a + b) + 4.0]
If variables b and c each occupy 2 bits, the storage requirement
for the preceding instruction is constant.
If the values of a, b, and c are taken from user input, the
storage space required will vary.
Identifying Space Complexity (Contd.)
Refer to the following algorithm:
Step 1: start
Step 2: function Add (k, m)
Step 3: l = 0
Step 4: for j = 1 to m repeat step 5
Step 5: l = l + k[j]
Step 6: return the value of l
Step 7: stop
For the preceding algorithm:
The space taken by the k array is m units.
The other variables, m, j, and l, take only one unit of space
each.
As a result, you can obtain the space complexity of this
algorithm by adding the space utilized by all the variables,
which gives m + 3 units.
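As a rough illustration, the Add algorithm and its space count can be written in Python. The helper space_units is hypothetical and simply encodes the tally described above: m units for the array k plus one unit each for m, j, and l.

```python
def add(k, m):
    """Sum the first m elements of k, mirroring the Add (k, m) steps."""
    l = 0
    for j in range(m):       # 0-based here; the slide counts j from 1 to m
        l = l + k[j]
    return l


def space_units(m):
    """Storage units used: m for the array k, plus one each for m, j, and l."""
    return m + 3


print(add([4, 5, 6], 3))     # sum of the three elements
print(space_units(3))        # 3 array units + 3 scalar units
```

The space requirement grows linearly with m because only the array depends on the input size.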
Examining Programming Constructs
An application developed in any programming language,
such as C, C++, C#, or Java, is based on an algorithm.
Each algorithm consists of several programming constructs,
such as loops, decisions, and functions.
The performance of an application is related to the
programming constructs used in the application.
To achieve the desired level of optimization, it is important to
examine the loops, branching statements, and function calls
used in the program.
Examining Loops
Loops can be optimized by:
Removing unwanted parts of loops
Combining loops
Using unrolling
Reducing work inside loops
Using sentinel values
Looking at the order of loops
Looking at operators
Removing unwanted parts of loops: In this technique, you first
identify the decision-making steps present inside a loop.
After identifying these steps, you decide whether the loop
affects the steps in any way.
If the loop does not affect the steps, move the unaffected
steps out of the loop.
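A minimal Python sketch of moving a loop-independent decision out of a loop follows. The function names and the use_net and discount parameters are hypothetical; the point is only that the test on use_net gives the same answer on every iteration, so it can be made once.

```python
def total_decision_inside(values, use_net, discount):
    total = 0
    for v in values:
        if use_net:              # the loop never changes use_net
            total += v - discount
        else:
            total += v
    return total


def total_decision_outside(values, use_net, discount):
    total = 0
    if use_net:                  # decision made once, before looping
        for v in values:
            total += v - discount
    else:
        for v in values:
            total += v
    return total


data = [10, 20, 30]
print(total_decision_inside(data, True, 1), total_decision_outside(data, True, 1))
```

Both versions return the same totals; the second evaluates the condition once instead of once per element, at the cost of duplicating the loop.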
Combining loops: When multiple loops work on the same
variables, you may combine the loops.
This helps reduce computation time because the total number
of instructions executed decreases.
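The following Python sketch shows one way two loops over the same data might be combined. The function names are illustrative, and any speedup depends on the language and the data size.

```python
def stats_separate(values):
    total = 0
    for v in values:             # first pass over the data
        total += v
    squares = 0
    for v in values:             # second pass over the same data
        squares += v * v
    return total, squares


def stats_combined(values):
    total = 0
    squares = 0
    for v in values:             # one pass does both jobs
        total += v
        squares += v * v
    return total, squares


print(stats_separate([1, 2, 3]), stats_combined([1, 2, 3]))
```

The combined version pays the loop-control overhead once per element rather than twice.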
Using unrolling: Unrolling refers to the process of breaking
compact loops into simpler statements.
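As a sketch of what unrolling looks like, the loop below is unrolled by a factor of four, with a short cleanup loop for leftover elements. The factor of four and the function names are arbitrary choices for illustration; in compiled languages the compiler often performs this transformation itself.

```python
def sum_rolled(values):
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    total = 0
    n = len(values)
    limit = n - n % 4            # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:             # four elements per loop-control test
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
    while i < n:                 # cleanup for the remaining 0 to 3 elements
        total += values[i]
        i += 1
    return total


print(sum_rolled(list(range(10))), sum_unrolled_by_4(list(range(10))))
```

Unrolling trades code size for fewer loop-control operations, so its benefit should be measured rather than assumed.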
Reducing work inside loops: The values of some expressions,
variables, or constants do not change inside the loop, so
evaluating them on every iteration unnecessarily consumes
processor time.
Therefore, a better approach is to bring them outside the
loop.
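A short Python sketch of bringing an unchanging expression outside a loop is shown below. The scaling example and its names are assumptions chosen for illustration.

```python
import math


def scale_inside(values, angle):
    out = []
    for v in values:
        # math.cos(angle) * 2.0 is recomputed on every iteration
        out.append(v * (math.cos(angle) * 2.0))
    return out


def scale_hoisted(values, angle):
    factor = math.cos(angle) * 2.0   # computed once, before the loop
    out = []
    for v in values:
        out.append(v * factor)
    return out


print(scale_inside([1.0, 2.0], 0.5) == scale_hoisted([1.0, 2.0], 0.5))
```

Both functions produce identical results; the second calls math.cos once regardless of how many values are processed.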
Using sentinel values: A sentinel value is a value that is
placed at the end of a search range.
A sentinel value spares you the additional task of checking
for the end of the input search string on every iteration.
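The idea can be sketched in Python as a linear search. In the plain version each iteration tests both the bounds and the match; in the sentinel version the target is appended to the end of the list, so the loop needs only the match test. The function names are illustrative.

```python
def find_plain(items, target):
    i = 0
    while i < len(items):        # bounds test on every iteration
        if items[i] == target:   # match test on every iteration
            return i
        i += 1
    return -1


def find_with_sentinel(items, target):
    n = len(items)
    items.append(target)         # sentinel: guarantees the loop terminates
    i = 0
    while items[i] != target:    # only one test per iteration
        i += 1
    items.pop()                  # restore the caller's list
    return i if i < n else -1    # stopping at the sentinel means "not found"


data = [3, 1, 4, 1, 5]
print(find_plain(data, 4), find_with_sentinel(data, 4))
print(find_plain(data, 9), find_with_sentinel(data, 9))
```

The sentinel version temporarily modifies the list, which is acceptable only when the caller's data may be changed during the search.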
Looking at the order of loops: You can improve the
efficiency of a program by changing the order of nested loops.
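The slide does not say which reordering it has in mind. One common reading, assumed here, is to place the loop with more iterations innermost so that fewer loop-control steps are executed overall. The Python sketch below only counts those steps; the function name is hypothetical.

```python
def loop_control_steps(outer, inner):
    """Count how many times the two loop headers advance in a nested loop."""
    steps = 0
    for _ in range(outer):
        steps += 1               # one advance of the outer loop
        for _ in range(inner):
            steps += 1           # one advance of the inner loop
    return steps


print(loop_control_steps(100, 5))   # busier loop outside: 100 + 500 = 600
print(loop_control_steps(5, 100))   # busier loop inside:    5 + 500 = 505
```

The loop body runs 500 times in both cases, but the second ordering advances the loop headers fewer times. In languages with arrays laid out row by row, the order that matches the memory layout may matter more, so measure both.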
Looking at operators: In terms of efficiency, operations such
as multiplication and division are more expensive than
operations such as addition.
You should try to replace expensive operations with
cheaper ones.
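As an illustration, the multiplication inside the loop below can be replaced by a running addition. The offsets example and its names are assumptions; in Python the difference is small, and the sketch is meant to show the shape of the transformation rather than a guaranteed speedup.

```python
def offsets_with_multiply(count, stride):
    out = []
    for i in range(count):
        out.append(i * stride)   # one multiplication per iteration
    return out


def offsets_with_add(count, stride):
    out = []
    offset = 0
    for _ in range(count):
        out.append(offset)
        offset += stride         # addition replaces the multiplication
    return out


print(offsets_with_multiply(5, 8))
print(offsets_with_add(5, 8))
```

Both functions return the same list of offsets for integer strides.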
Examining Functions
As the frequency of calling a function increases, program
execution time increases.
Examining functions is important for performance
optimization.
Functions can be optimized by:
Using faster functions
Identifying mathematical functions
Identifying standard functions
Declaring local functions as static
Using faster functions: Try to use only fast functions.
You can identify fast functions by knowing the time
complexity associated with each function.
Identifying mathematical functions: You need to be selective
about the mathematical functions, such as square root, used in
a program.
You can rank mathematical functions by execution time.
Identifying standard functions: Use standard mathematical
approaches to compute a result that involves complex
calculations.
This enables you to solve a problem more efficiently.
Declaring local functions as static: With static functions,
evaluation is faster and efficiency is improved.
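One way to be selective about square root, sketched below in Python, is to compare squared distances when only the ordering matters. The nearest-point example and its function names are assumptions, not part of the slides.

```python
import math


def nearest_with_sqrt(points, origin):
    ox, oy = origin
    # calls math.sqrt once per point
    return min(points, key=lambda p: math.sqrt((p[0] - ox) ** 2 + (p[1] - oy) ** 2))


def nearest_with_squares(points, origin):
    ox, oy = origin
    # the square root is increasing, so comparing squared distances
    # picks the same nearest point without calling sqrt
    return min(points, key=lambda p: (p[0] - ox) ** 2 + (p[1] - oy) ** 2)


pts = [(5, 5), (1, 2), (4, 0)]
print(nearest_with_sqrt(pts, (0, 0)), nearest_with_squares(pts, (0, 0)))
```

The square root is still needed when the actual distance is reported, so this substitution applies only to comparisons.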
Examining Branching
Transferring control from one part of the code to another is
achieved through branching.
Various techniques can be adopted to make the process of
branching effective and improve the efficiency of code.
Branches can be optimized by:
Removing the else clause
Using effective case statements
Replacing conditional computations
Removing the else clause: Using an else clause with
every if statement leads to ineffective branching.
Therefore, you should try to remove the else clause where
possible.
Using effective case statements: Order the options in a case
statement by how frequently each option is used, so that the
most frequent options are tested first.
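The effect of ordering can be sketched in Python by modeling a case statement as an ordered list of tests and counting how many tests run. The classify and total_checks helpers and the sample text are hypothetical.

```python
TESTS = {"letter": str.isalpha, "digit": str.isdigit, "space": str.isspace}


def classify(ch, order):
    """Try the tests in the given order; return (category, tests performed)."""
    for checks, name in enumerate(order, start=1):
        if TESTS[name](ch):
            return name, checks
    return "other", len(order)


def total_checks(text, order):
    return sum(classify(ch, order)[1] for ch in text)


text = "optimize 24 loops"       # 13 letters, 2 digits, 2 spaces
print(total_checks(text, ["digit", "space", "letter"]))   # rare case first: 45
print(total_checks(text, ["letter", "space", "digit"]))   # common case first: 23
```

Both orderings classify every character identically; only the number of tests performed changes.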
Replacing conditional computations: You can save computation
time by replacing conditional computations with equivalent
arithmetic expressions.
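A minimal Python sketch of this replacement follows. It relies on the fact that Python treats True and False as 1 and 0 in arithmetic; the function names are illustrative, and whether the arithmetic form is actually faster depends on the language and processor.

```python
def count_positive_with_branch(values):
    count = 0
    for v in values:
        if v > 0:                # conditional computation
            count += 1
    return count


def count_positive_with_arithmetic(values):
    count = 0
    for v in values:
        count += (v > 0)         # the comparison result is added as 1 or 0
    return count


data = [3, -1, 0, 7, -5]
print(count_positive_with_branch(data), count_positive_with_arithmetic(data))
```

Both functions return the same count for any list of numbers.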
General Guidelines for Optimization
Some guidelines to follow when writing code are:
Identify optimization areas
Identify the depth of optimization
Identify correct alternatives
Identify what is being asked
Common Misconceptions About Optimization
Some of the common misconceptions about code
optimization are:
To think a program does not require optimization because it
appears to be quite fast.
To think that the optimization performed by the compiler alone
is enough.
To believe that short code is efficient.
To think a specific solution will be effective without verifying
performance results.
To think optimizing while programming is good practice.
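Verifying performance results means measuring rather than guessing. The sketch below uses Python's standard timeit module to time two ways of building a string; the two functions are arbitrary examples, and the timings will differ from machine to machine. A profiler such as VTune serves the same purpose for whole applications.

```python
import timeit


def build_with_plus(n):
    s = ""
    for i in range(n):
        s += str(i)              # repeated concatenation
    return s


def build_with_join(n):
    return "".join(str(i) for i in range(n))


# Time 200 calls of each candidate; compare the numbers before choosing one.
t_plus = timeit.timeit(lambda: build_with_plus(1000), number=200)
t_join = timeit.timeit(lambda: build_with_join(1000), number=200)
print(f"plus: {t_plus:.4f} s  join: {t_join:.4f} s")
```

Checking that both candidates return the same result before comparing their timings keeps the measurement meaningful.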
Just a minute
What is unrolling?
What is a sentinel value?
Answer:
Breaking compact loops into simpler statements is called
unrolling.
A sentinel value is a value that is placed at the end of a search
range. This assures the termination of the search.
Using Performance Libraries For Applications
Each software developer has an individualistic style of
writing code. It is possible that the written code will not be
efficient.
Sections of software, such as the menu bar, remain
common in most software.
Writing these sections of code repeatedly makes the
process time-consuming and error-prone.
In such situations, you might prefer to use existing code,
which is called a performance library.
This code is already checked for errors. In addition,
it attains a high level of optimization over a period of time.
Using Performance Libraries For Applications (Contd.)
The advantages of using performance libraries are the
following:
Enable programmers to develop code in less time
Offer error-free code
Enable the best use of resources
Help in performance improvement
Add to the stability of the software application
Identifying the Types of Performance Libraries
Based on the tasks they perform, performance libraries can be
categorized as:
Engineering and science libraries
Math libraries
Graphic libraries
Audio/Video libraries
Image processing libraries
Other libraries
Engineering and science libraries: These libraries are mainly
used in scientific and engineering applications.
They may include functionalities for searching,
sorting, and evaluating various expressions.
Math libraries: These libraries deal with the evaluation of
complex mathematical functions, such as vector and matrix
calculations.
Graphic libraries: These libraries help draw graphics, pie
charts, graphs, and bar diagrams correctly and accurately.
Audio/Video libraries: These libraries help optimize
various functions related to audio-visual data.
Image processing libraries: These libraries help process
images faster.
Other libraries: These libraries may include functionalities
for performing various tasks, such as speech recognition,
signal processing, and cryptography.
Just a minute
Which libraries may include functionalities for searching,
sorting, and evaluating various expressions?
Answer:
Engineering and science libraries
Summary
In this session, you learned that:
Time complexity is the time taken by the steps of an algorithm
to execute.
Space complexity is the measurement of the space utilized by
the components of an algorithm.
Space complexity can be measured by identifying the data
structures used in an algorithm and adding up the space they
occupy.
Loops are generally the most time-consuming constructs of a
program.
Some of the techniques to optimize loops are:
Removing unwanted parts of loops
Combining loops
Using unrolling
Reducing work inside loops
Summary (Contd.)
Using sentinel values
Looking at the order of loops
Looking at operators
Some of the techniques to optimize functions are:
Using faster functions
Identifying mathematical functions
Identifying standard functions
Declaring local functions as static
Transferring control from one part of a code to another is
achieved through branching.
Summary (Contd.)
Some of the techniques to make the process of branching
effective are:
Removing the else clause
Using effective case statements
Replacing conditional computations
You should keep the following optimization guidelines in mind
when writing code:
Identify optimization areas
Identify the depth of optimization
Identify correct alternatives
Identify what is being asked
Summary (Contd.)
Some of the common misconceptions about optimization are:
A fast program does not require optimization
Optimization performed by the compiler is enough
Short code is efficient
Optimizing while programming is good practice
Performance libraries can improve the performance of an
application to a large extent.
A variety of performance libraries are available for different
purposes, such as mathematical, graphical, and task-based
functions.