Intel VTune Documentation - DotNetSpider

18 Nov 2013




By: Vikram Singh Saini

ISAS

“Intel VTune is software from Intel Corporation. It is being used by development companies to analyze the performance of their software and applications. Along with that, it also provides advice on boosting the performance of apps on a specific OS.”





Acknowledgment

I am grateful to my parents, relatives and friends, who helped me with the documentation of Intel VTune ISAS.

I am also grateful to our faculty, Mr. Ram Sharma (GL, NIIT ABC), for his keen guidance and support during the preparation of the document in your hands. I also want to thank God, who gives the knowledge and ability to write all of this clearly and present it to you, the readers.

During the documentation I faced a lot of trouble, such as Windows crashes, damage to computer hardware, and hindrances in the working of the Intel VTune software. I would also like to mention our friend, Sandeep Chaudhary, who provided his PC for presenting this application.

And at last I would like to say a special thanks to NIIT, which gave us a wonderful chance to present this document to all of you readers.

Thanks to everybody who directly or indirectly helped in presenting this file.

- Vikram Singh Saini







Contents at a Glance

1. Introduction

2. Sampling
   2.1. Introduction
   2.2. Sampling mechanism
   2.3. Differences between TBS & EBS
   2.4. What happens during Sampling
   2.5. Features of Sampling
   2.6. Sampling Over Time

3. Call Graph
   3.1 Introduction
   3.2 Features of Call Graph

4. Counter Monitor
   4.1 Introduction
   4.2 Features of Counter Monitor
   4.3 Working of Counter Monitor

5. Tuning Assistant
   5.1 Tuning Assistant
   5.2 Tuning Assistant Concepts
   5.3 Features of Tuning Assistant
   5.4 Understanding Tuning Methodology
   5.5 Strategies for Improving Performance
   5.6 Types of Advice
   5.7 Information that Tuning Assistant provides

6. References














Introduction

The VTune analyzer provides an integrated performance analysis and tuning environment that helps you analyze your code's performance on systems with IA-32, Intel(R) 64, and IA-64 architecture.

The VTune analyzer can plug into the Microsoft Visual Studio and Eclipse integrated development environments.

You can work with the VTune analyzer through either the graphical interface or the command line interface. All commands to create and run Activities must be preceded by vtl at the command line.

LINUX SUPPORT

The VTune(TM) Performance Analyzer can analyze the performance of your Linux* application. The VTune analyzer is installed on a controlling system and controls the run of your Linux application on a Remote Agent system, collecting data on the application remotely.

JAVA SUPPORT

When the VTune(TM) Performance Analyzer analyzes the performance of your Java* application or applet (.class), the Virtual Machine (VM) and Just-in-Time Compiler (JIT) are enhanced to provide the VTune analyzer with specific information required to analyze the performance of a Java application.

During sampling, the VM and JIT provide the VTune analyzer with information about JIT-compiled Java methods being loaded into memory, such as their memory addresses, sizes, and symbol information.

.NET SUPPORT

The VTune(TM) Performance Analyzer enables you to profile .NET* and ASP.NET web services running on your machine.

The VTune analyzer will set the necessary environment variables and restart the web service before collecting sampling or call graph data. The environment variables will be deleted and the service restarted on completing data collection.

Use the sampling configuration wizard and call graph configuration wizard for profiling ASP.NET/.NET web services.




FEATURES OF INTEL VTUNE PERFORMANCE ANALYZER

1. CALL GRAPH - Provides a graphical view of the application and helps you identify critical functions and timing details in the application.

2. SAMPLING - Measures the actual performance of an application over a period of time (time-based sampling) and for various processor events (event-based sampling).

3. COUNTER MONITOR - Provides system-level performance data, such as resource consumption, during the execution of an application.

4. TUNING ASSISTANT - Provides tuning advice from an analysis of the performance data. The tuning advice helps you improve the performance of an application.

5. HOTSPOTS VIEW - Helps identify the area of code that takes the maximum CPU time.

MINIMUM REQUIREMENTS OF SOFTWARE

1. HARDWARE REQUIREMENTS

Processors Supported:

Servers:

- Quad-Core Intel(R) Xeon(R) Processor 5300 Series
- Dual-Core Intel(R) Xeon(R) Processor 5100 Series
- Dual-Core Intel(R) Xeon(R) Processor 5000 Sequence
- Dual-Core Intel(R) Xeon(R) Processor 7100 Series
- Dual-Core Intel(R) Xeon(R) Processor 7000 Sequence
- Dual-Core Intel(R) Xeon Processor LV
- Intel(R) Xeon(R) processor MP
- Intel(R) Xeon(R) processor
- Dual-Core Intel(R) Itanium(R) 2 processor 9000 sequence
- Low Voltage Intel(R) Itanium(R) 2 Processor
- Intel(R) Itanium(R) 2 processor

Desktop:

- Intel(R) Core(TM)2 Quad processor
- Intel(R) Core(TM)2 Extreme processor
- Intel(R) Core(TM)2 Duo processor
- Intel(R) Core(TM) Duo processor
- Intel(R) Core(TM) Solo processor
- Intel(R) Pentium(R) D processor 900 sequence
- Intel(R) Pentium(R) D processor
- Intel(R) Pentium(R) 4 processor Extreme Edition
- Intel(R) Pentium(R) processor Extreme Edition
- Intel(R) Pentium(R) 4 processor

Mobile:

- Mobile Intel(R) Pentium(R) 4 Processor - M
- Intel(R) Pentium(R) M processor
- Intel(R) Celeron(R) M processor
- Intel(R) Celeron(R) D processor
- Intel(R) Celeron(R) processor
- Mobile Intel(R) Celeron processor

2. SOFTWARE REQUIREMENTS

32-bit operating systems supporting IA-32 processors:

- Microsoft* Windows XP Professional Service Pack 2
- Microsoft* Windows Server 2003 Enterprise Edition Service Pack 1
- Microsoft* Windows Server 2003 R2 Enterprise Edition
- Microsoft* Windows Vista*
- Microsoft* Windows Server 2008 RC0 (build 6001)

64-bit operating systems supporting Intel(R) processors with Intel(R) EM64T:

- Microsoft* Windows XP Professional x64 Edition
- Microsoft* Windows Server 2003 Enterprise x64 Edition
- Microsoft* Windows Server 2003 R2 Enterprise x64 Edition
- Microsoft* Windows Vista*
- Microsoft* Windows Server 2008 RC0 (build 6001)

64-bit operating systems supporting Intel(R) Itanium(R) architecture processors:

- Microsoft* Windows Server 2003 Enterprise Edition Service Pack 1
- Microsoft* Windows Server 2008 RC0 (build 6001)

3. SYSTEM MEMORY REQUIREMENTS

At least 128 Megabytes of RAM

4. DISK SPACE REQUIREMENTS

- At least 105 Megabytes of available space on a local drive
- 20 Megabytes of disk space is required for system files on the drive containing the system directory (for example, C:\)

The additional hard disk space is needed for updating and installing the DLLs and OCXs that the VTune analyzer requires to be in the system directory.









Sampling

INTRODUCTION

Sampling is the process of collecting a set of data for analysis and representing the analyzed data in a statistical format. Use the collected data to identify the critical processes, threads, modules, functions, and lines of code running on your system.

During sampling, the VTune(TM) Performance Analyzer monitors all the software executing on your system, including the operating system, JIT-compiled Java* applications, .NET* applications, and device drivers.


Sampling does not modify binary files or executables in order to monitor the performance of an application. The VTune analyzer analyzes the collected samples and helps you to identify:

1. Hotspots - A hotspot is a section of code within a module that took a long time to execute. The processor spends a large amount of time executing that section, which generates a lot of samples for that module.

2. Bottlenecks - A bottleneck is an area in the code that slows down the execution of the application. Bottlenecks appear as hotspots in the Hotspot view.

Removing bottlenecks and hotspots optimizes the application.

TWO TYPES OF SAMPLING MECHANISM TO COLLECT SAMPLING DATA

1. TIME-BASED SAMPLING (TBS): The VTune(TM) analyzer uses the operating system timer to interrupt and collect samples of all active instruction addresses at a regular time interval (1 ms by default). The collected samples provide the performance data of all the processes running on the system. Processes that took the longest time to execute have the highest number of samples.

2. EVENT-BASED SAMPLING (EBS): Use EBS to identify system-wide software performance problems caused by processor events, such as cache misses and mispredicted branches. From the EBS data, you can determine which process, thread, module, function, and source line in a program generated the most processor events, and whether any of those events impacted the performance of the program.

The VTune analyzer provides predefined event ratios recommended for use by performance analysts at Intel.
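
The idea behind time-based sampling, that code which runs longer collects more samples, can be illustrated with a small simulation. This is only a sketch and not VTune code: the function name time_based_samples, the timeline format and the durations are all assumptions made up for the illustration.

```python
from collections import Counter


def time_based_samples(timeline, interval_ms=1):
    """Count timer-tick samples per function.

    timeline: list of (function_name, duration_ms) in execution order.
    This format is hypothetical and exists only for this illustration.
    """
    counts = Counter()
    if not timeline:
        return counts
    total_ms = sum(duration for _, duration in timeline)
    segments = iter(timeline)
    name, duration = next(segments)
    segment_end = duration
    for tick in range(0, total_ms, interval_ms):
        # Advance to the segment that is executing when the timer fires.
        while tick >= segment_end:
            name, duration = next(segments)
            segment_end += duration
        # One timer interrupt yields one sample for the running function.
        counts[name] += 1
    return counts


# Made-up workload: 'compute' runs the longest, so it gets the most samples.
workload = [("parse", 5), ("compute", 40), ("write", 15)]
hotspot, _ = time_based_samples(workload).most_common(1)[0]
```

With the default 1 ms interval the sample counts are proportional to the time spent in each function, which is why the function with the most samples is reported as the hotspot.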




FIGURE 1: Event based sampling

DIFFERENCES BETWEEN TBS AND EBS

EBS - Data is collected using Clocktick events. However, when the processor executes HLT instructions, Clocktick events stop occurring. As a result, no samples are collected while the processor is in the halt state, and the VTune analyzer reports fewer samples than you expected.

TBS - Data is collected using the OS timer, which is not affected by HLT instructions, so samples continue to be collected accurately. TBS can therefore potentially give more accurate data.
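
The effect of the halt state on the two mechanisms can be shown with a toy model. The sketch below is an assumption-laden simplification, not how VTune counts samples: it pretends that one Clocktick-based sample corresponds to one interval of unhalted time, and the function name count_samples and the state labels are invented for the example.

```python
def count_samples(timeline, interval_ms=1):
    """Return (tbs_samples, ebs_samples) for a toy processor timeline.

    timeline: list of (state, duration_ms), where state is 'running'
    or 'halted'. Both labels are hypothetical.
    """
    wall_ms = sum(duration for _, duration in timeline)
    unhalted_ms = sum(d for state, d in timeline if state == "running")
    # The OS timer keeps firing while the processor is halted.
    tbs = wall_ms // interval_ms
    # Clockticks only accumulate while the processor is not halted.
    ebs = unhalted_ms // interval_ms
    return tbs, ebs


# Made-up timeline: 30 ms of work followed by 70 ms in the halt state.
tbs, ebs = count_samples([("running", 30), ("halted", 70)])
```

In this model the Clocktick-based count covers only the 30 ms of unhalted time, while the timer-based count covers the full 100 ms.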

WHAT HAPPENS DURING SAMPLING



When you run an Activity configured with the sampling collector, the VTune analyzer does the following:

- Waits for the delay sampling time (if specified) to elapse and then starts collecting samples.
- Interrupts the processor at the specified sampling interval and collects samples of instruction addresses. For every interrupt, the VTune analyzer collects one sample.
- Stores the execution context of the software currently executing on the system.
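
The three steps above can be mimicked at the Python level. The following is a minimal sketch and only an analogy: it samples one Python thread from a helper thread rather than interrupting the processor, it relies on the CPython-specific sys._current_frames(), and the name run_with_sampling and its parameters are assumptions.

```python
import sys
import threading
import time
from collections import Counter


def run_with_sampling(target, delay_s=0.0, interval_s=0.001):
    """Run target() on the calling thread while a helper thread samples it."""
    target_id = threading.get_ident()
    samples = Counter()
    done = threading.Event()

    def sampler():
        # Step 1: wait for the delay sampling time, if any.
        if done.wait(delay_s):
            return
        # Step 2: wake up once per interval; each wake-up yields one sample.
        while not done.wait(interval_s):
            frame = sys._current_frames().get(target_id)
            if frame is not None:
                # Step 3: store the execution context (function name, line).
                samples[(frame.f_code.co_name, frame.f_lineno)] += 1

    helper = threading.Thread(target=sampler, daemon=True)
    helper.start()
    try:
        target()
    finally:
        done.set()
        helper.join()
    return samples


def busy(duration_s=0.3):
    """Hypothetical workload that spins for roughly duration_s seconds."""
    end = time.perf_counter() + duration_s
    spins = 0
    while time.perf_counter() < end:
        spins += 1
    return spins
```

Calling run_with_sampling(busy, delay_s=0.05, interval_s=0.005) returns a counter keyed by (function name, line number); the lines inside busy should dominate it.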

FEATURES OF SAMPLING

The following are the main features of the sampling collector and views:

1. Collection

- Multiple event sampling. Perform event-based sampling with multiple events in one run. Depending on the type of processor you are using, the VTune analyzer can monitor and collect samples on two or more events in one run.
- Remote sampling. Collect sampling data for an application running on a remote system. Your remote system can be a machine running any operating system supported by the VTune analyzer.
- Collect sampling data for applications running on systems enabled with Hyper-Threading Technology.

2. Views

The following sampling views help you analyze the data:

- Thread view. View the threads running within a process and select one or more threads to drill down to specific hotspots.
- Summary view. Opens by default for clocktick events.
- Process view. Displays a system-wide view of all the processes running on your system when sampling data was collected.
- Module view. Displays all the modules within selected threads.
- Hotspot view. Displays function names associated with selected modules. Group hotspots by function, relative virtual address (RVA), source file, or class.

3. Accessories

The following panels and toolbar options are available from the sampling view:

- Sampling toolbar. A sampling toolbar is available at the top of each sampling view. This toolbar includes buttons labeled Process, Thread, Module, Hotspot, and Source. Select items within a view and click one of the buttons to drill down.
- Tabbed windows. When you open a specific sampling view, a tab is created at the bottom of the window labeled with the name of the view, for example, Process, Thread, Module or Hotspot. If you open several views, a tab for each open view is created at the bottom of the window. You can use the tabs to quickly move from one view to another.
- Microsoft Excel. Display your sampling data in a Microsoft Excel 2000 spreadsheet. You can customize the appearance of the spreadsheet report as needed.
- Selection Summary panel. View/hide a panel displaying the events configured in an Activity and the number of samples collected per event for the items you select in a view.
- Legend. Display a detailed legend for all sampling views. Each Activity result, event, and event ratio is color-coded. The legend explains what each color represents.
- Event summary panel. Display the total number of events collected for items you select in a view.
- Multi-processor. Display the workload as distributed across multiple processors.

SAMPLING OVER TIME

1. The Over Time view displays the samples collected for a single event.

2. It enables you to identify which threads are running serially and which in parallel at any point in time.

3. The Sampling Over Time view can be invoked from the Thread, Process and Module views.

4. The Sampling Over Time view consists of two panels. The left panel displays the names of the selected items and the right panel displays the samples collected over time. The right panel is divided into squares, each square representing a unit of time in seconds.

5. The color of the squares indicates the number of samples collected for that unit of time. A red square indicates a large number of samples, and a green square indicates a small number of samples.

FIGURE 2: Sampling Over Time
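
The layout of the right panel can be approximated in a few lines. This is a hypothetical sketch: the name over_time_rows, the sample format and the threshold that separates "red" from "green" squares are assumptions, since the text above does not say where VTune draws that line.

```python
from collections import defaultdict


def over_time_rows(samples, duration_s, busy_threshold=3):
    """Bucket samples into one-second squares, one text row per thread.

    samples: iterable of (thread_name, timestamp_s) pairs (made-up format).
    'R' stands for a red square (many samples), 'g' for a green square
    (few samples) and '.' for a second with no samples at all.
    """
    grid = defaultdict(lambda: [0] * duration_s)
    for thread, timestamp in samples:
        second = int(timestamp)
        if 0 <= second < duration_s:
            grid[thread][second] += 1
    rows = {}
    for thread, counts in grid.items():
        rows[thread] = "".join(
            "." if n == 0 else ("R" if n >= busy_threshold else "g")
            for n in counts
        )
    return rows


# Made-up samples for two threads over three seconds.
demo = over_time_rows(
    [("worker", 0.1), ("worker", 0.4), ("worker", 0.9),
     ("worker", 2.5), ("ui", 1.2)],
    duration_s=3,
)
```

Reading the rows side by side shows which threads were active in the same second and which ran one after the other.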



The Over Time view can be used to gather the following information:

- Context switching: You can determine whether there is excessive context switching.
- Processor utilization: Lets you see whether the processor is idle. If the system process receives samples, there is scope for improving processor utilization at that time.
- Temporal location of hotspots: You can see the specific periods of time when a large number of events occur.
- Thread interaction: You can view the pattern of thread behavior and thread interaction.
- Footprint of each thread: You can view the footprint of each thread on Hyper-Threading Technology enabled processors.























Call Graphs

INTRODUCTION

The call graph collector of the VTune(TM) Performance Analyzer collects information about the program flow of an application, that is, the number of calls each function makes to other functions and the amount of time each function spends executing its own code and/or calling other functions.

A function can be a:

1. CALLER - A parent function that calls the current function.

2. CALLEE - A child function that is called by the current function.

In many cases, the caller may call the callee from several places (sites), so call graph also provides call information per site.

FEATURES OF CALL GRAPH

The following are the main features of the call graph collector and views:

1. Collection

- Manual launching mode. Manually launch your application from the desktop and select required modules of interest to analyze.
- DLL-Level Data Collection. Configure the call graph collector to instrument and analyze first-level DLLs even when the application itself cannot be instrumented.
- Instrumentation filtering. Select exactly which functions to instrument, improving the speed of the instrumented application by using improved filtering capabilities.
- Multi-thread, multi-process. Collect data for more than one process with fully automated threading and fiber support.
- COM Tracing. Profile COM interface methods using the call graph collector.

2. Views

After you collect call graph data using the VTune analyzer, you can view the call graph profiling information at the following levels:

- GRAPH: Provides a visual, graphical presentation of the application execution. It displays the selected function(s), the function's parents (callers), its child functions (callees), and timing information. Each node (box) in the graph represents a function. Each edge (line with an arrow) connecting two nodes represents the call from the parent to the child function. For every function you can traverse caller and callee functions.



The call graph view uses the following conventions:

- Nodes connected by thick red edges designate functions on the critical path from the root (thread).
- The thicker the edge, the greater the Edge time.

Uses of this view:

- estimate the performance of your application
- find potential performance bottlenecks
- traverse the critical path, which is a path with the maximum Edge time.

FIGURE 3: Graph view of Call Graph
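
The caller/callee relationships and the critical path can be sketched with a small data structure. Everything here is illustrative: the function names and edge times are made up, the graph is assumed to be acyclic (no recursion), and "maximum Edge time" is interpreted as the largest sum of edge times along a path from the root, which may differ from the exact rule VTune applies.

```python
from functools import lru_cache

# (caller, callee) -> Edge time in microseconds; all values are invented.
EDGES = {
    ("main", "load"): 30,
    ("main", "solve"): 120,
    ("load", "parse"): 25,
    ("solve", "factor"): 90,
    ("solve", "report"): 10,
}


def callees(function):
    """Child functions called by `function`."""
    return sorted(child for parent, child in EDGES if parent == function)


def callers(function):
    """Parent functions that call `function`."""
    return sorted(parent for parent, child in EDGES if child == function)


def critical_path(root):
    """Return (total_edge_time, path) for the heaviest path from root."""

    @lru_cache(maxsize=None)
    def best(function):
        options = [
            (EDGES[(function, child)] + best(child)[0],
             (function,) + best(child)[1])
            for child in callees(function)
        ]
        # A leaf function contributes no further edge time.
        return max(options, default=(0, (function,)))

    return best(root)


total_time, path = critical_path("main")
```

Here the heaviest path runs through solve and factor, which is the chain a thick red edge would highlight in the graph view.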



- CALL LIST: Provides full information on the selected (focus) function, its callers (parents) and callees (children) in table format.

The focus function is the function currently being viewed. The view shows the threads and classes associated with it.

The caller function is the function that calls the focus function. Alongside it are columns for contribution, Edge time, thread, class, etc.

The callee function is the function that is called by the focus function. Its columns are almost the same as those of the caller function.





FIGURE 4: Call List view of Call Graph



- FUNCTION SUMMARY: Provides full information on all the application functions in table format. The rows in the function summary display functions with different background colors according to their hierarchical position. The default view shows the first four types of data as follows:

FIGURE 5: Function Summary view of Call Graph





3. Accessories

The following options are available from the call graph view:

- Filtering options. Gain different perspectives on your data using the wide range of filtering options available.
- Function detail. Conveniently view detailed function information using tooltips and the status bar.
- Unified Java support. View Java function calls and Win32 function calls in the same call graph results.
- Timing options. View enriched timing information with an expanded collection of wait times for functions and calls. Traverse Self Wait time, Total Wait time, Edge time, Edge Wait time, and Max path from node to root and from node to bottom.
- Node state indicators. Adjust the color palette for any graph elements and control node length settings to support long function names. Node state indicators highlight three different types of node status, facilitating orientation within the graph view.
- Command access. Control a wide range of options in the function summary view via the function summary pop-up menu. The toolbar contains enhanced features and provides quick and easy access to the most commonly used commands.
- Multiple undo/redo. Make changes to the way you view data, then return or advance forward through several cycles of changes.


















Counter Monitor

INTRODUCTION

Counter Monitor identifies system-level issues in applications. It is used to track system activities while the application runs on the system.

Counter Monitor collects specific performance counter data, such as that of an application, an OS, or a hardware device, at different intervals of time. The counter monitor collector monitors and graphically displays the performance counter data.

A performance counter is a feature that measures and gathers performance-related data representing the state of the system without affecting the performance of the program.

Counter monitor also helps you to understand the cause-and-effect relationship between an application and the system on which the application is running. If you develop application-specific counters using performance DLLs, the VTune analyzer will also monitor and display these counter values.

FEATURES OF COUNTER MONITOR

The following are the main features of the counter monitor collector and views:

1. Collection

- Trigger mechanism. Create triggers to monitor hardware and software counters at predetermined intervals according to criteria that you set.

A trigger is an event that tells the VTune™ Performance Analyzer when to collect counter data. The VTune analyzer uses the system timer as the default trigger. For the system timer, performance data is collected once per second when the default interval (1000 milliseconds) is used.
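
A timer trigger of this kind amounts to reading a counter once per interval and logging the value. The sketch below assumes a hypothetical read_counter callable that returns the current counter value; the name log_counter and the injectable sleep parameter are also inventions for the example, not part of any VTune interface.

```python
import time


def log_counter(read_counter, samples, interval_ms=1000, sleep=time.sleep):
    """Collect `samples` readings, one per timer trigger.

    read_counter: hypothetical zero-argument callable returning the
    current counter value.
    Returns a list of (elapsed_ms, value) pairs.
    """
    log = []
    for tick in range(samples):
        # The trigger fires: record the counter value for this interval.
        log.append((tick * interval_ms, read_counter()))
        if tick + 1 < samples:
            sleep(interval_ms / 1000.0)
    return log


# Usage with made-up readings and a no-op sleep, so it returns at once.
readings = iter([3, 8, 5])
logged = log_counter(lambda: next(readings), samples=3, sleep=lambda s: None)
```

Passing a different sleep function keeps the example quick to run; with the default time.sleep the loop would wait one second between readings, matching the default interval described above.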

2. Views

The following counter monitor views help you analyze the data:

- Runtime Data view. During runtime, the VTune analyzer generates a graph that shows changes as they happen. View data as you log it or review data after the run. This is the default view, which runs on completion of an activity.
- Logged Data view. Displays data logged during an Activity. In the Logged Data view, data from each counter selected for logging is charted with a separate line and color.



Each line on the chart represents data for a specific performance counter. The peak indicates the highest counter value. Moving the cursor over a counter on the chart displays a tool tip with the value of the counter at that point in time during data collection.

FIGURE 6: Logged Data View of Counter Monitor

The peaks in each counter indicate the highest counter activity. For example, a peak in the counter that measures Page Faults per second indicates that the most page faults occurred at that point in time during data collection.

- Legend view. Each line includes a distinct legend symbol for the corresponding counter, representing the point at which data was taken. The vertical Y axis represents counter values (scaled or actual), while the corresponding time is displayed on the horizontal X axis.





- Summary Data view. Displays a statistical view of the counter data. The Summary Data view provides statistical information for each counter you selected for display in the Logged Data view. This information includes:

  - minimum value
  - maximum value
  - average value

This enables you to determine which values were the most active, or otherwise interesting, and drill down from a Logged Data view of those values.


FIGURE 7: Summary Data View of Counter Monitor

The summary data for each counter is represented as a bar diagram, where the upper part of the diagram is the maximum value for the counter (in the example: % Total Processor Time counter), the lower part is the minimum value, and the middle part (violet bar in the example) is the average counter value.
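
The three statistics shown in the Summary Data view are straightforward to compute from logged values. This is a minimal sketch: the function name summarize and the dictionary layout are assumptions, and the counter readings are made up; only the counter names come from the text above.

```python
from statistics import mean


def summarize(logged):
    """Return minimum, maximum and average for each logged counter.

    logged: dict mapping a counter name to its list of logged values
    (a hypothetical in-memory stand-in for the Logged Data view).
    """
    return {
        name: {"min": min(values), "max": max(values), "avg": mean(values)}
        for name, values in logged.items()
        if values  # skip counters with no logged data
    }


# Made-up readings for two counters named in this document.
summary = summarize({
    "% Total Processor Time": [20, 80, 50],
    "Page Faults per second": [10, 40, 25],
})
```

The "max" entry corresponds to the upper part of each bar, "min" to the lower part and "avg" to the middle part.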

3. Accessories

The following are some options available from the counter monitor view:

- Control charts. Choose a chart style best suited to the data you want to view using the Chart FX Properties.

WORKING OF COUNTER MONITOR

When you select an Activity with the counter monitor collector in the Tuning Browser and click Run Activity to begin performance data collection, the VTune analyzer does the following:

1. Launches the specified application, if any.

2. Starts monitoring and logging the counter values. The VTune analyzer collects performance data for all the counters of a performance object but displays only the counters you select.

3. Displays the Runtime Data view with a chart showing the counter data as it is being collected, if the runtime display option is selected.

4. If sampling data collection was turned on, it also starts collecting time-based or event-based sampling data.

5. At the end of an Activity run, if counter monitor data was logged, the VTune analyzer does the following:

   o Creates an Activity result with the counter monitor data and shows it in the Tuning Browser.

   o Displays the counter monitor Logged Data view if the counter monitor data is the only type of data that was collected, or prompts you to pick a view to open if multiple types of data were collected.











Tuning Assistant

INTRODUCTION

The Intel(R) Tuning Assistant provides advice on tuning your system resources and application performance. Using its multiple knowledge bases, the Tuning Assistant analyzes the data collected by the VTune(TM) Performance Analyzer, identifies performance issues, and provides insights and tuning advice on the following types of data:

- Sampling data collected on supported processors
- Counter monitor data collected on supported operating systems
- C, C++, Fortran, or Java* source code
- Disassembled assembly code

TUNING ASSISTANT CONCEPTS

The following are some key Tuning Assistant concepts:

- Workload. All the software that was executing when data was collected.
- Insight. An insight is an observation about the performance of your code. It indicates a potential performance problem that could be a bottleneck to your application's performance.
- Advice. Advice is a possible solution or recommended workaround (usually a suggestion to modify the code) to remove or avoid a performance problem.
- Relevance Score. A relevance score is a heuristic to indicate how relevant a particular insight or advice is to the current context. For instance, an extremely high relevance score for an insight may indicate a high probability of a performance bottleneck.

The Tuning Assistant provides tuning advice for code, processes/modules/functions, or time ranges that you select in source, sampling, or counter monitor views. If you provide symbol information, the Tuning Assistant window provides links from your function names directly to the corresponding code section in Source View.

FEATURES OF TUNING ASSISTANT

The Intel(R) Tuning Assistant has the following features to enable analyzing the performance of your application:

- Provides insights and advice on potential performance problems by analyzing sampling data collected on supported processors (see the Release Notes for a complete list of processors for which the Tuning Assistant can provide insights and advice). You can use the insights and advice to make algorithmic changes to your application so the processor can execute your application more efficiently.
- Contains knowledge bases to support Hyper-Threading Technology.
- Enables you to compare two or three Activity results.
- Provides links from function names directly to the corresponding code section in source view when you provide symbol information.
- Provides advice on performance counter data and disassembly code.
- Provides static assembly advice.
- Guides you through the key steps of performance tuning methodology.
- Provides the ability to export the tuning advice report to a .csv (comma separated values) text file for viewing and editing using a different application, such as Microsoft* Excel.

UNDERSTANDING TUNING METHODOLOGY

1. System-Level Tuning - The main objective of system-level tuning is to optimize the utilization of system resources. This tuning speeds up application performance by improving the way the application interacts with the system. It is effective for I/O applications.

2. Application-Level Tuning - The main purpose of application-level tuning is to reduce the execution time of an application. This can be achieved by improving the algorithms of the application, implementing threads, and using APIs.

3. Microarchitecture-Level Tuning - Increases the performance of an application by improving the way the application runs on a processor. This type of tuning is used with processor-intensive applications.

STRATEGIES FOR IMPROVING PERFORMANCE OF APPLICATION

- Balancing input/output - Speeds up the application when processor utilization is low. Processor utilization drops when the processor is waiting for I/O to complete. This requires changes to the application during system-level and application-level tuning.
- Improving the threading model - Adding multithreading to a single-threaded application improves its efficiency by increasing processor utilization.
- Improving the efficiency of computation - Speeds up the application by changing it to accomplish the same amount of work using less computation.
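
As a small illustration of the last strategy, doing the same work with less computation, the sketch below compares a naive recursive function with a cached version of it. The Fibonacci function is used purely as a stand-in workload and is not taken from the VTune documentation; the call counters exist only to make the saving visible.

```python
from functools import lru_cache

# Number of times each function body actually executes.
calls = {"naive": 0, "cached": 0}


def fib_naive(n):
    """Recomputes the same sub-results over and over."""
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same result, but each distinct argument is computed only once."""
    calls["cached"] += 1
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


same_result = fib_naive(20) == fib_cached(20)
```

Both functions return the same value, but the cached version executes its body once per distinct argument, so far fewer calls are recorded in calls["cached"] than in calls["naive"].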


TYPES OF ADVICE

- Sampling-based advice - The Tuning Assistant automatically analyzes the sampling data, identifies performance issues, and provides insights on the issues. When you click an insight, the More Information window provides additional information. This window contains a Relevance scale that can be used to view the relevance of a particular insight to performance issues.

FIGURE 8: Advice Window (Showing Sampling-based advice) of Tuning Assistant Advice

- Counter Monitor-based advice - The Tuning Assistant performs counter analysis based on all counters measured in the Activity. After analysis, it displays insights into potential performance bottlenecks.
- Source-based advice - The Tuning Assistant uses a compiler technology for source-based advice, which enables you to speed up the execution of code. It is limited to C, C++ and Java applications.
- Static Assembly Penalties - The VTune analyzer analyzes code at the assembly language level. The two categories of information that the Tuning Assistant displays are:

  1. Penalty - Indicates a specific problem and the effect of the problem on the performance of the code.

  2. Warning - Indicates potential problems that might degrade performance.

INFORMATION THAT TUNING ASSISTANT PROVIDES INCLUDES:

- INSIGHTS - Indicate the problems that could be hindering the performance of the application. The categories of insights are:

  1. Top Insights - Insights that are estimated to have a significant impact on performance. They let you identify the maximum optimization that you can achieve for the application.

  2. Workload Insights - Performance issues for all modules and processes. (See fig. 8)

  3. Module Insights - Focus on performance issues for the modules in an application. (See fig. 8)

  4. Hotspots Insights - Insights on performance issues based on functions that are sorted by percentage of CPU time.

  5. System Info - Summarizes the features of the system, such as the speed of the processor and the name of the operating system.

  6. Static Analysis - View information about possible optimizations to improve application performance.

FIGURE 9: More Information Window of Tuning Assistant Advice

- RELEVANCE SCALE - Indicates the relevance of the insight or advice to a particular performance issue. For example, a high relevance score indicates that the effect of the problem on the application is significant or 100%. (See fig. 9)
- TUNING ASSISTANT ADVICE - A possible solution to remove or avoid a problem. You can click the links shown in fig. 8 to get advice.









References

The following references were used in preparing this document.

Help file - The Intel VTune software help file.

Books -

1. Intel VTune Performance Analyzer Essentials (Author: James Reinders)

2. 3rd Semester Intel VTune (By NIIT)

Websites -

1. www.intel.com

2. www.hiperism.com