Intel® Profiler for OpenCL* API
Version 2.1 Beta
USER GUIDE EDITION 1.0
*other names and brands may be claimed as the property of others
Contents
Installation
Licensing
Tool Usage
Profiling OpenCL* Applications
Kernel Statistics
GPU/CPU Utilization Tab
    Overview Tab
    Comparison Tab
    Time Line Tab
Memory Performance Tab
    Overview Tab
    API Data Tab
Hardware Counters (Experimental)
Sessions
Help - About
Installation
1. The downloaded installation package contains a user guide, a readme, and an installer (setup).
Double-click Setup to launch the installer.
2. The installer checks whether Java and the MS VC++ redistributables are available on the
system. If these components are not available, the installer installs them
before the profiler is installed.
3. Java installation:
Download JRE
Click “Install” to install Java. The Profiler tool will not work without the JRE. There is no need to
change the destination folder; let Java install to the default folder.
Click “Close” after successful installation of Java.
4. Install VC++ redistributables.
VC++ redistributables are installed if not already installed.
5. Install Profiler For OCL
Click Next to continue installation
Please review the license agreement completely. If the license is acceptable, click the
“I accept the terms in the license agreement” radio button and click “Next”
Click “Install” to continue to installation
There may be a couple of UAC pop-ups. Click “Okay” or “Yes” on them to allow the
installation to continue.
Hit “Finish” to complete installation.
Upon complete installation, a shortcut is placed in the start menu to launch the
Profiler tool.
A shortcut for the tool can also be found on the desktop.
Licensing
Upon launching the GUI, a splash screen comes up, displaying the version of the tool.
The use of the Intel Profiler for OpenCL* is licensed for a limited duration. The License
Checker is then launched to determine if a valid license exists for this tool. If a valid license
is found, the GUI automatically validates the license and starts up.
If a license is not found, the popup box below for License Configuration is launched.
The license is tied to a single, specific host computer. To obtain the license file, send
an e-mail to <[email protected]> that includes your computer's COMPOSITEID.
The COMPOSITEID is obtained by clicking the Machine ID button.
If there is an email client installed on the system, click the “Email” button. The License
Configuration window launches the client with the necessary information to be sent.
If the email client cannot be launched, a message pops up with the information that needs to be sent to
Intel. Copy and paste this information into an email to [email protected].
Within a couple of hours, a license file will be issued to authenticate the installation.
Place the license file on the system for which the license was obtained and on which the tool
will run. Then click the “Browse” button and choose the license file.
As soon as authentication is successful, a pop up confirming registration is displayed.
NOTE: Please DO NOT delete the license file or move it to a different location. A license check
is performed every time the tool is launched. If the license file is deleted, send in another
license request. If the license file is moved, go back to the License Checker and point it to the new
location of the license file. The License Checker can be launched from the top menu:
Settings -> License Configuration
Tool Usage
On launch, the tool displays the “Device Information” of all OpenCL* compliant
devices in the platform. The Device Information tab displays:
1. Device capabilities
2. Extensions supported
3. OpenCL* version supported
4. Driver version
5. Local/Global Memory
6. Compute units available
7. Work group size supported by each device
There is one tab per OpenCL* compliant device on the platform.
Intel CPU OpenCL* device properties:
Intel GPU OpenCL* device properties:
Profiling OpenCL* Applications
From the “Action” pull-down menu, select “New”.
This action creates a new session for profiling.
Click the “Browse” button.
A pop-up window prompts you to choose the OpenCL* application.
Navigate to the directory that contains the OpenCL* application (.exe) to be profiled, select the application, and click “Open”.
The selected application will appear in the OpenCL* App bar.
If the application has any parameters to run, enter those in the “Workload Arguments” text
box.
To dump detailed API-level information, enter the parameter “-dumpstream” in
the “Parameters” text box.
Click the “Profile” button.
Note: Dumpstream dumps detailed API-level information about all the OpenCL* kernels in the
application being profiled. This information includes:
1. A dump of all OpenCL* API calls made in the application
2. A dump of the kernel code, if the application uses the clCreateProgramWithSource API call
Click the “Profile” button to start profiling.
The application being profiled is launched and the tool collects all the necessary profiling
data.
Ensure that the words "Profiling Completed" can be seen in the console at the bottom of the
screen once profiling is completed.
Notice that next to the Application Tab, there are now three additional tabs:
Kernel Statistics.
GPU/CPU Utilization.
Memory Performance.
The collected profiling data is organized into these three tabs. Each of the three tabs is discussed in
detail in the following sections.
All the applications profiled in a particular session are listed in the “Apps Profiled” navigation
menu on the right-hand side.
Users can navigate from the profiling data of one OpenCL* application to that of another.
Users can also import profiling data from another platform into the current session and
compare it.
Kernel Statistics
This tab displays information relevant to all the OpenCL* Kernels in the OpenCL* application
being profiled.
A summary on the complexity of the application being profiled is displayed in this tab:
1. Total number of OpenCL* API calls made in the application.
2. Total number of OpenCL* Kernels executed in the application.
3. Total number of clEnqueueNDRangeKernel calls made in the application.
A summary of all the OpenCL* Kernels executed in the OpenCL* application is also displayed.
For each kernel, it shows the number of arguments, the work dimension, the global/local
work-group size, and the average time the kernel took to execute on the hardware.
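As a rough illustration of how a per-kernel average like the one in this summary could be derived, the sketch below averages the execution times of all instances of each kernel. The function name, the record layout, and the sample values are hypothetical; they are not the profiler's data format or output.

```python
from collections import defaultdict


def average_kernel_times(instances):
    """Return the mean execution time per kernel name.

    `instances` is an iterable of (kernel_name, time_in_ms) pairs, one
    pair per kernel instance. This layout is an assumption made for
    illustration only.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, time_ms in instances:
        totals[name] += time_ms
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical measurements: two instances of "blur", one of "scale".
samples = [("blur", 2.0), ("blur", 4.0), ("scale", 1.5)]
averages = average_kernel_times(samples)
```

A kernel that is enqueued many times contributes one average to the summary, so large variation between its instances is better examined per instance.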
A comprehensive list of all the OpenCL* API calls executed in the OpenCL* application being
profiled, along with all their parameters and return values, is displayed in the API File tab.
The source code for all the OpenCL* Kernels executed in the OpenCL* application being
profiled is also displayed.
Note: The Kernel source code is displayed only if clCreateProgramWithSource is used in the
OpenCL* application.
Note: The source code is displayed with Visual Studio-style syntax highlighting.
GPU/CPU Utilization Tab
Click the GPU/CPU Tab. Notice the three feature tabs labeled
Overview
Comparison
Time Line
Overview Tab:
The Overview tab displays the hardware execution times for all the
OpenCL* Kernels in the OpenCL* application being profiled. The checkboxes on the right can
be used to pick which OpenCL* Kernels are studied in detail.
Bar graphs comparing each OpenCL* Kernel against the entire OpenCL* application can also be
generated.
A pie chart showing how much time the OpenCL* application spent executing
OpenCL* Kernels and building OpenCL* Kernels is also displayed.
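The percentages behind such a pie chart are simple shares of a total. The sketch below shows the arithmetic only; the category labels, the "other" bucket, and the durations are hypothetical and do not come from the profiler.

```python
def time_shares(durations):
    """Convert a {label: duration} mapping into {label: percent of total}."""
    total = sum(durations.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    return {label: 100.0 * t / total for label, t in durations.items()}


# Hypothetical breakdown of application time, in milliseconds.
shares = time_shares({
    "kernel execution": 30.0,
    "kernel build": 10.0,
    "other": 60.0,
})
```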
All of the charts and graphs from the GUI can be saved as images so they can be pulled in to
reports and documents.
Comparison Tab:
The Comparison tab enables the user to import profiling data from different runs or from
different platforms and compare the OpenCL* Kernel execution times.
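To make the idea of comparing kernel execution times across runs concrete, here is a minimal sketch that computes a per-kernel speedup between two runs. The dictionaries, kernel names, and timings are hypothetical stand-ins for imported profiling data, and the sketch assumes all times are positive.

```python
def compare_runs(baseline, candidate):
    """Return baseline_time / candidate_time for kernels present in both runs.

    A value above 1.0 means the kernel ran faster in `candidate`.
    Kernels that appear in only one run are skipped.
    """
    common = sorted(set(baseline) & set(candidate))
    return {name: baseline[name] / candidate[name] for name in common}


# Hypothetical average kernel times (ms) from two runs or platforms.
run_a = {"blur": 4.0, "scale": 1.5}
run_b = {"blur": 2.0, "scale": 3.0, "sharpen": 1.0}
speedups = compare_runs(run_a, run_b)
```

Reporting a ratio rather than a difference keeps short and long kernels comparable on the same scale.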
Time Line Tab:
The Time Line tab shows the entire application on a timeline.
With this view, users can see which activities happened at what time in the
application. The radio buttons at the bottom let the user pick which kernels to
study in detail.
The rectangular bars denote the instances of a kernel. The circles denote the times at
which the kernels were created (clCreateKernel call).
Hovering over a circle shows more information about the creation of the kernel.
Hovering over a rectangular bar shows more details about that kernel instance,
including the instance number, the time taken for the kernel to execute on hardware,
the global/local work size, and the work dimension.
Right click on timeline graph, choose “Options” -> “Show Y Axis” to display the kernel names
next to the graph.
Memory Performance Tab
This tab displays all the memory events that occur in the application. The data displayed in
this tab is useful for studying bandwidth constraints and latencies.
Overview Tab:
The Overview tab displays the kinds of memory access calls made and how many of each are
made.
A summary of all OpenCL* memory calls is displayed, including the amount of memory transferred,
the time taken to transfer it, and the resulting bandwidth.
Pie charts showing the distribution of the different types of memory operations, by time and by size,
are also displayed.
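Bandwidth in a summary like this follows from the other two quantities: bytes transferred divided by transfer time. The sketch below shows that calculation under the assumption of decimal megabytes (1 MB = 1,000,000 bytes); the function name and sample values are hypothetical, and the units the profiler itself reports are not stated here.

```python
def bandwidth_mb_per_s(bytes_transferred, seconds):
    """Return the transfer bandwidth in decimal megabytes per second."""
    if seconds <= 0:
        raise ValueError("transfer time must be positive")
    return bytes_transferred / seconds / 1_000_000


# Hypothetical transfer: 64,000,000 bytes moved in 0.25 seconds.
bw = bandwidth_mb_per_s(64_000_000, 0.25)
```

Very small transfers tend to be dominated by fixed call overhead, so their computed bandwidth says more about latency than about sustained throughput.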
API Data Tab:
This tab displays the type of each memory transfer operation, along with the size and address of the
memory buffers touched in the entire application.
A comprehensive list of all the buffer and image operations is also provided.
Hardware Counters (Experimental)
Hardware counters enable users to see the utilization of execution units in the GPU during
the execution of a kernel. To enable hardware counters, enter the parameter
“-kernelcounters” in the “Parameters” text box. This parameter enables
hardware counters for all the kernels in the application. Advanced options are available to
fine-tune the metrics.
-kernelcounters=<options>
Options:
-interval=<interval_value in microseconds> (default: 1000 microseconds)
-kernel=<kernel1,kernel2> (view counters for specific kernels)
Interval controls the granularity of sampling. There can be performance impacts if the
interval is too low.
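To get a feel for the granularity trade-off, the sketch below estimates how many counter samples fall within one kernel instance for a given interval. It assumes one sample per full elapsed interval, which is a simplification made for illustration; the tool's actual sampling behavior is not described here, and the function name is hypothetical. The default of 1000 microseconds matches the documented default.

```python
def estimated_samples(kernel_time_us, interval_us=1000):
    """Estimate the number of full sampling intervals within one kernel instance.

    Assumes one sample per elapsed interval (an illustrative simplification).
    """
    if interval_us <= 0:
        raise ValueError("interval must be positive")
    return kernel_time_us // interval_us


# Hypothetical kernel instance that runs for 5000 microseconds.
default_count = estimated_samples(5000)      # default 1000 us interval
fine_count = estimated_samples(5000, 100)    # finer 100 us interval
```

Under this assumption, a smaller interval yields more data points per kernel at the cost of more sampling work, which is consistent with the performance caution above.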
The hardware counters are embedded in the TimeLine view. To view the hardware
counters for a kernel, double click the rectangular instance corresponding to a kernel.
The chart below shows Execution unit (EU) utilization and the frequency ratio for
graphics core.
Sessions
Multiple profiling runs of the same application or different applications can be saved as a
session using the “Save Session” menu from the “Action” menu item. Sessions created at a
different time on the same platform or a different platform can be imported using the “Load
Session” menu from the “Action” menu item.
The Comparison tab on the GPU/CPU Utilization tab can be used to compare
multiple sessions.