Create Smarter Code Smarter with Intel® System Studio 2016

Naveen GV, Software & Services Group

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.

Optimization Notice


Smart, Connected Devices are Growing in Complexity and are Everywhere
Increasing the Challenges for System and Embedded Developers

To address these challenges, software developers need tools that…
• Are comprehensive and easy to use
• Quickly help resolve defects in complex systems
• Offer insight into sources of excess power consumption
• Enable and accelerate performance-demanding use cases

[Figure: market segments shown: Networks & Communication, Transportation, Medical, Industrial, Military, Aerospace, Government, Retail, Imaging, Digital Security Surveillance, Client & Mobile, Cloud / data centers / storage, IoT Devices, Gateways]


Deep system-wide insight for system and embedded developers

Accelerate Time to Market

Strengthen System Reliability

Boost Power Efficiency and Performance

Create smarter code — smarter

Intel® System Studio


What’s Included in Intel® System Studio

Debug & Trace (System Software)
• Intel® System Debugger: UEFI, OS, drivers through JTAG
• Intel® Debug Extensions for WinDbg*: Windows* stack, WinDbg* over JTAG

Build & Optimize (Systems, Embedded Applications)
• Intel® C++ Compiler, incl. Intel® Graphics Technology offload
• Intel® Integrated Performance Primitives
• Intel® Math Kernel Library
• Intel® Threading Building Blocks
• Intel-enhanced GDB*
• IDE support: Eclipse*-based, Visual Studio*

Analyze (Performance, Power, Correctness)
• Intel® VTune™ Amplifier: code performance on CPU, time- and event-based
• Intel® Energy Profiler: system-wide power efficiency (wake-up, sleep-state, frequency, temp.)
• Intel® Frame Analyzer: graphics performance (OpenGL ES, DirectX)
• Intel® System Analyzer: CPU/GPU workloads in real time
• Intel® Platform Analyzer: CPU/GPU workloads, offline and detailed
• Intel® Inspector: application robustness, memory leaks

Editions: Composer Edition, Professional Edition, Ultimate Edition


OS, IDE Support

Intel® System Studio 2016 <Edition> for Linux*
Intel® System Studio 2016 <Edition> for Windows*

Host OS: developer’s PC
Target OS: embedded system
IDE support (tools integrating into): Eclipse*, Wind River* Workbench*, Microsoft* Visual Studio*

Intel® System Studio 2016 for FreeBSD*
Including:
• Intel® C++ Compiler installs on target system
• Intel® VTune™ Amplifier installs on Windows*, Linux*, OS X* host


Intel® System Studio - Composer Edition
Build performance optimized code

• Great code performance
• Application remote debug for robust code
• Libraries for performance demanding code routines
• Unified threading methodology across target OS platforms
• Integrates into common IDEs

Build & Optimize (Systems, Embedded Applications)
• Intel® C++ Compiler: IA-optimized compiler, incl. Intel® Graphics Technology offload
• Intel® Integrated Performance Primitives: IA-optimized libraries for image, signal, data processing
• Intel® Math Kernel Library: 1D, 2D, 3D FFT, and others
• Intel® Threading Building Blocks: threading library, unified templates for Windows, Linux targets
• Intel®-enhanced GDB debugger: application debugger for Linux*, Android*
• IDE support (Eclipse* IDE, Workbench*, Visual Studio*): Eclipse, Workbench for Linux* target OS; Visual Studio for Windows* target OS


Enable and Optimize Compelling System and Application Usages
Highly Optimized Compilers and Libraries

Performance gain for embedded applications for Windows*
• Intel® C++ Compiler for Windows* vs. Microsoft* Compiler

Performance gain for embedded applications for Linux*/Android*
• Intel® C++ Compiler for Linux*/Android* vs. GCC*

Performance gain for demanding image, signal, data processing
• Intel® C++ Compiler – Code Offload to Intel® Graphics Technology
• Intel® Integrated Performance Primitives
• Intel® Math Kernel Library

Performance gain callouts on the slide: up to 2x, 1.5x, 4x

For tests and configurations refer to backup slides


Intel® Integrated Performance Primitives (Intel® IPP)

Enhances Developer Productivity

Optimized image, signal and data processing routines

Thread-safe functions

Industry Leading Performance

Instruction set-level optimizations (SIMD)

Efficient parallelism on multicore platforms

Support for Latest Processor Architectures

Optimized for current multi-core processors

Applications benefit seamlessly


Cross Platform and Operating System Support

Multi OS:

• Windows*

• Linux*

• OS X*

• Android*

• VxWorks*

Multi Platform:

• Mobile and Embedded (Intel® Quark™, Intel® Atom™)

• Tablet (Intel® Atom™, Intel® Core™)

• Ultrabook/PC (Intel® Core™)

• Servers and Workstations (Intel® Xeon® and Intel® Xeon Phi™)


Intel® Integrated Performance Primitives
What’s New

Additional optimization for Intel® Quark™, Intel® Atom™, and processors with Intel® AVX2 instruction support
• Intel® Quark™: data compression, cryptography optimization
• Intel® Atom™: computer vision, image processing optimization
• Intel® AVX2: computer vision, image processing optimization

New APIs to support external threading

Improved CPU dispatcher
• Auto-initialization: no need for the CPU initialization call in static libraries
• Code dispatching based on CPU features

Optimized cryptography functions to support SM2/SM3/SM4 algorithms

Custom dynamic library building tool

New APIs to support external memory allocation
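The "code dispatching based on CPU features" item can be pictured as a table lookup that picks the most specialized routine the processor supports. The sketch below is a generic illustration in Python, not Intel® IPP's implementation; the kernel names, the feature strings, and `select_kernel` are all hypothetical.

```python
# Hypothetical illustration of CPU-feature-based dispatch; not the Intel IPP API.

def sum_generic(values):
    """Baseline kernel that runs on any CPU."""
    return sum(values)

def sum_sse42(values):
    """Stand-in for a kernel tuned for SSE4.2 (same result, different code path)."""
    return sum(values)

def sum_avx2(values):
    """Stand-in for a kernel tuned for AVX2."""
    return sum(values)

# Ordered from most to least specialized: (required features, kernel).
KERNELS = [
    ({"avx2"}, sum_avx2),
    ({"sse4.2"}, sum_sse42),
    (set(), sum_generic),
]

def select_kernel(cpu_features):
    """Return the first kernel whose required features the CPU provides."""
    available = set(cpu_features)
    for required, kernel in KERNELS:
        if required <= available:
            return kernel
    raise RuntimeError("no kernel available")

# The selection is made once and the chosen kernel is then called directly.
kernel = select_kernel({"sse2", "sse4.2"})
print(kernel.__name__, kernel([1, 2, 3]))
```

Because the baseline entry requires no features, the lookup always finds a kernel; a library that auto-initializes would simply run this selection on first use instead of asking the caller to do it.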


‒ Speeds math processing in scientific, engineering and financial applications

‒ Functionality for dense and sparse linear algebra (BLAS, LAPACK, PARDISO), FFTs, vector math, summary statistics and more

‒ Provides scientific programmers and domain scientists with:
   • Interfaces to de-facto standard APIs from C++, Fortran, C#, Python and more
   • Support for Linux*, Windows* and OS X* operating systems
   • The ability to extract great performance with minimal effort

‒ Unleashes the performance of Intel® Core™ and Intel® Xeon® product families:
   • Optimized for single core vectorization and cache utilization
   • Coupled with automatic OpenMP*-based parallelism for multi-core and manycore

Batch GEMM functions

– Improve the performance of multiple, simultaneous matrix multiply operations

– Provides grouping (the same sizes and leading dimensions) and batching across groups
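The grouping idea can be sketched in a few lines: products that share the same sizes form a group, and one call processes every group. This is a plain-Python illustration of the concept only, not the Intel® MKL interface; `matmul` and `batch_gemm` are hypothetical names, and a real implementation would run the products in parallel rather than in a loop.

```python
# Illustrative sketch of grouped batch matrix multiply; not the Intel MKL API.

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def batch_gemm(groups):
    """groups: list of groups; each group is a list of (A, B) pairs that share
    the same shapes. Returns one list of products per group."""
    results = []
    for group in groups:
        # The shape of the first pair defines the group; every pair must match it.
        m, k, n = len(group[0][0]), len(group[0][1]), len(group[0][1][0])
        for a, b in group:
            assert (len(a), len(b), len(b[0])) == (m, k, n), "mixed sizes in group"
        results.append([matmul(a, b) for a, b in group])
    return results

groups = [
    # Group 0: two 2x2 by 2x2 products.
    [([[1, 2], [3, 4]], [[1, 0], [0, 1]]),
     ([[2, 0], [0, 2]], [[1, 1], [1, 1]])],
    # Group 1: one 1x3 by 3x1 product.
    [([[1, 2, 3]], [[1], [1], [1]])],
]
print(batch_gemm(groups))
```

Knowing that all products in a group have identical dimensions is what lets a library choose one blocking strategy per group instead of one per multiplication.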

Sparse BLAS inspector-executor API

– Matrix structure analysis brings performance benefit for relevant applications (e.g. iterative solvers)

– Parallel triangular solver

– Both 0-based and 1-based indexing, row-major and column-major ordering
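The inspector-executor pattern splits the work in two: a one-time analysis of the matrix structure, followed by many executions that reuse what the analysis found. The sketch below shows that split for a CSR matrix with 0-based indexing; the class, its methods, and the single "is it diagonal?" check are hypothetical simplifications, not the Intel® MKL API.

```python
# Illustrative inspector-executor pattern for sparse matrix-vector products.
# The class and method names are hypothetical; this is not the Intel MKL API.

class CsrMatrix:
    def __init__(self, row_ptr, col_idx, values):
        self.row_ptr, self.col_idx, self.values = row_ptr, col_idx, values
        self.is_diagonal = None  # filled in by inspect()

    def inspect(self):
        """Inspector: one-time structure analysis (here: detect a diagonal matrix)."""
        n = len(self.row_ptr) - 1
        self.is_diagonal = all(
            self.row_ptr[i + 1] - self.row_ptr[i] == 1
            and self.col_idx[self.row_ptr[i]] == i
            for i in range(n)
        )
        return self

    def multiply(self, x):
        """Executor: pick a kernel based on what inspect() learned."""
        if self.is_diagonal:
            # Fast path: no index lookups needed.
            return [v * xi for v, xi in zip(self.values, x)]
        y = []
        for i in range(len(self.row_ptr) - 1):
            start, end = self.row_ptr[i], self.row_ptr[i + 1]
            y.append(sum(self.values[p] * x[self.col_idx[p]]
                         for p in range(start, end)))
        return y

diag = CsrMatrix([0, 1, 2], [0, 1], [2.0, 3.0]).inspect()
general = CsrMatrix([0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0]).inspect()
print(diag.multiply([1.0, 1.0]), general.multiply([1.0, 1.0]))
```

The analysis cost is paid once, which is why the approach suits iterative solvers that multiply by the same matrix many times.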

GEMMT functions calculate C = A * S * A^T, where S is symmetric and/or diagonal

Counter-based pseudorandom number generators

– ARS-5 based on the Intel AES-NI instruction set

– Philox4x32-10
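A counter-based generator has no hidden state to advance: each output block is a pure function of a key and a counter, so any element of the stream can be computed directly and independently. The sketch below follows the published Philox4x32-10 construction (ten multiply-and-XOR rounds with a key schedule); it is an illustration written for this text, has not been checked against known-answer vectors, and is not the Intel® MKL implementation.

```python
# Sketch of a counter-based generator following the published Philox4x32-10
# construction; not validated against known-answer vectors and not MKL code.

MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57   # round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85   # per-round key increments

def philox4x32_10(counter, key):
    """counter: four 32-bit ints, key: two 32-bit ints -> four 32-bit ints."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for _ in range(10):
        p0 = M0 * c0  # 64-bit product, split into high and low halves below
        p1 = M1 * c2
        c0, c1, c2, c3 = (
            (p1 >> 32) ^ c1 ^ k0,
            p1 & MASK32,
            (p0 >> 32) ^ c3 ^ k1,
            p0 & MASK32,
        )
        k0 = (k0 + W0) & MASK32
        k1 = (k1 + W1) & MASK32
    return (c0, c1, c2, c3)

def block(stream_key, index):
    """Random access: the index-th block of a stream, with no prior state."""
    return philox4x32_10((index & MASK32, 0, 0, 0), stream_key)

# Blocks can be produced in any order, for example by different threads.
print([hex(w) for w in block((0, 0), 5)])
```

Because a block depends only on `(key, counter)`, parallel workers can each take a disjoint counter range and reproduce exactly the same stream as a serial run.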

Intel® Math Kernel Library (Intel® MKL)


What‘s New


Intel® Threading Building Blocks (Intel® TBB)

Specify tasks instead of manipulating threads

Intel® TBB maps your logical tasks onto threads with full support for nested parallelism

Targets threading for scalable performance

Uses proven, efficient parallel patterns

Uses work stealing to balance the load when task execution times are unknown

Flow graph feature allows developers to easily express dependency and data flow graphs

Has high-level parallel algorithms and concurrent containers, and low-level building blocks such as a scalable memory allocator, locks and atomic operations.

Open-source and licensed versions available on Linux, Windows, OS X, Android

Commercial support for Intel® Atom™, Core™, Xeon® processors, and for Intel® Xeon Phi™ coprocessors
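Work stealing can be illustrated with a toy, single-threaded simulation: each worker owns a queue of tasks, and a worker that runs dry takes a task from the fullest queue instead of idling. This is a teaching sketch under simplified assumptions (discrete time ticks, known integer task costs, a hypothetical `simulate` function); it is not how the Intel® TBB scheduler is implemented.

```python
# Toy simulation of work stealing; not the Intel TBB scheduler.
from collections import deque

def simulate(task_costs_per_worker, steal=True):
    """Return the number of ticks needed to finish all tasks.

    task_costs_per_worker: one list of positive integer task costs per worker.
    """
    queues = [deque(costs) for costs in task_costs_per_worker]
    remaining = [0] * len(queues)  # ticks left on each worker's current task
    total = sum(len(q) for q in queues)
    done = 0
    ticks = 0
    while done < total:
        # Idle workers pick up work: own queue first, then steal.
        for w, q in enumerate(queues):
            if remaining[w] == 0:
                if q:
                    remaining[w] = q.pop()  # owner takes from the back
                elif steal:
                    victim = max(range(len(queues)), key=lambda v: len(queues[v]))
                    if queues[victim]:
                        # Thief takes from the front of the fullest queue.
                        remaining[w] = queues[victim].popleft()
        ticks += 1
        # Advance every busy worker by one tick.
        for w in range(len(queues)):
            if remaining[w] > 0:
                remaining[w] -= 1
                if remaining[w] == 0:
                    done += 1
    return ticks

unbalanced = [[1] * 8, []]  # all eight tasks start on worker 0
print(simulate(unbalanced, steal=False), simulate(unbalanced, steal=True))
```

With stealing disabled the second worker never does anything and the run takes eight ticks; with stealing enabled the two workers share the queue and finish in four, without the programmer having assigned tasks to threads.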


What‘s New

Fully supported tbb::task_arena

• Task arenas provide improved control over workload isolation and the degree of concurrency.

Binary files for 64-bit Android* applications were added as part of the Linux* package.


What our customers say…

2x Productivity Increase

“The Intel® C++ Compiler, as part of Intel® System Studio for FreeBSD*, is nearly a drop-in replacement for Clang and GCC. Working with a code base of seven million lines, built with Clang and GCC, the effort to integrate Intel System Studio for FreeBSD* took only about three days. This was less than half as long as expected.”

Dell, Eric van Gyzen, Senior Software Development Engineer

50% Better Performance

“By using Intel System Studio, we could improve the performance of our Intel® architecture-based network video recorder systems by 50%.”

Zhejiang Dahua Technology Co., Cai Jian Feng, Product Line Manager


Intel® System Studio - Professional Edition
Includes Composer Edition
Analyze performance, power efficiency and code correctness

• Workload analysis to understand system behavior
• Code analysis for more responsive systems
• Frame analysis for fast graphics
• System-wide analysis to optimize energy efficiency
• Threading and memory leak analysis to improve system robustness

Analyze (Performance, Power, Correctness)
• Intel® VTune™ Amplifier: code performance on CPU, time- and event-based
• Intel® Energy Profiler: system-wide power efficiency (wake-up, sleep-state, frequency, temp.)
• Intel® Frame Analyzer: graphics performance (OpenGL ES, DirectX)
• Intel® System Analyzer: CPU/GPU workloads in real time
• Intel® Platform Analyzer: CPU/GPU workloads, offline and detailed
• Intel® Inspector: application robustness, memory leaks


Intel® VTune™ Amplifier
Faster, Scalable Code Faster

Get the Data You Need
• Hotspot (Statistical call tree), Call counts (Statistical)
• Thread Profiling – Concurrency and Lock & Waits Analysis
• Cache miss, Bandwidth analysis…
• GPU Offload and OpenCL™ Kernel Tracing

Find Answers Fast
• View Results on the Source / Assembly
• Graphical Frame Analysis
• Filter Out Extraneous Data – Organize Data with Viewpoints
• Visualize Thread & Task Activity on the Timeline

Easy to Use
• No Special Compiles – C, C++, C#, Java, ASM
• Visual Studio* Integration or Stand Alone
• Graphical Interface & Command Line
• Local & Remote Data Collection
• Analyze Windows*, Linux* & Android* data

[Screenshots: Quickly Find Tuning Opportunities; See Results On The Source Code; Visualize & Filter Data]


Intel® VTune™ Amplifier
What’s New

Analyze Application Performance on Preemptive Real-Time Linux*

Quickly and accurately pinpoint performance hotspots in preemptive Linux* systems

o Data collectors can be interrupted at any time by high-priority tasks, so precise performance profiling is a challenge in preemptive systems

Provides concurrency, waits and locks analysis and context switch information

[Diagram: Intel® VTune™ data collector and embedded real-time applications running on preemptive RT Linux*]

Analyze Application Performance in Virtualized Environments

Observe and analyze performance behavior of embedded applications running on guest OS instances

Optimize the performance of multiple OSes and applications in a virtualized environment on a single platform to save hardware cost

[Diagram: VM1, VM2 and VM3, each running a guest OS with an embedded application, on top of a hypervisor]


Intel® Energy Profiler
Quickly Identify Software That is Wasting Power

Android*, Windows* and now Linux* support

Extend battery life of IoT, mobile and embedded devices running Linux*

Optimize thermals of fanless systems

For more details on processor and platform support please visit: https://software.intel.com/en-us/intel-energy-profiler

The lower the core frequency over time, the better the power efficiency

C6 deep sleep mode reduces power consumption


Intel® Inspector - Memory & Thread Debugger

Memory usage graph plots memory growth

Select a cause of memory growth

See the code snippet & call stack

Speed diagnosis of difficult to find heap errors

Debugger Breakpoints
Diagnosing Some Errors Can Take Months

Races & deadlocks not easily reproduced

Memory errors can be hard to find without a tool

Debugger Integration Speeds Diagnosis

Breakpoint set just before the problem

Examine variables & threads with the debugger

Diagnose in hours instead of months


What our customers say…

5 minutes vs. 8+ hours
Productivity Increase

“IMCORP pioneers complex signal processing algorithms for power transmission cable diagnostics. Intel® VTune™ Amplifier, as part of Intel® System Studio, allowed us to find critical performance hotspots within 5 minutes that otherwise would take us more than 8 hours.”

IMCORP R&D Software Engineer

3X Better Power Efficiency

“Intel System Studio drastically improved the user experience of our recently launched Android*-based tablet, Tolino Tab* 8” (optimized for eReading), by a factor of 3x (200ms vs. 500-700ms), which reduced the CPU workload and the resulting power consumption by at least the same factor.”

Deutsche Telekom, Dirk Hofmann, Chief Product Owner


Intel® System Studio - Ultimate Edition
Includes Professional Edition
System-wide debug and trace for more robustness

• Holistic system-wide debug and trace
• For UEFI, OS, drivers, middleware
• Identify tricky bugs faster through event tracing
• Supports a variety of JTAG hardware interfaces
• OS awareness for Linux*, Android*, VxWorks* for more efficient debug cycles
• Full-stack debug for Windows* integrators

Debug & Trace (System Software)
• Intel® System Debugger: UEFI, OS, driver debug & trace through JTAG; Linux*, Android* target OS
• Intel® Debug Extensions for WinDbg*: Windows* stack, WinDbg* over JTAG; Windows* target OS


Intel® System Debugger
JTAG-based Debug and Trace over Low-cost USB Connection
What’s New

Flexibility – alleviates requirement for an accessible hardware JTAG port

Low-cost – debug over standard USB connection instead of expensive JTAG probe

[Diagram: two setups connecting Intel® System Debugger to the target system. Debug & trace from CPU reset: via the Intel® SVT Closed Chassis Adapter (1). Debug & trace OS boot: via a USB cable, JTAG data over physical USB port.]

(1) SVT = Silicon View Technology – more details: https://designintools.intel.com/product_p/itpxdpsvt.htm

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)


Intel® Debug Extensions for WinDbg*
System Debug and Trace Extensions for Microsoft* WinDbg* Kernel Debugger

Simplify platform bring-up and Windows* driver validation, now available with Microsoft* WinDbg* over JTAG

Debug a completely halted Windows* system including drivers and interrupts

Isolate complex run-time issues faster with Intel® Processor Trace

[Screenshot: JTAG and Intel® Processor Trace enhanced Microsoft* WinDbg* kernel debugger showing Intel® Processor Trace information; diagram labels: Hardware, Firmware]

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)


Intel® Debugger for Heterogeneous Compute 2016
Effectively Debug Compute Intensive Code Offloaded to Graphics Cores

Cooperatively execute compute intensive code across processor and graphics cores

Use simple compiler directives (#pragma) to mark code for offload

Debugger now available to debug code running on graphics cores

Intel® Core™ Processors and Intel® Xeon® Processors with Intel® HD or Intel® Iris™ Pro Graphics

[Diagram: compiler generated code offloaded from CPU cores to graphics cores; debug client showing source code of application that executes on graphics cores]


What our customers say…

Code Improvement

“Intel® System Debugger, as part of Intel® System Studio, enabled us to improve sensitive, hardware-dependent code in our industrial automation system software. It helped us to drastically reduce engineering efforts when analyzing processor internal states and execution of time-critical paths in our software.”

Dr. Henning Zabel, Beckhoff Automation


Support Newest Platforms
Added Support for New Intel Processors and Target Operating Systems

Support for recently launched versions of Intel® processors

o Intel® Atom™ x3 processors, formerly code-named SoFIA

o Intel® Atom™ x5, x7 processors, formerly code-named Cherry Trail

o 6th Generation Intel® Core™ processors, formerly code-named Skylake

Microsoft* Windows* 10

FreeBSD*



Enhanced Developer Productivity
Improved Out-of-the-Box Experience, IDE and Samples Included

Enhanced out-of-the-box experience

o Get started without actual target hardware using Wind River* Simics* platform simulation

Eclipse* IDE included

o Improved tools integration

More samples for a quicker start

Enhanced documentation



Intel® System Studio 2016 Summary
Deep System-wide Insight for System and Embedded Developers

Increases performance with expertly optimized compiler and libraries

Enhances power efficiency and performance with enhanced analyzers

Eases isolation of complex defects with new debug and trace capabilities

Extends support to the newest Intel platforms and operating systems

Improves developer productivity with expanded usability and capabilities

Create smarter code - smarter, with Intel System Studio
Learn more at: http://intel.ly/system-studio


Educating with Webinar series about “2016” tools
Expert talks about the new features

Series of live webinars, Sept 22nd – Oct 21st, 2015

Attend live, or watch after the fact.

http://tinyurl.com/webinars-intel2016

Topic | Time (Asia), Tuesdays | Time (USA/Europe), Wednesdays
Create Smarter Code Smarter with Intel® System Studio 2016 | Sep. 22nd, 11 PM (PST) | Sep. 23rd, 9 AM (PST)
Analyzing Performance Bottlenecks on Intel® Architecture | Oct. 6th, 11 PM (PST) | Oct. 7th, 9 AM (PST)
Migrating embedded applications to Intel x86 Architecture | Oct. 13th, 11 PM (PST) | Oct. 14th, 9 AM (PST)
Get Deep System Insight for 6th Generation Intel® Core™ Processors | Oct. 20th, 11 PM (PST) | Oct. 21st, 9 AM (PST)


Legal Disclaimer & Optimization Notice

INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Copyright © 2015, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804


Intel® System Studio: Editions, Components, and Operating Systems

Target OS | Editions | Host OS | IDE
Linux* (1, 2) | Composer, Professional, Ultimate | Linux*, Windows* | Eclipse*, Workbench*
Android* (2) | Composer, Professional, Ultimate | Linux*, Windows* | Eclipse*
Windows* | Composer, Professional, Ultimate | Windows* | Visual Studio*
VxWorks* (3) | VxWorks* Edition | Linux*, Windows* | Workbench*
FreeBSD* | FreeBSD* Edition | Linux*, FreeBSD* | Eclipse*

Components by category

Compiler & Libraries
• Intel® C++ Compiler
• Intel® Integrated Performance Primitives
• Intel® Math Kernel Library
• Intel® Threading Building Blocks

System & Application Debuggers
• Intel® System Debugger (4, 7)
• Intel® Debug Extensions for WinDbg* (4)
• Intel®-enhanced GDB* Application Debugger
• Intel® Debugger for Heterogeneous Compute

Performance, Power & Correctness Analyzers
• Intel® VTune™ Amplifier (6)
• Intel® Energy Profiler
• Intel® Inspector
• System Analyzer
• Platform Analyzer (5)
• Frame Analyzer (5)

1 Linux*, Embedded Linux, Wind River* Linux*, Yocto Project*
2 Linux* and Android* target support available in a single product
3 Available from Wind River* with VxWorks*
4 Via Intel® ITP-XDP3 probe, OpenOCD*, Intel® SVT Closed Chassis Adapter* and EDKII* for UEFI*
5 Available for Windows* host
6 Also available for OS X* host as a separate download
7 Intel® System Debugger provides VxWorks* OS awareness – available with Ultimate Editions

Intel® System Studio Benchmarks

Compiler options
• Intel System Studio XE 2016: -O3 -ipo -xATOM_SSE4.2 -ansi-alias -prec-div- -static
• GCC 5.1: -m32 -Ofast -mfpmath=sse -flto -march=native -funroll-loops -ffat-lto-objects (-m64 for Coremark Intel64)

Hardware configurations
• Intel(R) Atom(TM) CPU C2750 @ 2.41GHz, 32 GB RAM
• Red Hat Enterprise Linux Server release 7.0 (Maipo), kernel 3.10.0-123.el7.x86_64

Benchmarks
• EEMBC sources have been taken from common repository for GCC, LLVM, IC teams
• Metric is Iterations per second, scaled according to EEMBC publishing requirements (http://eembc.org/benchmark/index.php)

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Intel® C++ Compiler Performance on EEMBC* Benchmarks
AutoBench 1.1, TeleBench 1.1, DenBench 2.0, IpMark, and TCPMark Benchmarks (EEMBC) - Best Option Set
Performance gain (higher is better), relative to GCC 5.1 = 100%

Benchmark | GCC 5.1 | Intel System Studio 2016
AutoBench 1.1 Geomean | 100% | 132%
TeleBench 1.1 Geomean | 100% | 208%
DenBench 2.0 Geomean | 100% | 156%
IpMark Geomean | 100% | 127%
TCPMark Geomean | 100% | 104%
EEMBC Geomean | 100% | 141%
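The overall EEMBC figure on this slide is consistent with a geometric mean of the five suite-level scores. The short check below assumes that the percentages 132%, 208%, 156%, 127% and 104% belong to AutoBench, TeleBench, DenBench, IpMark and TCPMark in that order, and that the 141% overall value is the geomean of those five; neither assumption is stated explicitly on the slide.

```python
# Check that the overall score matches the geometric mean of the suite scores.
# The suite-to-value mapping is an assumption based on the order on the slide.
import math

def geomean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Suite-level gains relative to GCC 5.1 = 1.00.
suite_gain = {
    "AutoBench 1.1": 1.32,
    "TeleBench 1.1": 2.08,
    "DenBench 2.0": 1.56,
    "IpMark": 1.27,
    "TCPMark": 1.04,
}

overall = geomean(suite_gain.values())
print(f"{overall:.0%}")  # prints "141%"
```

A geometric mean is the usual choice for ratios like these because a 2x gain on one suite and a 0.5x loss on another cancel out exactly, which an arithmetic mean would not reflect.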


Intel® C++ Compiler Benchmarks on Windows* Targets Estimated Performance Difference

Compilers
• Intel® C++ Compiler for IA-32 applications, Version 16.0 Build 20150423
• Intel® C++ Intel(R) 64 Compiler for Intel(R) 64 applications, Version 16.0 Build 20150423
• Microsoft* C/C++ Optimizing Compiler Version 18.00.21005.1 for x86
• Microsoft* C/C++ Optimizing Compiler Version 18.00.21005.1 for x64

Platform: Microsoft* Windows 8.1 Enterprise
Hardware: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz, HyperThreading is off
RAM: 16GB
HDD: 1TB

Benchmarks
• CINT2006 geomean
• CFP2006 C/C++ geomean
• SPEC2006 C/C++ geomean

NOTE: 32-bit compilers for CINT2006 in RATE mode were used, as in SPEC publications

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as CINT2006*, CFP2006 C/C++*, SPEC2006 C/C++*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Benchmark Source: Intel Corporation. For more complete information about compiler optimizations, see our Optimization Notice.

CINT2006, CFP2006 C/C++, SPEC2006 C/C++ RATE Benchmarks - Best Option Set
Performance Gain (Higher is better), relative to MS Visual Studio* 2013 = 100%

Benchmark | MS Visual Studio* 2013 | Intel C++ Compiler 2016
CINT2006 Geomean | 100% | 151%
CFP2006 C/C++ Geomean | 100% | 129%
SPEC2006 C/C++ Geomean | 100% | 143%


Platform: Microsoft* Windows 8/Server
Hardware: Intel® Core™ i7-4770 CPU @ 3.50GHz
Graphics: Intel® HD Graphics 4600
RAM: 16GB
HDD: 1TB

Intel® C++ Compiler Benchmark – Code Offload to Intel® Graphics Technology

[Chart: Performance When Offloaded to Graphics Cores. y-axis: SPEED-UP (0.0 to 3.5); x-axis: CPU SHARE (100% ALL CPU), 0% to 100%. Series: NBodyLocals, MoonLight, MoonLight_struct, MatmultLocalsAN, BoxBlur_Vec, BoxBlurFloat, BoxBlurFloatLocal, FDTD_3d, FishEye, Mandelbrot, Mandelbrot_bw, MatmultLocalsAN_d, NBody, geomean]

Performance gain up to 3x by offloading code to graphics cores, at a load balance of 30% on CPU and 70% on graphics cores


[Chart: Intel® IPP Signal Processing Functions. y-axis: Speedup (0% to 800%). Functions: Single-Rate FIR, Linear Convolution, Cross-Correlation, Forward FFT. Series: Intel® SSE2, Intel® SSE4.x, Intel® AVX2]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i5-4300U processor, 3 MB Intel® Smart Cache, 8 GB RAM. Operating system: Windows* 8 64-bit, single-threaded benchmark


Intel® Integrated Performance Primitives (Intel® IPP) – Signal Processing


[Chart: Intel® IPP data compression performance boost by using Intel® IPP vs. open source libraries. Vertical axis: 0% to 200%. Libraries: BZIP2 v. 1.0.6; ZLIB v. 1.2.8, level 6. Series: Intel® Xeon® E5-2680, Intel® Core™ i7-4770K, Intel® Quark™ SoC X1000.]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0.
Hardware: Intel® Xeon® E5-2680, 20 MB cache, 2.7 GHz, 64 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Quark™ SoC X1000, 16 KB cache, 400 MHz, 2 GB RAM. OS: Yocto Linux 3.8.7, 32-bit
Data sets: Calgary and Canterbury corpuses


Intel® Integrated Performance Primitives (Intel® IPP) – Data Compression


[Chart: Intel® IPP data decompression performance boost by using Intel® IPP vs. open source libraries. Vertical axis: 0% to 200%. Libraries: BZIP2 v. 1.0.6; ZLIB v. 1.2.8, level 6. Series: Intel® Xeon® E5-2680, Intel® Core™ i7-4770K, Intel® Pentium® J2900.]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0.
Hardware: Intel® Xeon® E5-2680, 20 MB cache, 2.7 GHz, 64 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Pentium® J2900, 2 MB cache, 2.7 GHz, 8 GB RAM. OS: RH EL Server 7.0, 64-bit
Data sets: Calgary and Canterbury corpuses


Intel® Integrated Performance Primitives (Intel® IPP) – Data Decompression


[Chart: Intel® IPP Cryptography Function Performance. Vertical axis: GBytes/s, 0.00 to 6.00. Functions: AES-128-ECB Encrypt, AES-128-CBC Encrypt, AES-128-CBC Decrypt, SHA-1, SHA-256. Series: Intel® IPP, OpenSSL 1.0.2c.]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit
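For readers who want to see how a throughput figure in GBytes/s is derived, here is a minimal sketch. It uses Python's standard `hashlib` module, not Intel® IPP or the OpenSSL build measured on the slide, so its numbers are not comparable with the chart. The buffer size, the repeat count, and the choice of 1 GByte = 10^9 bytes are assumptions made for illustration; the slide does not state which definition it uses.

```python
import hashlib
import time


def sha256_throughput_gbytes_per_s(buffer_size=1 << 20, repeats=32):
    """Hash `repeats` buffers of `buffer_size` bytes and return GBytes/s."""
    data = bytes(buffer_size)          # zero-filled input buffer
    digest = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(repeats):
        digest.update(data)
    digest.digest()                    # finalize so all work is included
    elapsed = time.perf_counter() - start
    # Throughput = bytes processed / elapsed seconds, scaled to 10^9 bytes.
    return (buffer_size * repeats) / elapsed / 1e9


print(f"SHA-256: {sha256_throughput_gbytes_per_s():.2f} GBytes/s")
```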


Intel® Integrated Performance Primitives (Intel® IPP) – Cryptography


[Chart: Intel® IPP Image Processing Filters Performance In Multi-Thread Mode. Vertical axis: frames per second at 1920x1080x1, 0 to 4000. Filters: Box Filter, Median Filter. Series: 1 thread, 2 threads, 4 threads.]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i5-4300U processor, 3 MB Intel® Smart Cache, 8 GB RAM. Operating system: Windows* 8 64-bit, multi-threaded benchmark


Intel® Integrated Performance Primitives (Intel® IPP) – Image Processing


FFT Performance Boost using Intel® MKL vs. FFTW*
Single Precision Complex 2D and 3D FFT on Intel® Core™ Processor i7-6700K


Configuration Info - Versions: Intel® Math Kernel Library (Intel® MKL) 11.3, FFTW* 3.3.4; Hardware: Intel® Core™ Processor i7-6700K, Quad-core CPU (8MB LLC, 4.0 GHz), 32GB of RAM; Operating System: RHEL 6.5 x86_64;

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation


Single Precision Complex 2D & 3D FFT Performance Boost using Intel® MKL vs. FFTW*

[Chart: 3D FFT. Vertical axis: Performance (GFlops), 0 to 100. Horizontal axis: Transform Size (Power of two). Series: Intel MKL with 1, 2, and 4 threads; FFTW with 1, 2, and 4 threads.]

[Chart: 2D FFT. Vertical axis: Performance (GFlops), 0 to 120. Horizontal axis: Transform Size (Power of two). Series: Intel MKL with 1, 2, and 4 threads; FFTW with 1, 2, and 4 threads.]
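The slides do not say how the GFlops values were computed. A common convention for complex FFTs, and the one used in FFTW's own benchmark reporting, is to count 5 N log2(N) floating-point operations for a transform of N total points, regardless of how many operations the implementation actually performs. The sketch below applies that convention; it is an assumption about the charts, not something the source confirms, and the timings are hypothetical values chosen only to show the arithmetic.

```python
import math


def fft_gflops(dims, seconds):
    """Nominal GFlops for one complex FFT over a grid with the given dims.

    Uses the conventional estimate of 5 * N * log2(N) floating-point
    operations, where N is the total number of points (the product of all
    dimensions).
    """
    n_points = math.prod(dims)
    flops = 5.0 * n_points * math.log2(n_points)
    return flops / seconds / 1e9


# Hypothetical timings, not measurements from the slides.
print(fft_gflops((1024, 1024), seconds=1.0e-3))      # 2D transform
print(fft_gflops((128, 128, 128), seconds=2.0e-3))   # 3D transform
```

Because the operation count depends only on the total number of points, 2D and 3D transforms of different shapes can be placed on the same GFlops scale.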

Intel® System Studio
Additional Details


System Trace
Quickly Isolate Complex System Issues


Efficiently pinpoint issues with time-stamp correlated trace information

Analyze complex interactions between software and hardware

Event trace with time-stamp information

Filter dialog to focus on specific events

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)
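The idea behind time-stamp correlated trace can be sketched in a few lines. The example below is not the System Trace implementation or its data format: the `TraceEvent` type, the event names, and the two sources are hypothetical. It assumes each source delivers events already ordered by a time stamp taken from a shared clock, merges them into one timeline, and then filters the timeline by event name, similar in spirit to the filter dialog.

```python
import heapq
from typing import Iterable, NamedTuple


class TraceEvent(NamedTuple):
    timestamp_ns: int   # time stamp on a clock shared by all sources
    source: str         # for example "hw" or "sw"
    name: str


def correlate(*streams: Iterable[TraceEvent]):
    """Merge per-source streams (each already time-ordered) into one timeline."""
    return heapq.merge(*streams, key=lambda e: e.timestamp_ns)


def filter_events(events, names):
    """Keep only the events whose name is in `names`."""
    wanted = set(names)
    return [e for e in events if e.name in wanted]


# Hypothetical hardware and software event streams.
hw = [TraceEvent(100, "hw", "irq_raised"), TraceEvent(450, "hw", "dma_done")]
sw = [TraceEvent(120, "sw", "isr_enter"), TraceEvent(300, "sw", "isr_exit")]

timeline = list(correlate(hw, sw))
for event in filter_events(timeline, {"irq_raised", "isr_enter"}):
    print(event.timestamp_ns, event.source, event.name)
```

With a single ordered timeline, the delay between a hardware event and the software reaction to it becomes a simple difference of two time stamps.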


Intel® System Studio 2016
Advances That Expand Benefits for System and Embedded Developers


Benefit areas: Accelerate Time to Market; Strengthen System Reliability; Boost Power Efficiency & Performance

Key New Developer Benefits / Capability

Enable and optimize compelling system and application usages: Performance improved compilers and libraries

Analyze application performance on preemptive RT Linux*: Performance analyzer supports real-time Linux* system profiling

Analyze application performance in virtualized environments: Performance analyzer supports virtualized environment performance profiling

Quickly identify software that is wasting power: Energy profiler adds support for Linux* targets

Quickly isolate complex system issues: Comprehensive system-wide hardware and software event tracing

System-wide closed chassis debugging: JTAG-based debug and trace over low-cost USB connection

Extended insight into Windows* system to strengthen reliability: System debug and trace extensions for Microsoft* WinDbg* kernel debugger

Effectively debug compute intensive code offloaded to graphics cores: Debugger for offloaded code

Support newest platforms: Added support for new Intel processors and target operating systems

Enhanced developer productivity: Improved out-of-the-box experience, IDE and samples included