Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC...
Transcript of Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC...
![Page 1: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/1.jpg)
1© 2017 Rogue Wave Software, Inc. All Rights Reserved. 1
Advanced Technologies and
Techniques for Debugging
CUDA GPU HPC Applications
Nikolay Piskun
Director of Continuing Engineering , TotalView products
GTC March, 2019
![Page 2: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/2.jpg)
2© 2017 Rogue Wave Software, Inc. All Rights Reserved. 2
Agenda
• What is debugging and why TotalView?
• Overview of TotalView
• GPU debugging
• Python debugging
• Advanced C++ and Data debugging
• TotalView resources and documentation
• Questions/Comments
![Page 3: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/3.jpg)
3© 2017 Rogue Wave Software, Inc. All Rights Reserved. 3
What is Debugging andWhy do you need TotalView?
![Page 4: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/4.jpg)
4© 2018 Rogue Wave Software, Inc. All Rights Reserved. 4
What is Debugging?
• Debugging is the process of finding and resolving defects or problems
within a computer program or a system.
–Algorithm correctness
–Data correctness
– Scaling/Porting correctness
![Page 5: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/5.jpg)
5© 2018 Rogue Wave Software, Inc. All Rights Reserved. 5
TotalView debugger enables you to do:
• Interactive debugging
– Live control of an executing program
• Remote debugging
» Debug a program running on another computer
• Post-mortem debugging (core files and reverse debugging)
– Debugging a program after it has crashed or exited
• Memory debugging
» Find memory management problems (leaks, corruption …)
» Comparing results between executions
• Batch debugging (tvscript, CI environments)
– Unattended debugging
![Page 6: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/6.jpg)
6© 2018 Rogue Wave Software, Inc. All Rights Reserved. 6
History of TotalView
0
2
4
6
8
10
12
14
1986 2003 2004 2006 2007 2011 2012 2018
Number of architectures supported
![Page 7: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/7.jpg)
7© 2017 Rogue Wave Software, Inc. All Rights Reserved. 7
TotalView for HPC and for All • Leading debug environment for HPC users
– Active development for 30+ years
– Thread specific breakpoints
– Control individual thread execution
– View complex data types easily
– From MacBook to Top500 Supercomputers
• Track memory leaks in running applications
• Supports C/C++ and Fortran on Linux/Unix/Mac
• Integrated Reverse debugging
• Batch non-interactive debugging.
• Allowing the business to have
– Predictable development schedules
– Less time spent debugging
Nvidia software partner
![Page 8: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/8.jpg)
8© 2018 Rogue Wave Software, Inc. All Rights Reserved. 8
High Performance Computing Applications
• HPC Applications cover many different computing areas including:
– Healthcare and Medicine
– Modeling and simulation
– Security
– Bioinformatics
– Molecular Dynamics
– Environment (earthquake/tsunami)/Weather
– Machine Learning/Artificial Intelligence
![Page 9: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/9.jpg)
9© 2018 Rogue Wave Software, Inc. All Rights Reserved. 9
GPU Debugging
![Page 10: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/10.jpg)
10© 2017 Rogue Wave Software, Inc. All Rights Reserved. 10
GPU debugging with TotalView
• NVIDIA CUDA support
– Multiple platforms : X86-64, PowerLE, ARM64 (in beta)
– Multiple cards: from Jetson to Volta (Turing testing)
• Features and capabilities include
– Support for dynamic parallelism
– Support for MPI based clusters and multi-card configurations
– Flexible Display and Navigation on the CUDA device
• Physical (device, SM, Warp, Lane)
• Logical (Grid, Block) tuples
– CUDA device window reveals what is running where
– Support for CUDA Core debugging
– Leverages CUDA memcheck
– Support for OpenACC
![Page 11: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/11.jpg)
11© 2017 Rogue Wave Software, Inc. All Rights Reserved. 11
CUDA Debugging Model Improvements
• First in class Unified
Source debugging
• Improves and
streamlines debugging
CUDA applications
• Set breakpoints in CPU and GPU
kernel code before it is launched on
the GPU
• Compare variables in CPU
and GPU code together
![Page 12: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/12.jpg)
12© 2017 Rogue Wave Software, Inc. All Rights Reserved. 12
CUDA Debugging Demo
![Page 13: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/13.jpg)
13© 2017 Rogue Wave Software, Inc. All Rights Reserved. 13
![Page 14: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/14.jpg)
14© 2017 Rogue Wave Software, Inc. All Rights Reserved. 14
Python/C++ Debugging
![Page 15: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/15.jpg)
15© 2017 Rogue Wave Software, Inc. All Rights Reserved. 15
Debugging multiple languages
• Debugging one language is difficult enough
– Especially with many threads/processes
• The language intersection is tougher
– Data comparison
– Glue code
• Issues are:
– Type mismatches
– Extraneous stack frames
![Page 16: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/16.jpg)
16© 2017 Rogue Wave Software, Inc. All Rights Reserved. 16
Why Python ?
• Use Python to build applications that call out to C++
• Provides access to
– High-performance routines
– Leverage existing algorithms and libraries
– Utilize advanced multi-threaded capabilities
• Calling between languages easily enabled using technologies such as SWIG, ctypes, Cython, CFFI, et al
• Debugging mixed language applications is not easy
– Good for debugger developers ☺
![Page 17: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/17.jpg)
17© 2018 Rogue Wave Software, Inc. All Rights Reserved. 17
Python debugging with TotalView
• What TotalView provides:
– Easy Python debugging session setup
– Fully integrated Python and C/C++ call stack
• ”Glue” layers between the languages removed
– Easily examine and compare variables in Python and C++
– Utilize reverse debugging and memory debugging
• What TotalView does not provide (yet):
– Setting breakpoints and stepping within Python code
![Page 18: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/18.jpg)
18© 2017 Rogue Wave Software, Inc. All Rights Reserved. 18
Demo#!/usr/bin/python
def callFact():
import tv_python_example as tp
a = 3
b = 10
c = a+b
ch = “local string”
……
return tp.fact(a)
if __name__ == '__main__’:
b = 2
result = callFact()
print result
![Page 19: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/19.jpg)
19© 2018 Rogue Wave Software, Inc. All Rights Reserved. 19
Python Debugging Demo
totalview -args python test_python_types.py
![Page 20: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/20.jpg)
20© 2017 Rogue Wave Software, Inc. All Rights Reserved. 20
Python without special debugger support
Glue
code
No viewing of Python
data and code
![Page 21: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/21.jpg)
21© 2017 Rogue Wave Software, Inc. All Rights Reserved. 21
Showing C code with mixed data
Glue code filtered out
Python data and code
available for viewing
Shows Python & C++
C++ data
Py data
![Page 22: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/22.jpg)
22© 2017 Rogue Wave Software, Inc. All Rights Reserved. 22
Stack Transformation Facility
• Hides stack frames
• Transforms stack frames
• Backbone for:
– Python support
– OpenMP support
• Useful for any glue code you want to hide
– Language differences
– Wrapper code
![Page 23: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/23.jpg)
23© 2017 Rogue Wave Software, Inc. All Rights Reserved. 23
TensorFlow basics
• Open source
• Numerical computation
• Usage in machine learning
• Written in C++
– Called from Python
![Page 24: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/24.jpg)
24© 2017 Rogue Wave Software, Inc. All Rights Reserved. 24
TensorFlow
Multi-threaded application
Glue code removed
Added a rule for wrappers
![Page 25: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/25.jpg)
25© 2017 Rogue Wave Software, Inc. All Rights Reserved. 25
Advanced C++ and Data Debugging
![Page 26: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/26.jpg)
26© 2017 Rogue Wave Software, Inc. All Rights Reserved. 26
Advanced C++ and Data Debugging
Instead
of This
• TotalView transforms many of
the C++ and STL containers
such as:
– array, forward_list,
tuple, map, set, vector
and others.
• TotalView supports debugging the latest
C++11/14 features including:
– lambdas, transformations for smart
pointers, auto types, R-Value references,
range-based loops, strongly-typed
enums, initializer lists, user defined literals
See
This!
![Page 27: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/27.jpg)
27© 2017 Rogue Wave Software, Inc. All Rights Reserved. 27
Array Slicing, Striding and Filtering (classic UI)
• Slicing – reduce display to a portion of the array
– [lower_bound:upper_bound]
– [5:10]
• Striding – Skip over elements
– [::stride]
– [::5], [5:10:-1]
• Filtering
– Comparison: ==, !=, <,
<=, >, >=
– Range of values: [>] low-
value : [<] high-value
– IEEE values: $nan, $inf,
$denorm
![Page 28: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/28.jpg)
28© 2017 Rogue Wave Software, Inc. All Rights Reserved. 28
Array Statistics
• Easily display a set of statistics for the
filtered portion of your array
![Page 29: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/29.jpg)
29© 2017 Rogue Wave Software, Inc. All Rights Reserved. 29
Visualizing Array Data
• Visualizer creates graphic images of your
program’s array data.
• Visualize one or two dimensional arrays
• View data manually through the Window
> Visualize command on the Data
Window
• Visualize data programmatically using
the $visualize function
![Page 30: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/30.jpg)
30© 2017 Rogue Wave Software, Inc. All Rights Reserved. 30
Summary
• Use of modern debugger saves you time.
• TotalView can help you because:
– It’s cross-platform (the only debugger you ever need)
– Allow you to debug accelerators (GPU) and CPU in one session
– Allow you to debug multiple languages (C++/Python/Fortran)
![Page 31: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/31.jpg)
31© 2017 Rogue Wave Software, Inc. All Rights Reserved. 31
Presenter
Nikolay Piskun
Director of CE, TotalView products
Rogue Wave Software/Perforce Software
![Page 32: Advanced Technologies and Techniques for Debugging CUDA ......Techniques for Debugging CUDA GPU HPC Applications Nikolay Piskun ... •HPC Applications cover many different computing](https://reader030.fdocuments.in/reader030/viewer/2022040414/5f1f4db39f1c3c11bf039026/html5/thumbnails/32.jpg)
32© 2017 Rogue Wave Software, Inc. All Rights Reserved. 32