Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.
Date posted: 19-Dec-2015
![Page 1: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/1.jpg)
Tools for Investigating Graphics System Performance
Matthew Fisher, Steve Pronovost
![Page 2: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/2.jpg)
Goal
A video game runs slowly, skips frames, or has high latency, and the developers want to fix the problem
The problem is almost always a cascade of bottlenecks at the application, CPU, and GPU levels that is very challenging to investigate locally
We want tools that let programmers solve these problems faster
![Page 3: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/3.jpg)
Approaches
Profiling
– Rig the game events with logging or use an automatic profiler
PIX (for Windows and Xbox 360)
– All calls by the game to the graphics API are logged
GPUView
– OS logs all CPU, graphics kernel and graphics driver events
![Page 4: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/4.jpg)
Profiling
Manual profiling requires a significant amount of development effort
Polling-based automatic profiling can work reasonably well for CPU applications but doesn’t capture graphics or memory transfer events well
Percentage-based statistics (“you spent 45% of the time in function X”) can sometimes be useful and sometimes extremely misleading
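As a rough illustration of what "rigging the game events with logging" can look like, here is a minimal Python sketch. Everything in it is hypothetical (the `profiled` context manager, the `events` list, the `update` and `render` labels); it is not taken from the talk or from any real engine. It records a timed event per code region and keeps the worst single duration alongside the total, since a total or percentage alone can hide the one slow frame that causes a visible skip.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory event log: (name, start_seconds, duration_seconds).
events = []

@contextmanager
def profiled(name, clock=time.perf_counter):
    """Record how long the enclosed block takes under the given name."""
    start = clock()
    try:
        yield
    finally:
        events.append((name, start, clock() - start))

def summarize(log):
    """Return {name: (call_count, total_seconds, worst_seconds)}."""
    summary = {}
    for name, _start, duration in log:
        count, total, worst = summary.get(name, (0, 0.0, 0.0))
        summary[name] = (count + 1, total + duration, max(worst, duration))
    return summary

if __name__ == "__main__":
    for frame in range(3):
        with profiled("update"):
            sum(range(10_000))   # stand-in for game logic
        with profiled("render"):
            sum(range(50_000))   # stand-in for draw-call submission
    for name, (count, total, worst) in summarize(events).items():
        print(f"{name}: {count} calls, total {total:.6f}s, worst {worst:.6f}s")
```

Even a sketch this small shows the development cost the slide mentions: every region of interest has to be wrapped by hand, and nothing outside the wrapped regions (driver work, memory transfers) is visible.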
![Page 5: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/5.jpg)
PIX
Released by Microsoft as part of the DirectX SDK
Multiple modes for investigating performance targeted at game developers
– Interactive mode
– Frame logging
– Frame capture and playback
![Page 6: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/6.jpg)
PIX – Interactive Mode
Various counters stream by as the game runs
You can change the counters; the hope is to find that the observed problem correlates with one of them
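The idea of looking for a counter that correlates with the problem can be sketched offline as well. The Python below assumes the per-frame counter values and frame times have already been exported into plain lists; the counter names and numbers are made up for illustration and the functions are not part of PIX. It ranks counters by the absolute Pearson correlation with frame time.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_counters(frame_times, counters):
    """Sort counters by how strongly they track frame time, strongest first."""
    scores = {name: pearson(values, frame_times) for name, values in counters.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

if __name__ == "__main__":
    # Made-up data: frame times in ms and two hypothetical counters per frame.
    frame_times = [16, 17, 16, 33, 16, 34]
    counters = {
        "buffer_locks": [2, 2, 2, 9, 2, 10],
        "draw_calls": [500, 510, 495, 505, 500, 498],
    }
    for name, score in rank_counters(frame_times, counters):
        print(f"{name}: r = {score:+.3f}")
```

A high correlation only points at a suspect; it does not show that the counter causes the slow frames.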
![Page 7: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/7.jpg)
PIX – Interactive Mode
![Page 8: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/8.jpg)
Commonly Used Counter Types
Number, type, and size of draw primitive calls
Number of texture, vertex/index buffer locks, and what memory pool was locked
Object creation and destruction events
Allocated system and video memory
Frame latency, seconds per frame
Page faults
![Page 9: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/9.jpg)
PIX – Frame Capture Mode
![Page 10: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/10.jpg)
PIX – Debug Pixel
![Page 11: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/11.jpg)
Questions PIX is good at
Are object locks causing the frame skipping problem users are experiencing?
Are we allocating too many resources we don’t use?
What are the API calls that are taking the longest time to execute?
Why was this pixel in the sky green?
![Page 12: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/12.jpg)
GPUView
![Page 13: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/13.jpg)
Windows Display Driver Model
The XP Display Driver Model required applications to cede control of the graphics infrastructure and was largely designed assuming a single 3D application would be running
The Vista Display Driver Model added standard scheduling principles forcing applications to share control of graphics memory and compute resources
![Page 14: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/14.jpg)
GPUView
The graphics model switch induced a variety of constraints on graphics applications and forced highly optimized graphics drivers to be restructured
Many games were running more slowly on Vista than they did on XP (~5% - 30% slower)
GPUView was designed to help investigate these problems and see what stage was causing the speed difference
![Page 15: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/15.jpg)
Event Tracing
The GPUView logger enables logging of a vast set of events in the OS, such as
– All calls to the Windows graphics kernel
• All resource creation, lock, destruction, etc. events
• All command buffer submissions
– Context switches (w/ stack trace and reason)
– Kernel mode enter/exits (w/ stack trace)
Tracing World of Warcraft generates approximately 1 GB of event data every 3 seconds
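To get a feel for what that rate means for capture length, the arithmetic can be written out as a tiny Python helper. The function name is hypothetical, and the rate is simply the World of Warcraft figure quoted above; other applications will differ.

```python
def trace_size_gb(capture_seconds, gb_per_interval=1.0, interval_seconds=3.0):
    """Estimate trace size assuming a constant logging rate (default: 1 GB per 3 s)."""
    return capture_seconds * gb_per_interval / interval_seconds

if __name__ == "__main__":
    for seconds in (3, 30, 60):
        print(f"{seconds:>3} s capture -> about {trace_size_gb(seconds):.1f} GB")
```

At this rate a one-minute capture is about 20 GB, which suggests keeping traces to the few seconds around the problem being investigated.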
![Page 16: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/16.jpg)
GPUView Without Any Graphics
![Page 17: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/17.jpg)
Windows Display Driver Model
Applications build up local command buffers
When these command buffers get big enough they are submitted to the application’s local graphics queue for processing
The graphics scheduler selects which application should be running on which graphics card and submits work to the corresponding hardware queue
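The flow described above (local command buffer, per-application queue, scheduler, hardware queue) can be pictured with a toy model. The Python below is only a sketch under loud assumptions: the class and function names are invented, a single graphics card is assumed, the flush threshold counts commands rather than bytes, and the scheduler is plain round-robin. The real Windows graphics scheduler uses its own policy, which the slides do not describe.

```python
from collections import deque

class App:
    """Toy application with a local command buffer and its own graphics queue."""

    def __init__(self, name, flush_threshold):
        self.name = name
        self.flush_threshold = flush_threshold  # assumed: number of commands
        self.command_buffer = []                # buffer currently being built
        self.queue = deque()                    # per-application graphics queue

    def record(self, command):
        """Append a command; submit the buffer once it is big enough."""
        self.command_buffer.append(command)
        if len(self.command_buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Submit the current buffer (if non-empty) to the application's queue."""
        if self.command_buffer:
            self.queue.append((self.name, tuple(self.command_buffer)))
            self.command_buffer = []

def schedule(apps):
    """Assumed round-robin policy: drain app queues into one hardware queue."""
    hardware_queue = []
    while any(app.queue for app in apps):
        for app in apps:
            if app.queue:
                hardware_queue.append(app.queue.popleft())
    return hardware_queue

if __name__ == "__main__":
    game = App("game", flush_threshold=3)
    desktop = App("desktop", flush_threshold=2)
    for i in range(7):
        game.record(f"draw{i}")
    for i in range(2):
        desktop.record(f"blit{i}")
    game.flush()  # submit the partially filled buffer, e.g. at end of frame
    for owner, commands in schedule([game, desktop]):
        print(owner, commands)
```

The point of the model is the sharing: once two applications submit work, neither one controls the order in which its buffers reach the hardware.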
![Page 18: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/18.jpg)
One Second of a Game
![Page 19: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/19.jpg)
![Page 20: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/20.jpg)
![Page 21: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/21.jpg)
Setup
![Page 22: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/22.jpg)
Multiple Applications Fighting
![Page 23: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/23.jpg)
Simple Problems
![Page 24: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/24.jpg)
Relatively Normal Execution
![Page 25: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/25.jpg)
GPU Starvation
![Page 26: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/26.jpg)
GPU Idle
![Page 27: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/27.jpg)
Sleepy App
![Page 28: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/28.jpg)
Huge Render Times (GPU Bound)
![Page 29: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/29.jpg)
GPU and CPU Starvation
![Page 30: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/30.jpg)
Answering Questions
![Page 31: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/31.jpg)
Why Did Our Thread Context Switch?
![Page 32: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/32.jpg)
Does Surface Allocation Cause Frame Stuttering?
![Page 33: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/33.jpg)
Thoughts
Surprisingly, the overhead of GPUView logging is pretty minimal and the traces often reflect the underlying problem well
The biggest advantage of GPUView over PIX is that it can tell you crucial things PIX cannot, such as when the GPU is blocked on the CPU
GPUView is excellent for telling you what part of the application needs optimization
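One way to picture the starvation and idle patterns shown earlier is to treat the hardware queue as a list of busy intervals and look for the gaps between them. The Python below is a sketch of that idea only: it assumes the intervals have already been extracted from a trace as `(start, end)` pairs in milliseconds that lie inside the capture window, and the function names are hypothetical rather than part of GPUView.

```python
def idle_gaps(busy_intervals, capture_start, capture_end):
    """Return (start, end) gaps in which the GPU had no work queued."""
    gaps = []
    cursor = capture_start
    for start, end in sorted(busy_intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # overlapping intervals extend the busy span
    if cursor < capture_end:
        gaps.append((cursor, capture_end))
    return gaps

def idle_fraction(busy_intervals, capture_start, capture_end):
    """Fraction of the capture window during which the GPU was idle."""
    gaps = idle_gaps(busy_intervals, capture_start, capture_end)
    idle = sum(end - start for start, end in gaps)
    return idle / (capture_end - capture_start)

if __name__ == "__main__":
    # Made-up example: GPU busy for 0-4 ms and 10-16 ms of a 20 ms window.
    busy = [(0, 4), (10, 14), (12, 16)]
    print(idle_gaps(busy, 0, 20))
    print(f"idle fraction: {idle_fraction(busy, 0, 20):.2f}")
```

A large idle fraction suggests the GPU is waiting on something else. Finding out what, for example a CPU thread that was context-switched out, still means looking at the CPU side of the same trace.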
![Page 34: Tools for Investigating Graphics System Performance Matthew Fisher Steve Pronovost.](https://reader030.fdocuments.in/reader030/viewer/2022012910/56649d3e5503460f94a16aea/html5/thumbnails/34.jpg)
Driver Perspective
GPUView provides a lot of detail that lets display driver writers and DirectX graphics kernel developers diagnose problems with task submission, the command buffer submission threads, GPU preemption, video skipping, etc.