CUDA Synchronization - cis.upenn.edudevietti/classes/cis601-spring2017/cuda... · volatile keyword:...
CUDA Synchronization
atomics
memory fences
“Good fences make good neighbors.”
– Robert Frost, “Mending Wall”
without fences
same (lack of) guarantees for reads
ORLY?
__threadfence_block
PTX membar.cta
__threadfence
PTX membar.gl
__threadfence_system
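The fence slides name three scopes: `__threadfence_block` (thread block), `__threadfence` (device) and `__threadfence_system` (device plus host and peer devices). As a minimal sketch of why a fence matters, the message-passing pattern below has one block publish a value and another block wait for it. The kernel name `messagePassing`, the helper `runMessagePassing` and the buffer layout are assumptions made for illustration, not code from the slides, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: block 0 publishes a value, block 1 waits for it.
// Both pointers are volatile so the compiler re-reads them from memory.
__global__ void messagePassing(volatile int *payload, volatile int *ready, int *result) {
    if (blockIdx.x == 0) {
        *payload = 42;            // 1. write the data
        __threadfence();          // 2. order the data write before the flag write, device-wide
        *ready = 1;               // 3. publish the flag
    } else {
        while (*ready == 0) { }   // spin until the flag is observed
        __threadfence();          // keep the payload read after the flag read
        *result = *payload;
    }
}

// Hypothetical host helper: launches two single-thread blocks and
// returns the value that block 1 read.
int runMessagePassing() {
    int *buf = nullptr;           // buf[0] = payload, buf[1] = ready, buf[2] = result
    cudaMalloc(&buf, 3 * sizeof(int));
    cudaMemset(buf, 0, 3 * sizeof(int));
    messagePassing<<<2, 1>>>(buf, buf + 1, buf + 2);
    int result = -1;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&result, buf + 2, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(buf);
    return result;
}
```

The sketch assumes both blocks are resident on the device at the same time, which holds for two single-thread blocks on current hardware. Without the fence in block 0, the flag write could become visible before the payload write, and block 1 might read stale data.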
volatile
CUDA spinlock?
“Two threads diverged in a CUDA warp,
And sorry I had become untwined
from my PC by a branch so sharp,
I had to ask Nvidia Corp:
the order of paths was undefined.”
– “Robert Frost”, The Thread Not Taken
__syncthreads()
intra-warp synchronization
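One way to read the spinlock question together with the divergence slides: when several threads of the same warp spin on one lock, the order in which the divergent paths run is not defined, and on GPUs without independent thread scheduling the lock holder may never get to run. The sketch below sidesteps that by letting only one thread per block contend. The names `lockedIncrement` and `runLockedIncrement` and the buffer layout are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per block takes a global spinlock and
// increments a shared counter inside the critical section.
__global__ void lockedIncrement(int *lock, volatile int *counter) {
    if (threadIdx.x == 0) {                      // a single contender per block
        while (atomicCAS(lock, 0, 1) != 0) { }   // acquire: spin until 0 -> 1 succeeds
        __threadfence();                         // critical section starts after the acquire
        *counter = *counter + 1;                 // protected update
        __threadfence();                         // complete the update before releasing
        atomicExch(lock, 0);                     // release
    }
}

// Hypothetical host helper: returns the counter after numBlocks blocks have run.
int runLockedIncrement(int numBlocks) {
    int *buf = nullptr;                          // buf[0] = lock, buf[1] = counter
    cudaMalloc(&buf, 2 * sizeof(int));
    cudaMemset(buf, 0, 2 * sizeof(int));
    lockedIncrement<<<numBlocks, 32>>>(buf, buf + 1);
    int counter = -1;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&counter, buf + 1, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(buf);
    return counter;
}
```

Letting every thread of a warp call `atomicCAS` in the same loop is the variant that can hang on older architectures, which is presumably the point of the slide.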
<spinlock PTX demo>
PTX
❖ virtual ISA for Nvidia GPUs
❖ RISC-like ISA, load-store, 3-operand
❖ destination register is on the left
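The bullets above can be seen in a single inline-PTX instruction. This is a sketch only: the kernel `addWithPtx` and the helper `ptxAdd` are hypothetical names, and error checking is omitted. The values are first loaded into registers by compiler-generated loads (load-store), and `add.s32` then takes three register operands with the destination written first.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: adds two integers with one inline PTX instruction.
__global__ void addWithPtx(const int *a, const int *b, int *out) {
    int result;
    // add.s32 d, a, b;  destination register on the left, then the two sources
    asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(*a), "r"(*b));
    *out = result;
}

// Hypothetical host helper: runs the kernel on one thread and returns x + y.
int ptxAdd(int x, int y) {
    int *a = nullptr, *b = nullptr, *out = nullptr;
    cudaMallocManaged(&a, sizeof(int));
    cudaMallocManaged(&b, sizeof(int));
    cudaMallocManaged(&out, sizeof(int));
    *a = x;
    *b = y;
    *out = 0;
    addWithPtx<<<1, 1>>>(a, b, out);
    cudaDeviceSynchronize();      // wait before the host reads managed memory
    int r = *out;
    cudaFree(a);
    cudaFree(b);
    cudaFree(out);
    return r;
}
```

Compiling with `nvcc -ptx` and reading the generated file is another way to look at the same instruction format without writing inline assembly.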
PTX load/store caching

| qualifier | meaning | load | store |
|---|---|---|---|
| .ca | cache at all levels | default | |
| .wb | write back caching | | default |
| .cg | cache at L2 (global cache) | yes | yes |
| .cs | streaming (mark as LRU) | yes | yes |
| .lu | last use (read & invalidate) | yes | |
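To make the qualifiers in the table above concrete, the sketch below attaches `.cg` to a load and `.cs` to a store through inline PTX. The kernel `copyWithHints` and the helper `countCopyMismatches` are hypothetical names chosen for illustration, the code assumes a 64-bit build (the `"l"` constraint passes a 64-bit address), and error checking is omitted. The qualifiers are performance hints, so the copy should produce the same data with or without them.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: copies in[] to out[] using explicit cache qualifiers.
__global__ void copyWithHints(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int v;
        // Load with .cg: cache at L2 only.
        asm volatile("ld.global.cg.s32 %0, [%1];" : "=r"(v) : "l"(in + i));
        // Store with .cs: streaming, the line is unlikely to be reused.
        asm volatile("st.global.cs.s32 [%0], %1;" : : "l"(out + i), "r"(v) : "memory");
    }
}

// Hypothetical host helper: returns how many elements differ after the copy.
int countCopyMismatches(int n) {
    int *in = nullptr, *out = nullptr;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        in[i] = i;
        out[i] = -1;
    }
    copyWithHints<<<(n + 127) / 128, 128>>>(in, out, n);
    cudaDeviceSynchronize();      // wait before the host reads managed memory
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != in[i]) ++mismatches;
    }
    cudaFree(in);
    cudaFree(out);
    return mismatches;
}
```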
PTX tidbits
Homework 2
CUDA Kernel Timeout == good
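__managed__

    __device__ __managed__ int d_counter = 0;

    int main() {
      d_counter = 10;                                    // host writes the managed variable directly
      myKernel<<<8,16>>>();                              // myKernel is defined elsewhere
      cudaError_t cudaStatus = cudaDeviceSynchronize();  // wait before the host reads again
      checkCudaErrors(cudaStatus);                       // helper from the CUDA samples' helper_cuda.h
      printf("%d", d_counter);
      return 0;
    }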
C++ virtual functions