"Distributed Computing and Grid-technologies in Science and Education "
PROSPECTS OF USING GPU IN DESKTOP-GRID SYSTEMS
Klimov Georgy
Dubna, 2012
AGENDA
• Grid & GPU
• GPU architecture
• CUDA technology
• Grid projects using GPUs
• Monotonic Basin Hopping method
• CUDA implementation of MBH
• Plan for further investigation
• Summary
Grid & GPU
PROSPECTS OF USING GPU IN DESKTOP-GRID SYSTEMS. Klimov G., CMC MSU, 2012
GPU advantages:
• ~33% of all PCs are equipped with a modern GPU (~60% of them Nvidia)
• Typical load on GPU resources is below 5% (HD film playback)
• GPUs are optimized for working with huge texture arrays
• Modern GPUs contain tens or even hundreds of cores, which means high performance for some kinds of tasks
Problems solved by Grid:
• effective use of existing resources
• working with huge data arrays
• providing high performance
GPU architecture
• Scalable array of TPCs
• With its own DRAM
• 8 Scalar Processors
• 2 Special Function Units
• Double Precision Unit
• Register File
• Shared Memory
• Texture Memory Cache
• Constant Memory Cache
CUDA technology
CUDA – Compute Unified Device Architecture
• Supported on all Nvidia GPUs starting from the GeForce 8 series
• Low-level access to the hardware; knowledge of graphics APIs is not required
• The CUDA programming language is based on C/C++ syntax, which eases porting of existing code
• Higher performance compared to OpenCL (a 50-100% performance increase reported in various studies)
CUDA technology
CUDA programming model
CUDA technology
CUDA thread hierarchy
• Threads are grouped into Blocks (1-, 2- or 3-dimensional)
• Blocks are grouped into a Grid (1- or 2-dimensional)
• Threads within a Block share data through shared memory and synchronize their execution
• Threads from different blocks operate independently
• Built-in variables threadIdx, blockIdx, etc.
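A minimal sketch of how these built-in variables are typically combined, written as a plain Python emulation rather than CUDA code. It assumes a 1-D grid of 1-D blocks and the common indexing idiom `blockIdx.x * blockDim.x + threadIdx.x`; the function names `global_thread_index` and `enumerate_launch` are illustrative and not part of CUDA.

```python
def global_thread_index(block_idx, block_dim, thread_idx):
    """Emulate the common CUDA idiom blockIdx.x * blockDim.x + threadIdx.x."""
    if not 0 <= thread_idx < block_dim:
        raise ValueError("thread_idx must lie in [0, block_dim)")
    return block_idx * block_dim + thread_idx


def enumerate_launch(num_blocks, threads_per_block):
    """List the global index of every thread in a 1-D grid of 1-D blocks."""
    return [
        global_thread_index(b, threads_per_block, t)
        for b in range(num_blocks)
        for t in range(threads_per_block)
    ]


if __name__ == "__main__":
    # 3 blocks of 4 threads each give 12 distinct global indices.
    print(enumerate_launch(3, 4))
```

Each thread gets a unique index, which is what lets every thread pick its own piece of the data without coordinating with threads in other blocks.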
CUDA technology
CUDA memory hierarchy
| Memory type | Access | Level | Speed |
|---|---|---|---|
| Registers | R/W | Per-thread | High (on chip) |
| Local | R/W | Per-thread | Low (DRAM) |
| Shared | R/W | Per-block | High (on chip) |
| Global | R/W | Per-grid | Low (DRAM) |
| Constant | R/O | Per-grid | High (L1 cache) |
| Texture | R/O | Per-grid | High (L1 cache) |
Grid projects using GPUs
GPUgrid.net - a volunteer distributed computing project for biomedical research from the Universitat Pompeu Fabra in Barcelona (Spain)
Collatz Conjecture - mathematical research that tests the Collatz conjecture, also known as 3x+1 or HOTPO (half or triple plus one)
PrimeGrid - to bring the excitement of prime finding
Monotonic Basin Hopping method
Algorithm steps:
1. Start from point x0 (set x = x0)
2. Repeat until the stop condition is met:
2.1. generate a point Φ(x)
2.2. apply the local minimization algorithm* to the point Φ(x) to get a point x1
2.3. if f(x1) < f(x), then x = x1
3. Return x
* Gradient descent was used as the local minimization algorithm
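The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the author's code: Φ(x) is taken to be a uniform random shift of at most r per coordinate, the stop condition is a fixed number of steps, the gradient is estimated by central differences, and the step size, iteration counts and the Rastrigin function in the usage part are arbitrary choices.

```python
import math
import random


def gradient_descent(f, x, step=0.002, iters=100, h=1e-6):
    """Local minimization: fixed-step gradient descent with a numerical gradient."""
    x = list(x)
    for _ in range(iters):
        grad = []
        for i in range(len(x)):
            x_plus = list(x)
            x_minus = list(x)
            x_plus[i] += h
            x_minus[i] -= h
            grad.append((f(x_plus) - f(x_minus)) / (2 * h))
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


def perturb(x, r, rng):
    """Phi(x): shift every coordinate by a uniform random amount in [-r, r] (assumed form)."""
    return [xi + rng.uniform(-r, r) for xi in x]


def mbh(f, x0, r, n_steps, rng):
    """Monotonic Basin Hopping following steps 1-3 of the slide."""
    x = list(x0)                                      # 1. start from x0
    for _ in range(n_steps):                          # 2. repeat until the stop condition
        x1 = gradient_descent(f, perturb(x, r, rng))  # 2.1-2.2. jump, then minimize locally
        if f(x1) < f(x):                              # 2.3. accept only improvements
            x = x1
    return x                                          # 3. return x


def rastrigin(x):
    """Rastrigin test function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


if __name__ == "__main__":
    start = [2.2, -1.3]
    best = mbh(rastrigin, start, r=1.0, n_steps=50, rng=random.Random(0))
    print(rastrigin(start), rastrigin(best))
```

Because step 2.3 accepts only strict improvements, the value f(x) never increases from one iteration to the next, which is what makes the method "monotonic".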
CUDA implementation of MBH
[Figure: the search area from Xmin to Xmax and from Ymin to Ymax, divided into square subareas indexed by i, j]
• Divide the search area into equal square subareas
• Each thread runs the algorithm in its own subarea
• Find the minimum among the results of all threads
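A sequential Python sketch of this decomposition and of the final reduction. It assumes a square search area [lo, hi] × [lo, hi] split into n × n subareas, and it replaces the per-thread MBH run with a simple random-sampling stand-in so that the example stays short; all function names and the `sphere` test function are hypothetical.

```python
import random


def cell_bounds(i, j, n, lo, hi):
    """Bounds (x_lo, x_hi, y_lo, y_hi) of subarea (i, j) in an n x n split of [lo, hi]^2."""
    side = (hi - lo) / n
    return (lo + i * side, lo + (i + 1) * side,
            lo + j * side, lo + (j + 1) * side)


def search_cell(f, bounds, samples, rng):
    """Stand-in for one thread's MBH run: best of a few random points in the subarea."""
    x_lo, x_hi, y_lo, y_hi = bounds
    best_value, best_point = None, None
    for _ in range(samples):
        point = (rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi))
        value = f(point)
        if best_value is None or value < best_value:
            best_value, best_point = value, point
    return best_value, best_point


def decomposed_minimum(f, n, lo, hi, samples=50, seed=0):
    """Run one search per subarea, then reduce the per-subarea results to one minimum."""
    rng = random.Random(seed)
    results = []
    for i in range(n):        # on a GPU, each (i, j) pair would map to one thread
        for j in range(n):
            results.append(search_cell(f, cell_bounds(i, j, n, lo, hi), samples, rng))
    return min(results, key=lambda result: result[0])


def sphere(p):
    """Simple test function with its minimum 0 at the origin."""
    return p[0] ** 2 + p[1] ** 2


if __name__ == "__main__":
    print(decomposed_minimum(sphere, n=4, lo=-4.0, hi=4.0))
```

The per-subarea searches do not depend on each other, so on a GPU they can run concurrently and only the last `min` step needs the results of all threads.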
CUDA implementation of MBH
Hardware used:
• GPU1 - Tesla 10: max threads per block = 512, max threads per dim = 512, max blocks per dim = 65535, number of multiprocessors = 30
• GPU2 - GeForce GT 525M: max threads per block = 1024, max threads per dim = 1024, max blocks per dim = 65535, number of multiprocessors = 2
• CPU - Intel Core 2 Duo T6400: number of cores = 2, clock speed = 2 GHz
CUDA implementation of MBH
Methodology of the experiment
• Four parameters: the radius of the MBH "jump" r, the maximum number of steps in the cycle N, the number of blocks launched Nb, and the number of threads per block Nt
• Nb and Nt are set explicitly
• The radius r is calculated as half of the diameter of a square subarea
• The number of cycle steps N is determined experimentally *
• Four test functions were selected: Ackley, Griewank, Rastrigin, Shubert
* Validity and timing criteria:
1. A result is considered valid if it differs from the tabulated value by less than 0.001
2. A parameter setting is considered valid if, on average, 9 runs out of 10 give the correct answer within the specified accuracy
3. The time is averaged over 20 runs of the program
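The radius rule and the three criteria can be written down as a short Python sketch. The function names are hypothetical, and the "diameter" of a square subarea is read here as its diagonal, which is an assumption about the slide's wording.

```python
import math


def jump_radius(lo, hi, cells_per_side):
    """Half the diagonal of one square subarea of the area [lo, hi] x [lo, hi]."""
    side = (hi - lo) / cells_per_side
    return side * math.sqrt(2) / 2


def is_accurate(found, tabulated, tol=0.001):
    """Criterion 1: the result differs from the tabulated value by less than tol."""
    return abs(found - tabulated) < tol


def is_valid_setting(found_values, tabulated, tol=0.001, required=0.9):
    """Criterion 2: at least 9 runs out of 10 are accurate."""
    hits = sum(is_accurate(v, tabulated, tol) for v in found_values)
    return hits / len(found_values) >= required


def average_time(times):
    """Criterion 3: the reported time is the mean over the runs (20 in the experiment)."""
    return sum(times) / len(times)


if __name__ == "__main__":
    print(jump_radius(-4.0, 4.0, 4))
    print(is_valid_setting([0.0005] * 9 + [0.5], tabulated=0.0))
    print(average_time([1.4, 1.6]))
```

Under this reading, N would be increased until `is_valid_setting` holds for the chosen Nb and Nt, and only then would the timing be measured.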
CUDA implementation of MBH
Results for Ackley function
Average execution time:
CPU 160 sec
GeForce GT 525M 35 sec
Tesla 10 1.5 sec
[Figure: two plots of the minimal time of finding the extremum (sec) against the number of threads per block, one curve per number of blocks]
CUDA implementation of MBH
Results for Griewank function
Average execution time:
CPU 155 sec
GeForce GT 525M 33 sec
Tesla 10 2.2 sec
[Figure: two plots of the minimal time of finding the extremum (sec) against the number of threads per block, one curve per number of blocks]
CUDA implementation of MBH
Results for Rastrigin function
Average execution time:
CPU 125 sec
GeForce GT 525M 28.5 sec
Tesla 10 2.0 sec
[Figure: two plots of the minimal time of finding the extremum (sec) against the number of threads per block, one curve per number of blocks]
CUDA implementation of MBH
Results for Shubert function
Average execution time:
CPU 300 sec
GeForce GT 525M 82 sec
Tesla 10 4.3 sec
[Figure: two plots of the minimal time of finding the extremum (sec) against the number of threads per block, one curve per number of blocks]
Plan for further investigation
• Use more sophisticated and accurate local optimization methods
• Improve the parallelization method
• Improve the algorithm for setting up the MBH "jump"
• Build a solution for molecular cluster modeling based on the MBH method
• Integrate the CUDA solution into the BNB-Grid project
• Describe the class of functions that can be processed effectively on GPUs
Summary
• A large share of PCs are equipped with GPUs
• A GPU is a multicore system
• CUDA is one of the technologies that provide high performance for GPU calculations
• A number of Grid projects already use CUDA
• Tests show that in some cases GPUs perform 5-100 times better than a CPU
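As a quick check of the last point, the speedups can be recomputed from the average execution times reported on the result slides. The sketch below only divides the CPU time by each GPU time; the dictionary layout and the function name `speedups` are illustrative choices.

```python
# Average execution times in seconds, copied from the result slides.
TIMES = {
    "Ackley":    {"CPU": 160.0, "GeForce GT 525M": 35.0, "Tesla 10": 1.5},
    "Griewank":  {"CPU": 155.0, "GeForce GT 525M": 33.0, "Tesla 10": 2.2},
    "Rastrigin": {"CPU": 125.0, "GeForce GT 525M": 28.5, "Tesla 10": 2.0},
    "Shubert":   {"CPU": 300.0, "GeForce GT 525M": 82.0, "Tesla 10": 4.3},
}


def speedups(times):
    """Return CPU time divided by GPU time for every function and GPU."""
    return {
        function: {
            device: seconds["CPU"] / gpu_seconds
            for device, gpu_seconds in seconds.items()
            if device != "CPU"
        }
        for function, seconds in times.items()
    }


if __name__ == "__main__":
    for function, by_device in speedups(TIMES).items():
        for device, ratio in by_device.items():
            print(f"{function:10s} {device:16s} {ratio:6.1f}x")
```

With these numbers the ratios run from roughly 3.7 (Shubert on the GeForce GT 525M) to roughly 107 (Ackley on the Tesla 10), which is the same order of magnitude as the 5-100 range quoted above.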
THANKS FOR YOUR ATTENTION!