An Auto-Tuning Framework for Parallel Multicore Stencil Calculations
-
8/10/2019 An Auto-Tuning Framework for Parallel Multicore Stencil Calculations
Software Engineering Seminar
Sebastian Hafen
An Auto-Tuning Framework
for Parallel Multicore Stencil Computations
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, Samuel Williams
-
Stencils
-
What is a Stencil Computation?
Nearest Neighbor Computations, e.g. finite difference between data points
Sweeps over a structured Grid
Like an n-dimensional Array
Iterative: i → i+1 → i+2
Left two: http://iopscience.iop.org/1749-4699/2/1/015005/fulltext
Middle: http://en.wikipedia.org/wiki/Stencil_(numerical_analysis) 5-Points-Stencil
//Stencil-loop
do k = 2, xLength-1, 1
  do i = 2, yLength-1, 1
    writeArray[k][i] = useStencil(k, i)
  enddo
enddo
//Stencil-function
function useStencil(k, i)
  int result = readArray[k][i]
             + readArray[k+1][i]
             + readArray[k-1][i]
             + readArray[k][i+1]
             + readArray[k][i-1]
  result = result / 5
  return result
endfunction
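The pseudocode above can be sketched as runnable Python. This is a minimal sketch under stated assumptions: the names use_stencil, sweep and iterate are illustrative, the grid is a plain list of lists, the average is computed in floating point, and the read and write arrays are swapped after every sweep to model the iteration from step i to step i+1.

```python
def use_stencil(read, k, i):
    """Average of a point and its four nearest neighbours (5-point stencil)."""
    return (read[k][i] + read[k + 1][i] + read[k - 1][i]
            + read[k][i + 1] + read[k][i - 1]) / 5.0


def sweep(read, write):
    """One sweep over the interior; boundary cells are left untouched."""
    for k in range(1, len(read) - 1):
        for i in range(1, len(read[0]) - 1):
            write[k][i] = use_stencil(read, k, i)


def iterate(grid, steps):
    """Run `steps` sweeps, swapping read and write arrays after each one."""
    # Both buffers start as copies, so the boundary values are present in each.
    read = [row[:] for row in grid]
    write = [row[:] for row in grid]
    for _ in range(steps):
        sweep(read, write)
        read, write = write, read
    return read
```

Calling iterate(grid, 1) on a 3x3 grid updates only the centre cell, since every other cell lies on the boundary.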
...Developer/books/OrOn2_PfTune/sgi_html/ch06.html
-
Search Engine
Runs all the different tuned versions of the stencil kernel
256^3 grids (16,777,216 Elements) initialized with random values
User can replace the original kernel with the fastest one
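The search step can be sketched roughly as follows. This is an illustration, not the framework's search engine: the functions kernel_plain, make_blocked_kernel and search are hypothetical names, the blocked loop nests merely stand in for "tuned versions", and a small 2D grid replaces the 256^3 grid so the sketch runs quickly.

```python
import random
import time


def kernel_plain(read, write, n):
    """Reference variant: straightforward loop nest over the interior."""
    for k in range(1, n - 1):
        for i in range(1, n - 1):
            write[k][i] = (read[k][i] + read[k + 1][i] + read[k - 1][i]
                           + read[k][i + 1] + read[k][i - 1]) / 5.0


def make_blocked_kernel(block):
    """Build a variant that walks the interior in block x block tiles."""
    def kernel(read, write, n):
        for kk in range(1, n - 1, block):
            for ii in range(1, n - 1, block):
                for k in range(kk, min(kk + block, n - 1)):
                    for i in range(ii, min(ii + block, n - 1)):
                        write[k][i] = (read[k][i] + read[k + 1][i] + read[k - 1][i]
                                       + read[k][i + 1] + read[k][i - 1]) / 5.0
    return kernel


def search(variants, n=64, repeats=3, seed=0):
    """Time every variant on the same random grid; return (best_name, timings)."""
    rng = random.Random(seed)
    read = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for name, kernel in variants.items():
        write = [row[:] for row in read]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(read, write, n)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings


# Hypothetical candidate set: one plain kernel plus three tile sizes.
variants = {"plain": kernel_plain}
for b in (8, 16, 32):
    variants[f"blocked_{b}"] = make_blocked_kernel(b)
```

Calling search(variants) returns the name of the fastest variant on the current machine together with all measured times; which variant wins depends on the hardware, which is the reason for searching instead of guessing.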
-
Limitations
Only 2D or 3D
Only Arrays
No sophisticated Data structures
Only arithmetic stencils
They want to change that in future work
-
Code Generator
Creates code from the modified ASTs
For the CPUs: pthreads
For the GPU: CUDA thread blocks
Serial Fortran and C code also possible
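To make the idea of a code generator concrete, here is a toy sketch that emits serial C source for an averaging stencil. It is only an analogy: the real generator works on modified ASTs, while this one takes a hypothetical list of (dk, di) offsets, and the names index and generate_c_kernel are made up for illustration. It assumes all offsets lie within one cell of the centre, matching the one-cell boundary in the emitted loops.

```python
def index(var, offset):
    """Render an index expression such as k, k+1 or k-1."""
    if offset == 0:
        return var
    return f"{var}{offset:+d}"


def generate_c_kernel(name, offsets):
    """Emit serial C source for an averaging stencil over the given (dk, di) offsets."""
    terms = [f"rd[{index('k', dk)}][{index('i', di)}]" for dk, di in offsets]
    body = "\n                     + ".join(terms)
    return (
        f"void {name}(int n, double rd[n][n], double wr[n][n]) {{\n"
        f"    for (int k = 1; k < n - 1; k++)\n"
        f"        for (int i = 1; i < n - 1; i++)\n"
        f"            wr[k][i] = ({body}) / {len(offsets)}.0;\n"
        f"}}\n"
    )


# The 5-point stencil from the earlier slide, written as offsets.
five_point = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
print(generate_c_kernel("stencil_5pt", five_point))
```

A pthreads or CUDA backend would wrap the same loop body in different parallel scaffolding; only the serial case is sketched here.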
-
Used Stencils
Laplacian Stencil
Divergence Stencil
Gradient Stencil
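The three stencils can be illustrated with standard central differences on a 3D grid. This is a sketch under assumptions: the slide does not give the exact coefficients, so the usual 7-point Laplacian and second-order central differences are used, the grid spacing h and the helper zeros are illustrative, and grids are nested Python lists with untouched boundary cells.

```python
def zeros(n):
    """Allocate an n x n x n grid of zeros as nested lists."""
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]


def laplacian(u, h=1.0):
    """7-point Laplacian: one grid in, one grid out."""
    n = len(u)
    out = zeros(n)
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                out[x][y][z] = (u[x + 1][y][z] + u[x - 1][y][z]
                                + u[x][y + 1][z] + u[x][y - 1][z]
                                + u[x][y][z + 1] + u[x][y][z - 1]
                                - 6.0 * u[x][y][z]) / (h * h)
    return out


def gradient(u, h=1.0):
    """Central-difference gradient: one grid in, three grids out."""
    n = len(u)
    gx, gy, gz = zeros(n), zeros(n), zeros(n)
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                gx[x][y][z] = (u[x + 1][y][z] - u[x - 1][y][z]) / (2.0 * h)
                gy[x][y][z] = (u[x][y + 1][z] - u[x][y - 1][z]) / (2.0 * h)
                gz[x][y][z] = (u[x][y][z + 1] - u[x][y][z - 1]) / (2.0 * h)
    return gx, gy, gz


def divergence(fx, fy, fz, h=1.0):
    """Central-difference divergence: three grids in, one grid out."""
    n = len(fx)
    out = zeros(n)
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                out[x][y][z] = ((fx[x + 1][y][z] - fx[x - 1][y][z])
                                + (fy[x][y + 1][z] - fy[x][y - 1][z])
                                + (fz[x][y][z + 1] - fz[x][y][z - 1])) / (2.0 * h)
    return out
```

The three kernels differ mainly in how many arrays they read and write per point, which changes their memory traffic even though the loop structure is the same.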
-
Results
-
One Result: Laplacian
-
Conclusion
Pro
It does work. Concept is proven
Fully general
Performance comparable to hand-optimized code
Programmer productivity benefits
Few minutes to annotate code
Contra
OpenMP works well too
New architecture means new coding
Peak not yet reached
Quote from Paper
-
End of Presentation