An Auto-Tuning Framework for Parallel Multicore Stencil Calculations

  • 8/10/2019 An Auto-Tuning Framework for Parallel Multicore Stencil Calculations

    Software Engineering Seminar

    Sebastian Hafen

    An Auto-Tuning Framework

    for Parallel Multicore Stencil Computations

    Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, Samuel Williams


    Stencils


    What is a Stencil Computation?

    Nearest-neighbor computations, e.g. finite differences between data points

    Sweeps over a structured grid

    Like an n-dimensional array

    Iterative: i → i+1 → i+2

    Left two: http://iopscience.iop.org/1749-4699/2/1/015005/fulltext

    Middle: http://en.wikipedia.org/wiki/Stencil_(numerical_analysis) (5-point stencil)

    // Stencil loop: sweep over all interior grid points
    do k = 2, xLength-1, 1
      do i = 2, yLength-1, 1
        writeArray[k][i] = useStencil(k, i)
      enddo
    enddo

    // Stencil function: average of a point and its four nearest neighbors
    function useStencil(k, i)
      int result = readArray[k][i]
                 + readArray[k+1][i]
                 + readArray[k-1][i]
                 + readArray[k][i+1]
                 + readArray[k][i-1]
      result = result / 5
      return result
    endfunction
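The pseudocode above, combined with the iterative scheme (i → i+1 → i+2), can be sketched as a runnable Python program. This is only an illustration: the function names `sweep` and `iterate` are made up here, floating-point values replace the slide's `int`, and the two-buffer swap is one common way to iterate, not something the slides specify.

```python
def sweep(read, write):
    """One sweep: each interior point becomes the mean of itself and its 4 neighbors."""
    rows, cols = len(read), len(read[0])
    for k in range(1, rows - 1):          # 0-based indexing, so interior starts at 1
        for i in range(1, cols - 1):
            write[k][i] = (read[k][i] + read[k + 1][i] + read[k - 1][i]
                           + read[k][i + 1] + read[k][i - 1]) / 5.0


def iterate(grid, steps):
    """Run `steps` sweeps, swapping the read and write buffers after each one."""
    read = [row[:] for row in grid]
    write = [row[:] for row in grid]      # boundary values are copied once and never change
    for _ in range(steps):
        sweep(read, write)
        read, write = write, read         # the freshly written grid is read next
    return read


grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 5.0                          # single hot spot in the center
result = iterate(grid, 1)
```

After one sweep the center value is spread evenly over the center point and its four direct neighbors, while diagonal neighbors stay untouched.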

    (k+1ieveloper/books/OrOn2_PfTune/sgi_html/ch06.html


    Search Engine

    Runs all the different tuned versions of the stencil kernel

    256³ grids (16,777,216 elements) initialized with random values

    User can replace the original kernel with the fastest one
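The search step described above (time every tuned variant on the same random grid, keep the fastest) can be sketched as follows. The two kernel variants, the `search` function and the small 64×64 grid are assumptions chosen for illustration; the framework itself runs generated kernels on 256³ grids.

```python
import random
import time


def kernel_naive(read, write, n):
    for k in range(1, n - 1):
        for i in range(1, n - 1):
            write[k][i] = (read[k][i] + read[k + 1][i] + read[k - 1][i]
                           + read[k][i + 1] + read[k][i - 1]) / 5.0


def kernel_row_cached(read, write, n):
    # Same arithmetic, but the row lookups are hoisted out of the inner loop.
    for k in range(1, n - 1):
        up, mid, down, out = read[k - 1], read[k], read[k + 1], write[k]
        for i in range(1, n - 1):
            out[i] = (mid[i] + down[i] + up[i] + mid[i + 1] + mid[i - 1]) / 5.0


def search(variants, n=64, repeats=3, seed=0):
    """Time every variant on the same random grid; return (best_name, timings)."""
    rng = random.Random(seed)
    read = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for name, kernel in variants.items():
        best = float("inf")
        for _ in range(repeats):          # keep the best of several runs to reduce noise
            write = [[0.0] * n for _ in range(n)]
            start = time.perf_counter()
            kernel(read, write, n)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings


variants = {"naive": kernel_naive, "row_cached": kernel_row_cached}
best_name, timings = search(variants)
```

Which variant wins depends on the machine, which is the point of searching empirically instead of predicting the winner.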


    Limitations

    Only 2D or 3D

    Only arrays

    No sophisticated data structures

    Only arithmetic stencils

    They want to change that in future work


    Code Generator

    Creates code from the modified ASTs

    For the CPUs: pthreads

    For the GPU: CUDA thread blocks

    Serial Fortran and C code also possible
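To make the idea of generating kernel source from a stencil description concrete, here is a toy generator that emits a serial C loop nest as text. It is not the framework's generator: it works from a plain list of offsets instead of an AST, and the names `generate_c_kernel`, `sweep_5pt` and the emitted function signature are all hypothetical.

```python
def generate_c_kernel(name, offsets, weight):
    """Return C source for a 2D stencil sweep built from (dk, di) offsets.

    Assumes every offset lies within one cell of the center, which is
    what the fixed loop bounds (1 .. n-2) in the emitted code allow.
    """
    def index(dk, di):
        k = "k" if dk == 0 else f"k{dk:+d}"
        i = "i" if di == 0 else f"i{di:+d}"
        return f"in[({k}) * ny + ({i})]"

    terms = " + ".join(index(dk, di) for dk, di in offsets)
    return "\n".join([
        f"void {name}(const double *in, double *out, int nx, int ny) {{",
        "    for (int k = 1; k < nx - 1; k++) {",
        "        for (int i = 1; i < ny - 1; i++) {",
        f"            out[k * ny + i] = {weight} * ({terms});",
        "        }",
        "    }",
        "}",
    ])


five_point = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
source = generate_c_kernel("sweep_5pt", five_point, 0.2)
print(source)
```

A real generator would additionally apply the tuning transformations and wrap the loop nest in pthreads or CUDA launch code; this sketch only shows the final text-emission step.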


    Used Stencils

    Laplacian stencil

    Divergence stencil

    Gradient stencil
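For readers unfamiliar with the three operators, the sketch below evaluates each at a single interior point of a 3D grid. It uses textbook central differences with unit grid spacing; the exact coefficients and data layout used in the paper are not given in these slides, so treat the formulas and the helper `make_grid` as assumptions.

```python
def laplacian(u, x, y, z):
    """7-point Laplacian of scalar grid u at an interior point (unit spacing)."""
    return (u[x + 1][y][z] + u[x - 1][y][z]
            + u[x][y + 1][z] + u[x][y - 1][z]
            + u[x][y][z + 1] + u[x][y][z - 1]
            - 6.0 * u[x][y][z])


def gradient(u, x, y, z):
    """Central-difference gradient: one scalar grid in, three components out."""
    return ((u[x + 1][y][z] - u[x - 1][y][z]) / 2.0,
            (u[x][y + 1][z] - u[x][y - 1][z]) / 2.0,
            (u[x][y][z + 1] - u[x][y][z - 1]) / 2.0)


def divergence(fx, fy, fz, x, y, z):
    """Central-difference divergence: three component grids in, one scalar out."""
    return ((fx[x + 1][y][z] - fx[x - 1][y][z]) / 2.0
            + (fy[x][y + 1][z] - fy[x][y - 1][z]) / 2.0
            + (fz[x][y][z + 1] - fz[x][y][z - 1]) / 2.0)


def make_grid(n, f):
    """Build an n*n*n nested list filled with f(x, y, z)."""
    return [[[float(f(x, y, z)) for z in range(n)] for y in range(n)] for x in range(n)]


n = 4
u = make_grid(n, lambda x, y, z: x * x + y * y + z * z)
fx = make_grid(n, lambda x, y, z: x)
fy = make_grid(n, lambda x, y, z: y)
fz = make_grid(n, lambda x, y, z: z)
```

The three differ in their memory traffic: the Laplacian reads and writes one grid, the gradient reads one grid and writes three, and the divergence reads three grids and writes one.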


    Results


    One Result: Laplacian


    Conclusion

    Pro

    It does work. Concept is proven

    Fully general

    Performance comparable to hand-optimized code

    Programmer productivity benefits

    Few minutes to annotate code

    Contra

    OpenMP works well too

    New architecture means new coding

    Peak not yet reached

    Quote from paper


    End of Presentation