Language and Compiler Support for Auto-Tuning Variable-Accuracy Algorithms
Jason Ansel, Yee Lok Wong, Cy Chan, Marek Olszewski, Alan Edelman, Saman Amarasinghe
MIT - CSAIL
April 4, 2011
Jason Ansel (MIT) PetaBricks April 4, 2011 1 / 30
Outline
1 Motivating Example
2 PetaBricks Language Overview
3 Variable Accuracy
4 Autotuner
5 Results
6 Conclusions
A motivating example
How would you write a fast sorting algorithm?
Insertion sort
Quick sort
Merge sort
Radix sort
Binary tree sort, Bitonic sort, Bubble sort, Bucket sort, Burstsort, Cocktail sort, Comb sort, Counting sort, Distribution sort, Flashsort, Heapsort, Introsort, Library sort, Odd-even sort, Postman sort, Samplesort, Selection sort, Shell sort, Stooge sort, Strand sort, Timsort?
Poly-algorithms
std::stable_sort
/usr/include/c++/4.5.2/bits/stl_algo.h lines 3350-3367
std::sort
/usr/include/c++/4.5.2/bits/stl_algo.h lines 2163-2167
Why 16? Why 15?
Dates back to at least 2000 (Jun 2000 SGI release)
Still in current C++ STL shipped with GCC
10+ years of _S_threshold = 16
Is 15 the right number?
The best cutoff (CO) changes
Depends on competing costs:
  Cost of computation (< operator, call overhead, etc.)
  Cost of communication (swaps)
  Cache behavior (misses, prefetcher, locality)
Sorting 100000 doubles with std::stable_sort:
  CO ≈ 200 optimal on a Phenom 905e (15% speedup over CO = 15)
  CO ≈ 400 optimal on an Opteron 6168 (15% speedup over CO = 15)
  CO ≈ 500 optimal on a Xeon E5320 (34% speedup over CO = 15)
  CO ≈ 700 optimal on a Xeon X5460 (25% speedup over CO = 15)
The compiler's hands are tied: it is stuck with 15
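To make the cutoff concrete, here is a minimal Python sketch of a merge sort that hands short ranges to insertion sort, with the cutoff exposed as a parameter and timed empirically. It is not the libstdc++ code from the slide. The names `merge_sort`, `time_cutoff`, and `best_cutoff`, and the default problem size, are assumptions made for illustration.

```python
import random
import time


def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place; cheap for short ranges."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def merge_sort(a, cutoff, lo=0, hi=None):
    """Stable merge sort; ranges shorter than `cutoff` go to insertion sort."""
    if hi is None:
        hi = len(a)
    if hi - lo < max(cutoff, 2):
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    merge_sort(a, cutoff, lo, mid)
    merge_sort(a, cutoff, mid, hi)
    left, right = a[lo:mid], a[mid:hi]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        # Take from the right half only when strictly smaller (keeps stability).
        if right[j] < left[i]:
            a[k] = right[j]
            j += 1
        else:
            a[k] = left[i]
            i += 1
        k += 1
    # Exactly one of the two tails is non-empty here.
    a[k:hi] = left[i:] + right[j:]


def time_cutoff(cutoff, n=20000, seed=0):
    """Wall-clock seconds to sort n random doubles with the given cutoff."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    start = time.perf_counter()
    merge_sort(data, cutoff)
    return time.perf_counter() - start


def best_cutoff(candidates, n=20000):
    """Pick the candidate cutoff with the lowest measured time on this machine."""
    return min(candidates, key=lambda c: time_cutoff(c, n))
```

Calling `best_cutoff([4, 15, 50, 200, 700])` on different machines may return different values, which is the point of the slide. The result depends on the hardware and on the Python interpreter's own overheads, so it will not reproduce the C++ numbers above.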
Back to our motivating example
How would you write a fast sorting algorithm?
Insertion sort
Quick sort
Merge sort
Radix sort
Binary tree sort, Bitonic sort, Bubble sort, Bucket sort, Burstsort, Cocktail sort, Comb sort, Counting sort, Distribution sort, Flashsort, Heapsort, Introsort, Library sort, Odd-even sort, Postman sort, Samplesort, Selection sort, Shell sort, Stooge sort, Strand sort, Timsort?
Poly-algorithms
Answer
It depends!
Autotuned parallel sorting algorithms
On a Xeon E7340 (2 × 4 cores)
  1. Insertion sort below 600
  2. Quick sort below 1420
  3. 2-way parallel merge sort
On a Sun Fire T200 Niagara (8 cores)
  1. 16-way merge sort below 75
  2. 8-way merge sort below 1461
  3. 4-way merge sort below 2400
  4. 2-way parallel merge sort
235% slowdown running Niagara algorithm on the Xeon
8% slowdown running Xeon algorithm on the Niagara
Need a way to express these algorithmic choices to enable autotuning
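The Xeon configuration above can be read as a size-based dispatch: every call, including recursive ones, looks up the input size and picks an algorithm. The Python sketch below illustrates that reading. The class `PolySort` and its method `choose` are hypothetical names, and the merge step is sequential here, whereas the tuned configuration runs it in parallel.

```python
import bisect
import heapq


def insertion_sort(xs, sort=None):
    """Return a sorted copy; `sort` is unused because there is no recursion."""
    out = list(xs)
    for i in range(1, len(out)):
        x = out[i]
        j = i
        while j > 0 and out[j - 1] > x:
            out[j] = out[j - 1]
            j -= 1
        out[j] = x
    return out


def quick_sort(xs, sort):
    """Partition around a pivot; sub-problems go back through the dispatcher."""
    pivot = xs[len(xs) // 2]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return sort(less) + equal + sort(greater)


def merge_sort(xs, sort):
    """2-way merge sort; each half is sorted by whatever the dispatcher picks."""
    if len(xs) < 2:
        return list(xs)
    mid = len(xs) // 2
    return list(heapq.merge(sort(xs[:mid]), sort(xs[mid:])))


class PolySort:
    """Selects an algorithm by input size (hypothetical helper, not PetaBricks)."""

    def __init__(self, cutoffs, algorithms):
        # algorithms[i] handles sizes below cutoffs[i]; the last one handles the rest.
        assert len(algorithms) == len(cutoffs) + 1
        self.cutoffs = list(cutoffs)
        self.algorithms = list(algorithms)

    def choose(self, n):
        return self.algorithms[bisect.bisect_right(self.cutoffs, n)]

    def __call__(self, xs):
        return self.choose(len(xs))(xs, self)


# Cutoffs taken from the Xeon E7340 configuration on the slide.
xeon_sort = PolySort([600, 1420], [insertion_sort, quick_sort, merge_sort])
```

An autotuner would then search over the `cutoffs` and `algorithms` lists rather than over the sorting code itself, so the same program can carry a different configuration on each machine.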
Outline
1 Motivating Example
2 PetaBricks Language Overview
3 Variable Accuracy
4 Autotuner
5 Results
6 Conclusions
Algorithmic choices
Language
either {
  InsertionSort(out, in);
} or {
  QuickSort(out, in);
} or {
  MergeSort(out, in);
} or {
  RadixSort(out, in);
}
⇒
Representation
Decision tree synthesized by our evolutionary algorithm (EA)
The PetaBricks language
Choices expressed in the language
  High-level algorithmic choices
  Dependency-based synthesized outer control flow
  Parallelization strategy
Programs automatically adapt to their environment
  Tuned using our bottom-up evaluation algorithm
  Offline autotuner or always-on online autotuner
Outline
1 Motivating Example
2 PetaBricks Language Overview
3 Variable Accuracy
4 Autotuner
5 Results
6 Conclusions
Variable accuracy algorithms
Many problems don’t have a single correct answer
Soft computing
Approximation algorithms for NP-hard problems
DSP algorithms
  Different grid resolutions
  Data precisions
Iterative algorithms
Choosing convergence criteria
Variable accuracy example
Example
...
for (int i = 0; i < 100; ++i) {
  SORIteration(tmp);
}
...
Competing objectives of performance and accuracy
Must maximize performance while meeting accuracy targets
Accuracy metrics and for_enough loops
Language
accuracy_metric MyRMSError
...
for_enough {
  SORIteration(tmp);
}
⇒
Representation
Function from problem size to number of iterations synthesized by our EA
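As a rough illustration of what "enough" iterations means, the Python sketch below runs SOR sweeps on a small 1-D model problem and counts how many are needed before an RMS error metric meets a target. The slides do not show the definition of MyRMSError, so `rms_residual` is an assumed stand-in. The model problem, the relaxation factor `omega=1.5`, and the name `iterations_needed` are likewise assumptions. This direct search only shows the quantity being tuned and is not the autotuner itself.

```python
import math


def sor_iteration(u, rhs, omega=1.5):
    """One in-place SOR sweep for 2*u[i] - u[i-1] - u[i+1] = rhs[i], zero boundaries."""
    n = len(u)
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        gauss_seidel = (rhs[i] + left + right) / 2.0
        u[i] += omega * (gauss_seidel - u[i])


def rms_residual(u, rhs):
    """Assumed accuracy metric: root-mean-square of the residual (lower is better)."""
    n = len(u)
    total = 0.0
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r = rhs[i] - (2.0 * u[i] - left - right)
        total += r * r
    return math.sqrt(total / n)


def iterations_needed(n, target, max_iters=100000):
    """Smallest number of sweeps after which the metric meets `target` for size n."""
    rhs = [1.0] * n
    u = [0.0] * n
    for k in range(max_iters + 1):
        if rms_residual(u, rhs) <= target:
            return k
        sor_iteration(u, rhs)
    raise RuntimeError("accuracy target not reached")


if __name__ == "__main__":
    # A tiny table from problem size to iteration count for one accuracy target.
    for n in (8, 16, 32):
        print(n, iterations_needed(n, 1e-6))
```

The iteration count changes with the problem size and with the target, which is why a fixed constant such as 100 is either wasteful or insufficient depending on the input.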
Accuracy variables
Language
accuracy_metric MyRMSError
accuracy_variable k
...
for (int i = 0; i < k; ++i) {
  SORIteration(tmp);
}
⇒
Representation
Function from problem size to k synthesized by our EA
Variable accuracy and algorithmic choices
Language
accuracy_metric MyRMSError
...
either {
  for_enough {
    SORIteration(tmp);
  }
} or {
  Multigrid(tmp);
} or {
  DirectSolve(tmp);
}
Outline
1 Motivating Example
2 PetaBricks Language Overview
3 Variable Accuracy
4 Autotuner
5 Results
6 Conclusions
Traditional evolution algorithm
Initial population 72.7s 10.5s 4.1s 31.2s Cost = 118.5
Generation 2 4.2s 5.1s 2.6s 13.2s Cost = 25.1
Generation 3 2.8s 0.1s 3.8s 2.3s Cost = 9.0
Generation 4 0.3s 0.1s 0.4s 2.4s Cost = 3.2
Cost of autotuning front-loaded in initial (unfit) population
We could speed up tuning if we start with a faster initial population
Key insight
Smaller input sizes can be used to form better initial population
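The front-loading claim can be checked directly from the timings on the slide. This short Python sketch sums each generation's evaluation times and computes the share of the total tuning cost spent on the initial population. Only the variable names are added here; the numbers are the ones shown above.

```python
# Per-candidate evaluation times in seconds, copied from the slide.
generations = {
    "Initial population": [72.7, 10.5, 4.1, 31.2],
    "Generation 2": [4.2, 5.1, 2.6, 13.2],
    "Generation 3": [2.8, 0.1, 3.8, 2.3],
    "Generation 4": [0.3, 0.1, 0.4, 2.4],
}

# Cost of a generation is the sum of the time spent testing each candidate.
costs = {name: sum(times) for name, times in generations.items()}
total_cost = sum(costs.values())
initial_share = costs["Initial population"] / total_cost

if __name__ == "__main__":
    for name, cost in costs.items():
        print(f"{name}: {cost:.1f}s")
    print(f"total: {total_cost:.1f}s, initial share: {initial_share:.0%}")
```

With these numbers, roughly three quarters of the tuning time goes to evaluating the first, unfit population, which is the cost the bottom-up approach tries to avoid.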
Bottom-up evolutionary algorithm
Train on input size 1, to form initial population for:
Train on input size 2, to form initial population for:
Train on input size 8, to form initial population for:
Train on input size 16, to form initial population for:
Train on input size 32, to form initial population for:
Train on input size 64
Naturally exploits optimal substructure of problems
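The bottom-up idea can be sketched in a few lines. This is a minimal illustration, not the PetaBricks autotuner: the single tunable (`cutoff`, the size below which a merge sort switches to insertion sort), the analytic `cost` model that stands in for timing real runs, the doubling schedule, and the mutation rule are all assumptions made for this example.

```python
import random

def cost(cutoff, n):
    """Hypothetical cost model (replaces timing a real run): merge sort that
    switches to a quadratic insertion sort for inputs of size <= cutoff."""
    if n <= max(cutoff, 1):
        return n * n / 4.0
    half = n // 2
    return cost(cutoff, half) + cost(cutoff, n - half) + 4.0 * n

def mutate(cutoff, rng):
    """Randomly nudge the cutoff up or down, never below 1."""
    return max(1, cutoff + rng.choice([-1, 1]) * rng.randint(1, 8))

def tune_bottom_up(max_size, pop_size=4, children=8, seed=0):
    """Train on sizes 1, 2, 4, ... and reuse each surviving population
    as the initial population for the next, larger size."""
    rng = random.Random(seed)
    population = [1]          # trivial starting configuration
    history = []
    size = 1
    while size <= max_size:
        candidates = set(population)
        for parent in population:
            for _ in range(children):
                candidates.add(mutate(parent, rng))
        # Keep the cheapest candidates for this size; ties go to smaller cutoffs.
        population = sorted(candidates, key=lambda c: (cost(c, size), c))[:pop_size]
        history.append((size, population[0]))
        size *= 2
    return population[0], history

best, history = tune_bottom_up(64)
for size, cutoff in history:
    print(f"size {size:3d}: best cutoff {cutoff}")
```

Every candidate evaluated at a large size descends from one that was already good at half that size, so the expensive large inputs are never run against a completely unfit population.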
Variable accuracy autotuner
Generation i
⇒ Generation i + 1
Partition accuracy space into discrete levels
Prune population to have a fixed number of representatives from each level
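One possible reading of this pruning step is sketched below. The candidate layout `(name, accuracy, time)`, the `level_bounds` list, and the rule "assign each candidate to the highest level whose target it meets, then keep the fastest few per level" are assumptions for illustration; the slide does not specify them. The sketch also assumes that a larger accuracy value is better.

```python
from bisect import bisect_right
from collections import defaultdict

def prune(population, level_bounds, keep_per_level):
    """Keep at most `keep_per_level` of the fastest candidates in each accuracy level.

    population   -- list of (name, accuracy, time) tuples (hypothetical layout)
    level_bounds -- ascending accuracy targets that delimit the discrete levels
    """
    bins = defaultdict(list)
    for cand in population:
        # Index of the highest target this candidate meets; -1 means it meets none.
        level = bisect_right(level_bounds, cand[1]) - 1
        if level >= 0:
            bins[level].append(cand)
    survivors = []
    for level in sorted(bins):
        fastest = sorted(bins[level], key=lambda c: c[2])[:keep_per_level]
        survivors.extend(fastest)
    return survivors

population = [
    ("a", 0.35, 1.0), ("b", 0.40, 2.0),   # both fall in the level starting at 0.3
    ("c", 0.65, 3.0), ("d", 0.70, 2.5),   # both fall in the level starting at 0.6
    ("e", 0.90, 9.0),                     # slow, but the only one at the top level
    ("f", 0.10, 0.1),                     # fastest overall, meets no target
]
print([name for name, _, _ in prune(population, [0.3, 0.6, 0.8], 1)])
```

Pruning per level rather than globally is what keeps slow but accurate candidates such as `e` alive; a purely time-based selection would discard them in favour of fast, inaccurate ones.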
Changing accuracy requirements
[Figure: Image Compression. Speedup (x) versus input size (10 to 10000), one curve per accuracy level: 2.0, 1.5, 1.0, 0.8, 0.6, 0.3.]
[Figure: speedup (x) versus input size for six benchmarks, one curve per accuracy level.]
Clustering: accuracy levels 0.95, 0.75, 0.50, 0.20, 0.10, 0.05
Bin Packing: accuracy levels 1.01, 1.1, 1.2, 1.3, 1.4
Image Compression: accuracy levels 2.0, 1.5, 1.0, 0.8, 0.6, 0.3
3D Helmholtz: accuracy levels 10^9, 10^7, 10^5, 10^3, 10^1
2D Poisson: accuracy levels 10^9, 10^7, 10^5, 10^3, 10^1
Preconditioner: accuracy levels 3.0, 2.0, 1.5, 1.0, 0.5, 0.0
Multigrid choice space
Pseudo code:
accuracy_metric MyRMSError
...
either {
  for_enough {
    SORIteration(tmp);
  }
} or {
  Multigrid(tmp);
} or {
  DirectSolve(tmp);
}
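To make the choice space concrete, here is a small sketch on a 1D Poisson problem (-u'' = f with zero boundary values). It is an illustration under stated assumptions, not PetaBricks code: the residual-based `accuracy` function is a stand-in for `MyRMSError`, the sweep count plays the role of the tunable behind `for_enough`, the `Multigrid` branch is omitted for brevity, and all function names are hypothetical.

```python
import math

def residual_rms(u, f, h):
    """RMS of the residual f + u'' over the interior grid points."""
    total = 0.0
    for i in range(1, len(u) - 1):
        r = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
        total += r * r
    return math.sqrt(total / (len(u) - 2))

def accuracy(u, f, h):
    """Hypothetical metric: residual of the zero guess divided by residual of u."""
    before = residual_rms([0.0] * len(u), f, h)
    after = residual_rms(u, f, h)
    return math.inf if after == 0.0 else before / after

def sor_iteration(u, f, h, omega=1.5):
    """One in-place successive over-relaxation sweep."""
    for i in range(1, len(u) - 1):
        gauss_seidel = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
        u[i] += omega * (gauss_seidel - u[i])

def direct_solve(f, h):
    """Thomas algorithm for the tridiagonal system (-1, 2, -1) u = h^2 f."""
    n = len(f) - 2                      # number of interior unknowns
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = -0.5
    dp[0] = h * h * f[1] / 2.0
    for i in range(1, n):               # forward elimination
        m = 2.0 + cp[i - 1]
        cp[i] = -1.0 / m
        dp[i] = (h * h * f[i + 1] + dp[i - 1]) / m
    u = [0.0] * (n + 2)
    u[n] = dp[n - 1]
    for i in range(n - 2, -1, -1):      # back substitution
        u[i + 1] = dp[i] - cp[i] * u[i + 2]
    return u

def run(choice, f, h, sweeps=0):
    """Execute one point in the choice space: "sor" with a sweep count, or "direct"."""
    if choice == "direct":
        return direct_solve(f, h)
    u = [0.0] * len(f)
    for _ in range(sweeps):             # plays the role of for_enough
        sor_iteration(u, f, h)
    return u

def min_sweeps(f, h, target, max_sweeps=500):
    """Smallest sweep count that meets the accuracy target, or None if none does."""
    u = [0.0] * len(f)
    for k in range(1, max_sweeps + 1):
        sor_iteration(u, f, h)
        if accuracy(u, f, h) >= target:
            return k
    return None

n = 17
h = 1.0 / (n - 1)
f = [1.0] * n
for target in (10.0, 1000.0):
    print(f"target {target:g}: SOR needs {min_sweeps(f, h, target)} sweeps")
print("direct solve accuracy:", accuracy(run("direct", f, h), f, h))
```

An autotuner would time each qualifying configuration and keep the fastest one per accuracy level; this sketch only shows that the same accuracy target can be met by different branches with different amounts of work.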
[Figure: multigrid choice space diagram. Axes labeled Grid Size (128, 64, 32, 16) and Time; labeled steps: SOR Iteration, Direct Solve.]
Autotuned V-cycle shapes
[Figure: autotuned V-cycle shapes for accuracy levels 10^1, 10^3, 10^5, and 10^7; vertical axis: grid size (2048 down to 16).]
Autotuned bin packing algorithms
Conclusions
Motivating goal of PetaBricks
Make programs future-proof by allowing them to adapt to their environment.
We can do better than hard-coded constants!
Thanks!
Questions?
http://projects.csail.mit.edu/petabricks/