Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p...
Transcript of Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p...
![Page 1: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/1.jpg)
Multithreaded algorithms9/28/20
![Page 2: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/2.jpg)
Administrivia
• HW 2 due Monday night
• For Wednesday, finish reading section 27.1
![Page 3: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/3.jpg)
Big Picture
• Essentially every computer is now multicore• Means it can run multiple parts of a program at the same time
• Threads• Main abstraction for shared memory computing
![Page 4: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/4.jpg)
Recall: Naïve Fibonacci implementation
Fib(n)if(n <= 1)
return nelse
x = Fib(n-1)y = Fib(n-2)return x + y
![Page 5: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/5.jpg)
Parallel version
P_Fib(n)if(n <= 1)
return nelse
x = spawn P_Fib(n-1)y = P_Fib(n-2)syncreturn x + y
Allows child procedure to run in parallel with its parent
Causes parent to wait for all children to complete
![Page 6: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/6.jpg)
Parallel version
P_Fib(n)if(n <= 1)
return nelse
x = spawn P_Fib(n-1)y = P_Fib(n-2)syncreturn x + y
Can represent runtime behavior using a DAG (directed acyclic graph)
Vertices are strands (instructions w/o spawn or sync)
Edges show precedence constraints
![Page 7: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/7.jpg)
Parallel version
P_Fib(n)if(n <= 1)
return nelse
x = spawn P_Fib(n-1)y = P_Fib(n-2)syncreturn x + y
Can represent runtime behavior using a DAG (directed acyclic graph)
Vertices are strands (instructions w/o spawn or sync)
Edges show precedence constraints
Metrics:Work T1 = time on one processorSpan T∞ = length of longest path
![Page 8: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/8.jpg)
Which of the following is a lower bound on Tp, the time on p processors?
A. T1 + p
B. T1 - p
C. p T1
D. T1 / p
E. Not exactly one of the above
![Page 9: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/9.jpg)
Which of the following is a lower bound on Tp, the time on p processors?
A. T1 + p
B. T1 - p
C. p T1
D. T1 / p (called the work law)
E. Not exactly one of the above
![Page 10: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/10.jpg)
Which of the following is a lower bound on Tp, the time on p processors?
A. T∞
B. T∞ - p
C. pT∞
D. T∞ / p
E. Not exactly one of the above
![Page 11: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/11.jpg)
Which of the following is a lower bound on Tp, the time on p processors?
A. T∞ (called the span law)
B. T∞ - p
C. pT∞
D. T∞ / p
E. Not exactly one of the above
![Page 12: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/12.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
![Page 13: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/13.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Corollary: The running time of any such scheduler is within a factor of 2 of optimal
![Page 14: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/14.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Corollary: The running time of any such scheduler is within a factor of 2 of optimal
Another metric: parallelism = T1/ T∞(intuitively, the number of processors we can use)
![Page 15: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/15.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 16: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/16.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 17: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/17.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 18: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/18.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 19: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/19.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 20: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/20.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 21: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/21.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 22: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/22.jpg)
Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time
Tp ⩽ T1/p + T∞
Proof: Charge each time step to one of the terms.Charge complete steps (when all processors are busy) to T1/p.Suppose to contrary that every processor is busy > ⌊T1/p⌋ time steps.Then these steps do work ≥ p(⌊T1/p⌋ + 1) = p ⌊T1/p⌋ + p
= T1 – (T1 mod p) + p> T1
Charge incomplete steps (at least one processor idle) to T∞.If a processor is idle, it must schedule every strand w/o incomplete prerequisites. Every path must start with one of these, so this shortens every critical path, which can happen at most T∞ times.
![Page 23: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/23.jpg)
Combining subcomputations
A BA
B
![Page 24: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/24.jpg)
Combining subcomputations
Work: T1(A) + T1(B) Work: T1(A) + T1(B)
A BA
B
![Page 25: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/25.jpg)
Combining subcomputations
Work: T1(A) + T1(B)
Span: T∞(A) + T ∞(B)
Work: T1(A) + T1(B)
Span: max{ T∞(A), T ∞(B) }
A BA
B
![Page 26: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/26.jpg)
What is the span of the following code?for(int i=0; i < n; i++)
spawn A[i] = B[i];syncA. ⍬(1)B. ⍬(log n)C. ⍬(n1/2)D. ⍬(n)E. None of the above
![Page 27: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/27.jpg)
What is the span of the following code?for(int i=0; i < n; i++)
spawn A[i] = B[i];syncA. ⍬(1)B. ⍬(log n)C. ⍬(n1/2)D. ⍬(n)E. None of the above
![Page 28: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/28.jpg)
Alternate idea of a for loop
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
![Page 29: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/29.jpg)
What recurrence gives the work of this?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
(n = e – s + 1) A. T1(n) = T1(n/2) + 1B. T1(n) = T1(n/2) + nC. T1(n) = 2T1(n/2) + 1D. T1(n) = 2T1(n/2) + nE. None of the above
![Page 30: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/30.jpg)
What recurrence gives the work of this?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
(n = e – s + 1) A. T1(n) = T1(n/2) + 1B. T1(n) = T1(n/2) + nC. T1(n) = 2T1(n/2) + 1D. T1(n) = 2T1(n/2) + nE. None of the above
![Page 31: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/31.jpg)
What is the work of this?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
A. ⍬(1)B. ⍬(log n)C. ⍬(n)D. ⍬(n log n)E. None of the above
![Page 32: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/32.jpg)
What is the work of this?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
A. ⍬(1)B. ⍬(log n)C. ⍬(n)D. ⍬(n log n)E. None of the above
![Page 33: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/33.jpg)
What is the span?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
A. ⍬(1)B. ⍬(log n)C. ⍬(n0.5)D. ⍬(n)E. None of the above
![Page 34: Multithreaded - Knox Collegecourses.knox.edu/cs205/notes/Multithreaded.pdf · Th: With p processors, any scheduler that is never voluntarily idle completes a computation in time T](https://reader033.fdocuments.in/reader033/viewer/2022052100/6039c2f9b14240584221c5c6/html5/thumbnails/34.jpg)
What is the span?
void do_it(int s, int e) {if(s == e)
A[s] = B[s]else {
spawn do_it(s, (s+e)/2)do_it((s+e)/2+1, e)sync
}}…do_it(1,n)
A. ⍬(1)B. ⍬(log n)C. ⍬(n0.5)D. ⍬(n)E. None of the above