8/12/2019 Parallel vs Distributed
Eugene Magnier Astro 734 : Lecture 09
Eugene Magnier
Astronomy 734, Spring 2006
Parallel and Distributed Processing
Lecture Overview
Motivations
understanding bottlenecks
multiplex processing or I/O?
Types of Parallel and Distributed Processing
Multitasking vs Multithreading
Multicomputer vs multiprocessor
Parallel processing vs distributed processing
MPI vs PVM
Condor
PanControl
PanTasks
Overview - Motivations
The Problem: processing one-at-a-time takes too long
The Solution: do more than one at a time!
Understanding your bottlenecks: too much data or too much work?
measure your processing speed
add timing points within the code
time complete, representative jobs
count your data I/Os
is it local or network?
measure your I/Os
time dd if=file of=/dev/null
examine your throughputs: seconds for processing? seconds for I/O?
compare CPU Gigacycles / sec to I/O Megabytes / sec
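The measurements above can be sketched in a few lines of Python. This is only an illustration: the function names, the 4 MB test file, and the byte sum used as stand-in "processing" are all assumptions, not part of the lecture. Note that a file which was just written is usually served from the OS cache, so the I/O rate reported here is optimistic; use a representative data file for a real measurement.

```python
import os
import tempfile
import time


def time_read(path, block_size=1 << 20):
    """Read a file front to back and return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start


def time_work(data):
    """Stand-in for real processing: sum the bytes, return (result, seconds)."""
    start = time.perf_counter()
    result = sum(data)
    return result, time.perf_counter() - start


def measure(path):
    """Return I/O seconds, processing seconds, and the I/O rate in MB/s."""
    nbytes, io_seconds = time_read(path)
    with open(path, "rb") as handle:
        data = handle.read()
    _, cpu_seconds = time_work(data)
    rate = nbytes / 1e6 / io_seconds if io_seconds > 0 else float("inf")
    return {"bytes": nbytes, "io_s": io_seconds, "cpu_s": cpu_seconds, "mb_per_s": rate}


# Hypothetical 4 MB test file; substitute a representative data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    sample_path = tmp.name

report = measure(sample_path)
os.remove(sample_path)
print(report)
```

If io_s dominates cpu_s, the job is I/O-bound and adding processors alone will not help much; if cpu_s dominates, it is a candidate for doing more than one at a time.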
Multitasking vs Multithreading
The simplest parallel processing:
multiple jobs on your own machine
Multitasking
separate programs
independent data
handled by kernel automatically
Multithreading
multiple realizations of the same program
shared memory
independent processing
requires care with memory and messages
programs must be written to use multithreading
[Figure: multitasking (multiple programs) vs multithreading (single program); boxes: read data, chip 1, chip 2, collect results]
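A minimal multithreading sketch, assuming two hypothetical "chips" held as short lists of pixel values. One program starts a thread per chip; the threads share the results dictionary, and the lock is the kind of care with shared memory that multithreading requires. In standard CPython, threads do not speed up CPU-bound work, so this shows the structure only.

```python
import threading

chips = {"chip 1": [1, 2, 3], "chip 2": [4, 5, 6]}  # hypothetical pixel data
results = {}                     # shared memory: visible to every thread
results_lock = threading.Lock()  # guards writes to the shared dictionary


def process_chip(name, pixels):
    total = sum(pixels)          # independent processing
    with results_lock:           # serialize writes to the shared dict
        results[name] = total


threads = [threading.Thread(target=process_chip, args=item) for item in chips.items()]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()                # wait for all threads, then collect results

print(results)
```

The multitasking equivalent needs no such code: run the same program twice, once per chip, and let the kernel schedule the two processes.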
Parallel processing vs Distributed processing
Distributed processing:
multiple jobs which require little or no intercommunication
Data is not shared between distributed jobs
Examples
large number of individual images
hundreds of distinct spectra
data preparation
Parallel processing:
multiple jobs require frequent communication
Data is heavily shared between jobs
Examples
large N-body simulations
full-sky astrometric / photometric analysis
very large matrix inversion
very large FFTs
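The difference shows up in what each job needs to read. In the toy sketch below (hypothetical function names, a one-dimensional softened gravity law with G = 1, all chosen only for illustration), the per-image job touches nothing but its own image, while every body in the N-body step needs the current position of every other body, so the workers would have to exchange data at every step.

```python
def image_mean(pixels):
    """Distributed-style job: needs only its own image."""
    return sum(pixels) / len(pixels)


def nbody_accelerations(positions, masses, softening=0.1):
    """Parallel-style step: each body needs every other body's position."""
    acc = []
    for i, xi in enumerate(positions):
        total = 0.0
        for j, xj in enumerate(positions):
            if i != j:
                dx = xj - xi
                total += masses[j] * dx / (abs(dx) ** 3 + softening)
        acc.append(total)
    return acc


images = [[1, 2, 3], [4, 5, 6]]             # hypothetical independent images
means = [image_mean(img) for img in images]  # each call could run on any machine

acc = nbody_accelerations([0.0, 1.0], [1.0, 1.0])
print(means, acc)
```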
MPI vs PVM
MPI: Message Passing Interface Library
allow distributed processes to
share data
send messages
block for messages
highly efficient for message passing
PVM: Parallel Virtual Machine
also provides a message passing library
includes resource and process control layer
provides a single point for interactions
both require developer to program to the model
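"Programming to the model" means structuring the code around sends and blocking receives. The sketch below is not MPI or PVM: it uses Python's standard queue and threading modules as a stand-in, with hypothetical inbox and outbox queues, to show the send / block-for-message pattern on a single machine.

```python
import queue
import threading

inbox = queue.Queue()   # messages to the worker
outbox = queue.Queue()  # messages back to the sender


def worker():
    while True:
        message = inbox.get()      # block until a message arrives
        if message is None:        # sentinel: no more work
            break
        outbox.put(sum(message))   # send a reply


thread = threading.Thread(target=worker)
thread.start()
inbox.put([1, 2, 3])               # send two work messages
inbox.put([10, 20])
inbox.put(None)                    # tell the worker to stop
thread.join()
replies = [outbox.get(), outbox.get()]
print(replies)
```

With a real message passing library the two ends would be separate processes, possibly on separate machines, but the shape of the code is the same.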
Condor
Layered on top of PVM
Provides management of distributed jobs
jobs don't require recompilation
expects heterogeneous cluster (with machine owners!)
somewhat restrictive on behavior of jobs
PanControl
manages distributed jobs like Condor
manage machines in pool
jobs can request or demand specific machines
simple user interface
interfaces with PanTasks
host add foo
host add bar
job program
job -host foo program
job +host bar program
job +host baz program
for i 0 100
sprintf input chip.%02d.fits
sprintf output chip.%02d.flat
job process $input $output
end
check job 0
stdout job 0
stderr job 1
delete job 5
host off foo
host on foo
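One way to read "request or demand" is sketched below. The semantics are an assumption, not taken from the slides: a job that demands a host waits until that host is idle, while a job that requests a host prefers it but accepts any idle host. The function name pick_host and the mode strings are hypothetical, and the sketch does not say which of the -host and +host forms above corresponds to which mode.

```python
def pick_host(idle_hosts, wanted=None, mode=None):
    """Return a host name for the job, or None if the job must keep waiting.

    mode is "request", "demand", or None (hypothetical names).
    """
    if not idle_hosts:
        return None              # nothing is free: wait
    if wanted in idle_hosts:
        return wanted            # the named host is free: use it
    if mode == "demand":
        return None              # demanded host is busy: wait for it
    return idle_hosts[0]         # request or no preference: take any host


print(pick_host(["foo", "bar"]))
print(pick_host(["foo"], wanted="bar", mode="request"))
print(pick_host(["foo"], wanted="bar", mode="demand"))
```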
PanTasks
Regularly-scheduled task evaluation
Tasks potentially spawn jobs
jobs may be local or parallel
jobs may be targeted to specific machines
task datalist
  command ls /data/foo
  periods -exec 5.0
  periods -timeout 50.0
  periods -poll 1.0
  task.exit 0
    queueprint stdout
    queuedelete stdout
  end
  task.exit 1
    queuepush failure "task failed"
  end
end
task datalist
  periods -exec 5.0
  periods -timeout 50.0
  periods -poll 1.0
  task.exec
    $file = `next.file`
    if ($file == "none")
      break
    end
    command cp /data/foo/$file /data/bar
  end
  task.exit 0
    queueprint stdout
    queuedelete stdout
    queuepush copied $file
  end
  task.exit 1
    queuepush failure $file
  end
end
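The task scripts above pair a command with a handler per exit status. A rough Python analogue of that pattern, for a single execution, might look like the following. It is not PanTasks: run_task, the handler table, and the "timeout" and "default" keys are hypothetical, and the periodic re-execution and polling set by the periods lines are left out.

```python
import subprocess
import sys


def run_task(command, timeout, on_exit):
    """Run command once and dispatch to the handler for its exit status."""
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return on_exit["timeout"](None)
    handler = on_exit.get(done.returncode, on_exit["default"])
    return handler(done)


events = []  # plays the role of the queues the scripts push to
handlers = {
    0: lambda done: events.append(("ok", done.stdout.strip())),
    "timeout": lambda done: events.append(("failure", "timed out")),
    "default": lambda done: events.append(("failure", "task failed")),
}

# Two stand-in commands: one succeeds and prints a file name, one exits with 1.
run_task([sys.executable, "-c", "print('file1.fits')"], timeout=50.0, on_exit=handlers)
run_task([sys.executable, "-c", "import sys; sys.exit(1)"], timeout=50.0, on_exit=handlers)
print(events)
```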
PanTasks - Process Loop / User Interface
User issues commands, loads scripts via readline
Client / Server Model designed, not yet implemented
[Figure: PanTasks process loop: readline (user cmds), CheckTasks, CheckChild, CheckControl (pcontrol server); task queue, job queue]
PanTasks - Task Loop vs Job Loop
Limited number of tasks / jobs per interrupt cycle
maximum of one evaluation per cycle
task / job queues are continuously cycled
[Figure: task loop: run task prep, construct job cmd, submit job, check task timer, next task; job loop: check job timer, check job status, return job results, next job]
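A continuously cycled queue with a per-cycle limit can be sketched with a deque. This is an illustration under assumptions, not the PanTasks code: run_cycle, the limit of three, and the is_ready test are hypothetical. Each cycle examines at most limit entries; entries that are ready leave the queue, and the rest go to the back so that every entry is eventually revisited without any one cycle taking too long.

```python
from collections import deque


def run_cycle(pending, limit, is_ready):
    """Examine at most `limit` entries; ready ones leave the queue,
    the rest go to the back so the queue keeps cycling."""
    finished = []
    for _ in range(min(limit, len(pending))):
        item = pending.popleft()
        if is_ready(item):
            finished.append(item)
        else:
            pending.append(item)
    return finished


jobs = deque(["a", "b", "c", "d", "e"])
ready = {"b", "d"}  # hypothetical: only these jobs have finished
first = run_cycle(jobs, limit=3, is_ready=lambda job: job in ready)
second = run_cycle(jobs, limit=3, is_ready=lambda job: job in ready)
print(first, second, list(jobs))
```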
PanTasks - Pcontrol Host States
Pcontrol monitors hosts
hosts may be added / deleted
"down" hosts are automatically re-attached
ssh communication to the hosts
[Figure: Pcontrol Host Queue States: new, idle, busy, done, down, off, delete; transitions labeled USER: host name, USER: host -on name, USER: host -off name, USER: host -delete name, LOOP: StartJob]
PanTasks - Pcontrol Job States
Pcontrol moves jobs and hosts in parallel
Users may delete pending jobs, kill running jobs, or harvest crash/exit jobs
PanTasks scheduler is normal Pcontrol "user"
[Figure: Pcontrol job states: new, pending, busy, done, exit, crash, hung, delete]
PanTasks - Pcontrol Process Loop / User Interface
User issues commands, loads scripts via readline
very similar to PanTasks scheduler loop
same code
[Figure: Pcontrol Server loop: command src (scheduler), readline, CheckChild for each of three pcontrol clients]
PanTasks - Pclient Process Loop / User Interface
User issues commands
Pclient launches, monitors background child process
reports stdout, stderr, exit status
very simple command set
[Figure: Pclient loop: command src (pcontrol), readline, CheckChild, child process]
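The launch / monitor / report cycle can be sketched with Python's subprocess module. This is not the Pclient implementation: launch and harvest are hypothetical names, and the child here is a one-line Python command chosen so the sketch is self-contained. Polling before reading the pipes is fine for small outputs like this one; a child that writes a lot would need its pipes drained while it runs.

```python
import subprocess
import sys
import time


def launch(command):
    """Start the child in the background and return immediately."""
    return subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )


def harvest(child, poll_interval=0.05):
    """Poll until the child exits, then report stdout, stderr, exit status."""
    while child.poll() is None:      # monitor: has the child finished yet?
        time.sleep(poll_interval)
    out, err = child.communicate()   # collect what the child wrote
    return {"stdout": out, "stderr": err, "status": child.returncode}


child = launch([
    sys.executable, "-c",
    "import sys; print('done'); sys.stderr.write('warn'); sys.exit(3)",
])
report = harvest(child)
print(report)
```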
PanTasks - loading tests
Pclient: demonstrated job rates of 900 per second
Pcontrol: tests to manage 62 nodes, 9:0 jobs per second total rate
PanTasks: schedule / harvest 90 jobs per second
Requirement: FF 640 jobs / 4 seconds :4 per second
IPP computing - distributed processing
classical parallel (e.g. MPI) vs distributed processing
increase total MHz
increase total I/O rate
"targeted" processing
[Figure: OTA Nodes and Sky Nodes on a Gig Switch, with connection to observatory system, metadata db server, pantasks server, DVO server]