Grid Job Management
![Page 1: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/1.jpg)
1
PyCon 2012: Grid Job Management
Felix Lee, ASGC
![Page 2: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/2.jpg)
2
About ASGC
Academia Sinica Grid & Cloud
![Page 3: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/3.jpg)
3
Something we might need to know..
• LHC
• WLCG
• Grid Computing
![Page 4: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/4.jpg)
4
LHC experiment
• LHC: the Large Hadron Collider.
• It was built by the European Organization for Nuclear Research (CERN).
• The tunnel is 27 km in circumference and as deep as 175 m underground.
![Page 5: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/5.jpg)
5
WLCG
• Worldwide LHC Computing Grid.
• It is a distributed computing infrastructure that provides the production and analysis environment for the LHC experiments.
• Currently, there are 11 Tier-1 sites, 140 Tier-2 sites, and several small Tier-3 sites in the world.
• There are 269,299 CPU cores and 183 PB of storage capacity worldwide.
![Page 6: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/6.jpg)
6
Grid Computing
• It is a form of distributed computing.
• It is based on federated resources.
• It connects loosely coupled computers over the Internet to form a virtual supercomputer.
![Page 7: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/7.jpg)
7
What we do
• ASGC has been a WLCG (Worldwide LHC Computing Grid) Tier-1 operation center since 2005.
• ASGC also conducts Asia-Pacific regional e-Science collaborations, development, and infrastructure operation.
• ASGC develops new-generation distributed computing infrastructure and technologies.
![Page 8: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/8.jpg)
8
Python for us
![Page 9: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/9.jpg)
9
Python in WLCG & Grid
• It is widely used for high-level integration.
• Clear code, clear syntax.
• Fully open source.
• Fast and flexible to implement with.
• It is a scripting language.
• No compilation step is needed.
• Plenty of mathematics and science modules.
![Page 10: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/10.jpg)
10
Python in WLCG & Grid
• Workflow & job management.
• Data management.
• Information system.
• Monitoring.
• HEP applications:
• Data processing.
• Data analysis.
![Page 11: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/11.jpg)
11
Computing system in WLCG/Grid
• They are all integrated or implemented in Python.
• WMAgent: Workload Manager Agent.
• CRAB: CMS Remote Analysis Builder.
• PanDA: Production and Distributed Analysis system.
• DIRAC: Distributed Infrastructure with Remote Agent Control.
• AliEn: ALICE Environment.
• DIANE: Distributed Analysis Environment.
![Page 12: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/12.jpg)
12
Python in ASGC
• Workflow & job management:
• GAP 1.0 (based on DIANE).
• PanDA, in collaboration with ATLAS.
• Monitoring and information:
• GStat 2.0, Nagios plugin.
• Integration of Grid & Cloud:
• Virtual worker nodes on demand.
• Virtual machine catalog service.
• Deployment and automation.
![Page 13: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/13.jpg)
13
GStat 2.0
![Page 14: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/14.jpg)
14
PanDA
The Integrated Grid Computing System with Python
![Page 15: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/15.jpg)
15
Work flow & Job management
• A typical Grid workflow
![Page 16: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/16.jpg)
16
PanDA
• PanDA: Production and Distributed Analysis system.
• Designed and developed by the ATLAS experiment.
• It is a data-driven, pull-model computing system.
• It includes workflow, resource matchmaking, and job management.
• We are now working with ATLAS to improve and deploy it for e-Science users.
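The "data-driven, pull-model" idea can be illustrated with a small in-memory sketch. This is not PanDA code: the `Job` and `JobQueue` names, the `pull` method, and the dataset names are all hypothetical, chosen only to show that the central queue never pushes work and that a worker asks for a job whose input data is already at its site.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    input_dataset: str  # dataset the job needs to read


class JobQueue:
    """Hypothetical central queue; it never pushes work to a site."""

    def __init__(self):
        self._pending = deque()

    def submit(self, job):
        self._pending.append(job)

    def pull(self, local_datasets):
        """Return the oldest pending job whose input is already at the caller's site."""
        for job in self._pending:
            if job.input_dataset in local_datasets:
                self._pending.remove(job)  # safe: we return right after removing
                return job
        return None  # nothing suitable; the worker simply asks again later


queue = JobQueue()
queue.submit(Job(1, "dataset_A"))
queue.submit(Job(2, "dataset_B"))
queue.submit(Job(3, "dataset_A"))

# A worker at a site that only holds dataset_B pulls work for itself.
first = queue.pull({"dataset_B"})
second = queue.pull({"dataset_B"})
print(first, second)
```

In this sketch the matchmaking is reduced to a set membership test; a real system would also weigh site load, priorities, and software availability.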
![Page 17: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/17.jpg)
17
PanDA diagram
![Page 18: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/18.jpg)
18
PanDA Server
• PanDA server design
• Apache-based
• Communication via HTTP/HTTPS
• Multi-process
• Global information is kept in a memory-resident database
[Diagram labels: Client, Apache, child process, Python interpreter (two instances), DB, DQ2; connections labeled HTTP/HTTPS and MySQL API]
![Page 19: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/19.jpg)
19
PanDA Client
• PanDA client
• Uses Python's pickle module and native curl.
• The client requires Python 2.3 or higher, curl, and grid-proxy.
• Simple and lightweight.
[Diagram labels: Client, UserIF, PanDA, mod_python, mod_deflate, Python objects; steps labeled serialize (cPickle), Request (HTTPS), deserialize (cPickle), Response (HTTPS)]
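The serialize, send, deserialize round trip on this slide can be sketched without any network. The code below is an illustration under stated assumptions, not the PanDA client: `encode`, `decode`, and `fake_server` are hypothetical names, `zlib` stands in for the compression that mod_deflate would apply, and the HTTPS transport via curl and a grid proxy is replaced by a plain function call. It uses the Python 3 `pickle` module; `cPickle` was its Python 2 counterpart.

```python
import pickle
import zlib


def encode(obj):
    """Serialize a Python object and compress it for transport."""
    return zlib.compress(pickle.dumps(obj, protocol=2))


def decode(payload):
    """Reverse of encode(). Only safe on data from a trusted peer."""
    return pickle.loads(zlib.decompress(payload))


def fake_server(request_body):
    """Hypothetical handler standing in for the server side."""
    request = decode(request_body)
    reply = {"status": "accepted", "n_jobs": len(request["jobs"])}
    return encode(reply)


# Client side: build a request object, "send" it, and read the reply object.
request = {"user": "alice", "jobs": [{"script": "analysis.py", "events": 1000}]}
response = decode(fake_server(encode(request)))
print(response)
```

One design caveat worth keeping in mind: unpickling can execute arbitrary code, so this pattern only makes sense when both ends are authenticated and trusted, which is presumably part of what the grid proxy and HTTPS provide in the setup described here.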
![Page 20: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/20.jpg)
20
PanDA screen shot
![Page 21: Grid Job Management](https://reader033.fdocuments.in/reader033/viewer/2022060110/555a70f4d8b42ae7218b5346/html5/thumbnails/21.jpg)
Thanks for your [email protected]