Dynamic Data Center concept

Dynamically Creating Big Data Centers for the LHC Frank Würthwein Professor of Physics University of California San Diego September 25th, 2013

description

Presentation by Dr. Frank Würthwein of the University of California, San Diego, at the International Supercomputing Conference on Big Data (ISC Big Data), 2013. Until recently, the large CERN experiments ATLAS and CMS owned and controlled the computing infrastructure they operated on in the US, and accessed data only when it was locally available on the hardware they operated. However, Würthwein explains, with data-taking rates set to increase dramatically by the end of Long Shutdown 1 (LS1) in 2015, the current operational model can no longer satisfy peak processing needs. Instead, he argues, large-scale processing centers need to be created dynamically to cope with spikes in demand. To this end, Würthwein and colleagues carried out a successful proof-of-concept study in which the Gordon Supercomputer at the San Diego Supercomputer Center was dynamically and seamlessly integrated into the CMS production system to process a 125-terabyte data set.

Transcript of Dynamic Data Center concept

Page 1: Dynamic Data Center concept

Dynamically Creating Big Data Centers for the LHC

Frank Würthwein

Professor of Physics

University of California San Diego September 25th, 2013

Page 2: Dynamic Data Center concept

Outline

•  The Science
•  Software & Computing Challenges
•  Present Solutions
•  Future Solutions

September 25th 2013 Frank Wurthwein - ISC Big Data 2

Page 3: Dynamic Data Center concept

The Science

Page 4: Dynamic Data Center concept

~67% of energy is “dark energy”

~29% of matter is “dark matter”

All of what we know makes up only about 4% of the universe.

We have some ideas but no proof of what this is!

We have no clue what this is.

The Universe is a strange place!


Page 5: Dynamic Data Center concept

To study Dark Matter we need to create it in the laboratory

[Figure: the LHC site, labeled with Mont Blanc, Lake Geneva, and the four experiments ALICE, ATLAS, LHCb, and CMS]

Page 6: Dynamic Data Center concept
Page 7: Dynamic Data Center concept

“Big bang” in the laboratory
•  We gain insight by colliding particles at the highest energies possible to measure:
–  Production rates
–  Masses & lifetimes
–  Decay rates
•  From this we derive the “spectroscopy” as well as the “dynamics” of elementary particles.
•  Progress is made by going to higher energies and brighter beams.

Page 8: Dynamic Data Center concept

Explore Nature over 15 Orders of magnitude
Perfect agreement between Theory & Experiment

[Figure: CMS Preliminary, pp collisions at √s = 8 TeV. Double-differential jet cross section d²σ/(dp_T dy) in pb/(GeV/c), on a log scale from 10^-5 to 10^13, versus jet p_T from 30 to 2000 GeV/c. Data are shown in rapidity bins 0.0 < |y| < 0.5 (×10^5), 0.5 < |y| < 1.0 (×10^4), 1.0 < |y| < 1.5 (×10^3), 1.5 < |y| < 2.0 (×10^2), 2.0 < |y| < 2.5 (×10^1), 2.5 < |y| < 3.0 (×10^0), and 3.2 < |y| < 4.7 (×10^-1), and compared with NNPDF 2.1 NLO ⊗ NP. Open markers: L_int = 5.8 pb^-1 (low PU runs); filled markers: L_int = 10.71 fb^-1 (high PU runs).]

Dark Matter expected somewhere below this line.

Page 9: Dynamic Data Center concept

And for the Sci-Fi Buffs …

Imagine our 3D world to be confined to a 3D surface in a 4D universe.

Imagine this surface to be curved such that the 4th D distance is short for locations light years away in 3D.

Imagine space travel by tunneling through the 4th D.

The LHC is searching for evidence of a 4th dimension of space.


Page 10: Dynamic Data Center concept

Recap so far …

•  The beams cross in the ATLAS and CMS detectors at a rate of 20 MHz
•  Each crossing contains ~10 collisions
•  We are looking for rare events that are expected to occur in roughly 1 in 10^13 (1/10,000,000,000,000) collisions, or less.
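To get a feel for these numbers, here is a minimal back-of-the-envelope sketch that multiplies the three figures quoted on the slide. It assumes, purely for illustration, that beams collide continuously for a full 24-hour day; the function and constant names are hypothetical.

```python
# Back-of-the-envelope estimate built only from the numbers on the slide.
CROSSING_RATE_HZ = 20e6        # beam crossings per second (slide)
COLLISIONS_PER_CROSSING = 10   # approximate collisions per crossing (slide)
RARE_EVENT_FRACTION = 1e-13    # roughly 1 in 10^13 collisions (slide)
SECONDS_PER_DAY = 86_400       # assumption: uninterrupted running all day


def collisions_per_second(crossing_rate_hz=CROSSING_RATE_HZ,
                          per_crossing=COLLISIONS_PER_CROSSING):
    """Total collision rate seen by one detector."""
    return crossing_rate_hz * per_crossing


def rare_events_per_day(fraction=RARE_EVENT_FRACTION):
    """Expected number of rare events in one day of continuous running."""
    return collisions_per_second() * fraction * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"collisions per second: {collisions_per_second():.1e}")
    print(f"rare events per day:   {rare_events_per_day():.2f}")
```

Under these assumptions the detector sees about 2 x 10^8 collisions per second but fewer than two of the sought-after events per day, which is why aggressive filtering and very large data sets are both needed.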


Page 11: Dynamic Data Center concept

Software & Computing Challenges

Page 12: Dynamic Data Center concept

The CMS Experiment

Page 13: Dynamic Data Center concept

The CMS Experiment
•  80 Million electronic channels
   x 4 bytes
   x 40 MHz
   -----------------------
   ~ 10 Petabytes/sec of information
   x 1/1000 zero-suppression
   x 1/100,000 online event filtering
   ------------------------
   ~ 100-1000 Megabytes/sec raw data to tape
   1 to 10 Petabytes of raw data per year written to tape, not counting simulations.
•  2000 Scientists (1200 Ph.D. in physics)
–  ~ 180 Institutions
–  ~ 40 countries
•  12,500 tons, 21m long, 16m diameter
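The data-rate chain on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's factors, plus one labeled assumption that is not on the slide: about 10^7 seconds of data taking per year, used here only to illustrate how a per-second rate turns into a yearly volume. All names are hypothetical.

```python
# Reproduce the data-rate chain from the slide.
CHANNELS = 80e6                 # electronic channels (slide)
BYTES_PER_CHANNEL = 4           # bytes per channel per crossing (slide)
CROSSING_RATE_HZ = 40e6         # crossings per second (slide)
ZERO_SUPPRESSION = 1 / 1000     # reduction factor (slide)
ONLINE_FILTER = 1 / 100_000     # online event filtering factor (slide)
ASSUMED_LIVE_SECONDS = 1e7      # ASSUMPTION: data-taking seconds per year


def detector_rate_bytes_per_s():
    """Information produced by the detector before any reduction."""
    return CHANNELS * BYTES_PER_CHANNEL * CROSSING_RATE_HZ


def tape_rate_bytes_per_s():
    """Raw data rate to tape after zero-suppression and online filtering."""
    return detector_rate_bytes_per_s() * ZERO_SUPPRESSION * ONLINE_FILTER


def tape_volume_bytes_per_year(live_seconds=ASSUMED_LIVE_SECONDS):
    """Yearly raw data volume for the assumed number of live seconds."""
    return tape_rate_bytes_per_s() * live_seconds


if __name__ == "__main__":
    print(f"detector: {detector_rate_bytes_per_s() / 1e15:.1f} PB/s")
    print(f"to tape:  {tape_rate_bytes_per_s() / 1e6:.0f} MB/s")
    print(f"per year: {tape_volume_bytes_per_year() / 1e15:.2f} PB")
```

The results (about 13 PB/s off the detector, about 130 MB/s to tape, and on the order of a petabyte per year under the assumed live time) are consistent with the rounded figures on the slide.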


Page 14: Dynamic Data Center concept

Active Scientists in CMS


In Figure 9, the Tier-2 CPU planning, broken down into all contributing workflows, and the corresponding measured CPU utilization are overlaid, showing good overall agreement; the Tier-2 CPU pledge is explicitly shown (orange line). It should be noted that the utilization curve (in red) does not include the CERN contribution (used as analysis facility since LS1). The slight deficit in pledge utilization at Tier-2 centers in the first half of 2013 is mainly due to the lack of simulation requests, yet the level of Tier-2 usage for data analysis stayed high after the end of LHC Run 1.

The Tier-2 sites continue to be very successfully used for analysis and have been the primary analysis resource. The number of individual submitters per week submitting jobs to the Tier-2 sites with the CMS CRAB tool is shown in Figure 10. The main dips are explained by the CERN Christmas breaks, while the main peaks appear during preparation periods for summer and winter conferences.

Figure 10: Individual analysis submitters per week to the grid from Sep. 2009 to Aug. 2013.

The average total number of individual submitters per month since the beginning of 2013 reaches 540, which means that in a typical 30-day period around 18% of the collaboration has submitted a grid job. This is only a 10% decrease in the number of active users compared to the LHC running period, showing that user activity on the distributed GRID facilities is largely decoupled from the actual data taking.

The average number of job slots used at Tier-2 sites since the beginning of 2013 was 37 K, as shown on the left-hand side of Figure 11. This is a 12% increase compared to 2012; however, given the 23% pledge increase over the same period, the overall pledge utilization so far in 2013 has been smaller than in 2012, as confirmed by Figure 8. The right-hand side of Figure 11 shows the completed analysis jobs at Tier-2 in the first half of 2013, with a measured average of ~1.4 M jobs per week (200 K jobs per day).

5-40% of the scientific members are actively doing large scale data analysis in any given week.

~1/4 of the collaboration, scientists and engineers, contributed to the common source code of ~3.6M C++ SLOC.

Page 15: Dynamic Data Center concept

Evolution of LHC Science Program

Event rate written to tape: 150 Hz, 1000 Hz, 10000 Hz

[Figure: LHC Roadmap, dated September 3, 2013. Labels include L_int ~ 75-100 fb^-1, "Physics case", and "Upgrade detector design".]

Page 16: Dynamic Data Center concept

The Challenge
How do we organize the processing of 10’s to 1000’s of Petabytes of data by a globally distributed community of scientists, and do so with manageable “change costs” for the next 20 years?

Guiding Principles for Solutions
Choose technical solutions that allow computing resources to be as distributed as human resources.
Support distributed ownership and control, within a global single sign-on security context.
Design for heterogeneity and adaptability.

Page 17: Dynamic Data Center concept

Present Solutions

Page 18: Dynamic Data Center concept


Federation of National Infrastructures. In the U.S.A.: Open Science Grid

Page 19: Dynamic Data Center concept

Among the top 500 supercomputers there are only two that are bigger when measured by power consumption.

Page 20: Dynamic Data Center concept

Tier-3 Centers
•  Locally controlled resources not pledged to any of the 4 collaborations.
–  Large clusters at major research Universities that are time shared.
–  Small clusters inside departments and individual research groups.
•  Requires global sign-on system to be open for dynamically adding resources.
–  Easy to support APIs
–  Easy to work around unsupported APIs

Page 21: Dynamic Data Center concept

Me -- My friends -- The grid/cloud

[Diagram labels]
•  Me: thin client
•  My friends: thick VO middleware & support
•  The anonymous Grid or Cloud: thin “Grid API”
•  Scale: O(10^4) users, O(10^2-3) sites, O(10^1-2) VOs
•  Layers are marked as either domain science specific or common to all sciences and industry

Page 22: Dynamic Data Center concept

“My Friends” Services

•  Dynamic Resource provisioning
•  Workload management
–  schedule resource, establish runtime environment, execute workload, handle results, clean up
•  Data distribution and access
–  Input, output, and relevant metadata
•  File catalogue

Page 23: Dynamic Data Center concept

Optimize Data Structure for Partial Reads

An Event has many collections of objects with many components.
Track collection: { Track1, Track2, … TrackNt }, a few 100s per event
Jet collection: { Jet1, Jet2, … JetNj }, a few 10s per event
Electron collection: { Electron1, … }, a couple, if at all

[Diagram: Clusters of Events, each made of “Baskets”]
Cluster 1: Track pT for events 1 to E1 | Track [symbol illegible], ev. 1 to E1 | Track pz, ev. 1 to E1 | Jet Ehadronic, ev. 1 to E1 | Electron EEM, ev. 1 to E1 | …
Cluster 2: Track pT for events E1 to E2 | Track [symbol illegible], ev. E1 to E2 | Track pz, ev. E1 to E2 | Jet Ehadronic, ev. E1 to E2 | Electron EEM, ev. E1 to E2 | …

•  Each “basket” compressed separately => optimized to efficiently read:
–  partial event, e.g., only Tracks
–  partial object, e.g., only Electron EEM to decide if the event is interesting at all
•  For CMS, cluster size on disk is [?]-20 MB or 10-[?]0 events (two digits are illegible in the source)
•  Total file size from 100 MB to 10 GB
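The idea of a layout optimized for partial reads can be illustrated with a toy column-wise store: events are grouped into clusters, and within each cluster every quantity is serialized and compressed on its own, so reading one quantity touches only its own blocks. This is a minimal sketch under stated assumptions, not the actual CMS or ROOT file format; the JSON-plus-zlib encoding, the function names, and the field names (`track_pt`, `jet_e`, `electron_e_em`) are all hypothetical.

```python
import json
import zlib


def write_clusters(events, cluster_size):
    """Group events into clusters; compress each column of a cluster separately."""
    clusters = []
    for start in range(0, len(events), cluster_size):
        chunk = events[start:start + cluster_size]
        baskets = {}
        for name in chunk[0]:  # every event is assumed to have the same keys
            column = [event[name] for event in chunk]
            baskets[name] = zlib.compress(json.dumps(column).encode("utf-8"))
        clusters.append(baskets)
    return clusters


def read_column(clusters, name):
    """Read one quantity for all events, decompressing only its own blocks."""
    values, bytes_read = [], 0
    for baskets in clusters:
        blob = baskets[name]
        bytes_read += len(blob)
        values.extend(json.loads(zlib.decompress(blob).decode("utf-8")))
    return values, bytes_read


def total_size(clusters):
    """Total compressed size of all blocks in all clusters."""
    return sum(len(blob) for baskets in clusters for blob in baskets.values())


def make_events(n):
    """Synthetic events: many tracks, a few jets, rarely an electron."""
    return [
        {
            "track_pt": [round(0.5 * i + j, 1) for j in range(20)],
            "jet_e": [i + j for j in range(3)],
            "electron_e_em": [float(i)] if i % 10 == 0 else [],
        }
        for i in range(n)
    ]


if __name__ == "__main__":
    clusters = write_clusters(make_events(100), cluster_size=25)
    _, touched = read_column(clusters, "electron_e_em")
    print(f"read {touched} of {total_size(clusters)} compressed bytes")
```

Selecting events on the electron quantity alone reads only a small share of the stored bytes, which is the behavior the read-fraction measurements in this talk point to.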


Page 24: Dynamic Data Center concept

Fraction of a file that is read

[Figure: histogram of the number of files read (N, log scale from 10^4 to 10^7) versus the fraction of the file read (0 to 1), with an overflow bin and a marker at 20%.]

For vast majority of files, less than 20% of the file is read.

Average 20-35%, median 3-7% (depending on type of file)

Page 25: Dynamic Data Center concept

Future Solutions

Page 26: Dynamic Data Center concept

From present to future
•  Initially, we operated a largely static system.
–  Data was placed quasi-statically before it could be analyzed.
–  Analysis centers have contractual agreements with the collaboration.
–  All reconstruction is done at centers with custodial archives.
•  Increasingly, we have too much data to afford this.
–  Dynamic data placement
   •  Data is placed at T2s based on job backlog in global queues.
–  WAN access: “Any Data, Anytime, Anywhere”
   •  Jobs are started on the same continent as the data instead of the same cluster attached to the data.
–  Dynamic creation of data processing centers
   •  Tier-1 hardware bought to satisfy steady state needs instead of peak needs.
   •  Primary processing as data comes off the detector => steady state
   •  Annual Reprocessing of accumulated data => peak needs

Page 27: Dynamic Data Center concept

Any Data, Anytime, Anywhere

[Diagram: a User Application asks the Global Xrootd Redirector (Xrootd + Cmsd) "Q: Open /store/foo" and receives "A: Check Site A". It then asks Site A "Q: Open /store/foo" and receives "A: Success!". Sites A, B, and C each run Xrootd and Cmsd in front of their local storage: Lustre Storage, Hadoop Storage, and dCache Storage.]

Global redirection system to unify all CMS data into one globally accessible namespace.

This is made possible by paying careful attention to the IO layer to avoid inefficiencies due to IO-related latencies.
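The redirection exchange on this slide (ask the global redirector, get pointed to a site, then open the file there) can be modeled with a toy lookup. This is only an illustrative sketch of the two-step flow; it does not use or imitate the real XRootD protocol or client API, and the `Site`, `Redirector`, and `open_anywhere` names are hypothetical.

```python
class Site:
    """A storage site holding a set of logical file names."""

    def __init__(self, name, files):
        self.name = name
        self.files = set(files)

    def has(self, path):
        return path in self.files

    def open(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        return f"{self.name}:{path}"  # stand-in for a real file handle


class Redirector:
    """Knows which sites exist and answers 'where is this file?'."""

    def __init__(self, sites):
        self.sites = list(sites)

    def locate(self, path):
        for site in self.sites:
            if site.has(path):
                return site  # corresponds to "A: Check Site A"
        raise FileNotFoundError(path)


def open_anywhere(redirector, path):
    """Client view: one global namespace, regardless of where the file lives."""
    site = redirector.locate(path)  # step 1: ask the redirector
    return site.open(path)          # step 2: open at the site it named


if __name__ == "__main__":
    sites = [
        Site("Site A", ["/store/foo"]),
        Site("Site B", ["/store/bar"]),
        Site("Site C", []),
    ]
    print(open_anywhere(Redirector(sites), "/store/foo"))
```

The client never needs to know which storage technology sits behind a site; it only sees the shared path namespace.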

Page 28: Dynamic Data Center concept

[Diagram: Tape Archive @ FNAL; Tier-2 Centers @ OSG; Steady State Processing @ FNAL; Peak Processing @ SDSC; Cloud and/or OSG Resources; Simulated Data]

Vision going forward

Implemented vision for 1st time in Spring 2013 using Gordon Supercomputer at SDSC.


Page 29: Dynamic Data Center concept

SAN DIEGO SUPERCOMPUTER CENTER

Gordon Overview

•  3D Torus
•  Dual rail QDR
•  64, 2S Westmere I/O nodes
–  12 core, 48 GB/node
–  4 LSI controllers
–  16 SSDs
–  Dual 10GbE
–  SuperMicro mobo
–  PCI Gen2
•  300 GB Intel 710 eMLC SSDs
•  300 TB aggregate
•  1,024 2S Xeon E5 (Sandy Bridge) nodes
–  16 cores, 64 GB/node
–  Intel Jefferson Pass mobo
–  PCI Gen3
•  Large Memory vSMP Supernodes
–  2TB DRAM
–  10 TB Flash
•  “Data Oasis” Lustre PFS 100 GB/sec, 4 PB

[Embedded title slide: "Using Gordon to Accelerate LHC Science", Rick Wagner (San Diego Supercomputer Center) and Brian Bockelman (University of Nebraska-Lincoln), XSEDE 13, July 22-25, 2013, San Diego, CA]

Page 30: Dynamic Data Center concept

CMS “My Friends” Stack

Job environment
•  CMSSW release environment
–  NFS exported from Gordon IO nodes
–  Future: CernVM-FS via Squid caches
   •  J. Blomer et al.; 2012 J. Phys.: Conf. Ser. 396 052013
•  Security Context (CA certs, CRLs) via OSG worker node client
•  CMS calibration data access via FroNTier
   •  B. Blumenfeld et al.; 2008 J. Phys.: Conf. Ser. 119 072007
–  Squid caches installed on Gordon IO nodes

Data and Job handling
•  glideinWMS
   •  I. Sfiligoi et al.; doi:10.1109/CSIE.2009.950
–  Implements “late binding” provisioning of CPU and job scheduling
–  Submits pilots to Gordon via BOSCO (GSI-SSH)
•  WMAgent to manage CMS workloads
•  PhEDEx data transfer management
–  Uses SRM and gridftp
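"Late binding" means a job is tied to a CPU only after a pilot is already running on that resource and asks for work, rather than being assigned to a site at submission time. The sketch below is a toy model of that idea only; it is not glideinWMS code or its API, and the pilot names, capacities, and the `late_binding` function are hypothetical.

```python
from collections import deque


def late_binding(jobs, pilots):
    """Toy late-binding scheduler.

    jobs   -- iterable of job identifiers waiting in one central queue
    pilots -- dict mapping the name of a pilot that has actually started
              to the number of jobs it can still run before it exits
    Returns (assignment, leftover): which pilot pulled which jobs, and the
    jobs still waiting centrally for future pilots.
    """
    queue = deque(jobs)
    remaining = dict(pilots)
    assignment = {name: [] for name in remaining}
    # Jobs stay in the central queue until a running pilot pulls one.
    while queue and any(remaining.values()):
        for name in remaining:
            if queue and remaining[name] > 0:
                assignment[name].append(queue.popleft())
                remaining[name] -= 1
    return assignment, list(queue)


if __name__ == "__main__":
    jobs = [f"job-{i}" for i in range(1, 6)]
    pilots = {"pilot-1": 2, "pilot-2": 1}  # hypothetical pilots
    assignment, leftover = late_binding(jobs, pilots)
    print(assignment)
    print("still queued:", leftover)
```

Because no job is committed to a resource before a pilot has started there, a pilot that never starts or dies early costs nothing but the pilot itself, which suits opportunistic resources that appear on short notice.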

Page 31: Dynamic Data Center concept


This is clearly mighty complex !!!

So let’s focus only on the parts that are specific to incorporating Gordon as a dynamic data processing center.

Page 32: Dynamic Data Center concept

Items in red were deployed/modified to incorporate Gordon

•  Minor mod of PhEDEx config file
•  Deploy Squid
•  Export CMSSW & WN client

Page 33: Dynamic Data Center concept

Gordon Results

•  Work completed in February/March 2013 as a result of a “lunch conversation” between SDSC & US-CMS management
–  Dynamically responding to an opportunity
•  400 Million RAW events processed
–  125 TB in and ~150 TB out
–  ~2 Million core hours of processing
•  Extremely useful both for science results and as a proof of principle in software & computing.
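The headline numbers can be turned into per-event figures with simple division. The sketch below uses only the totals on this slide and assumes decimal units (1 TB = 10^12 bytes); the names are hypothetical and the results are rough averages, not measured per-event values.

```python
# Per-event averages derived from the totals quoted on the slide.
EVENTS = 400e6          # RAW events processed (slide)
INPUT_BYTES = 125e12    # 125 TB in (slide; decimal TB assumed)
OUTPUT_BYTES = 150e12   # ~150 TB out (slide; decimal TB assumed)
CORE_HOURS = 2e6        # ~2 million core hours (slide)


def per_event_averages():
    """Average input size, output size, and CPU time per event."""
    return {
        "input_kB": INPUT_BYTES / EVENTS / 1e3,
        "output_kB": OUTPUT_BYTES / EVENTS / 1e3,
        "core_seconds": CORE_HOURS * 3600 / EVENTS,
    }


if __name__ == "__main__":
    for key, value in per_event_averages().items():
        print(f"{key}: {value:.1f}")
```

On average this comes to a few hundred kilobytes read and written per event and on the order of tens of core-seconds of processing per event.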


Page 34: Dynamic Data Center concept

Summary & Conclusions

•  Guided by the principles:
–  Support distributed ownership and control in a global single sign-on security context.
–  Design for heterogeneity and adaptability
•  The LHC experiments very successfully developed and implemented a set of new concepts to deal with BigData.

Page 35: Dynamic Data Center concept

Outlook
•  The LHC experiments had to largely invent an island of BigData technologies with limited interactions with industry and other domain sciences.
•  Is it worth building bridges to other islands?
–  IO stack and HDF5?
–  MapReduce?
–  What else?
•  Is there a mainland emerging that is not just another island?