Grand Challenge and PHENIX Report post-MDC2 studies of GC software –feasibility for day-1...

Grand Challenge and PHENIX

• Report post-MDC2 studies of GC software– feasibility for day-1 expectations of data model

– simple robustness tests

– Comparisons to data carousel … not yet

• GC meeting highlights, plans

• Possible additions/enhancements of GC software– HPSS savvy

– Distributed processing???

Cache use studies– Single long processing

time query• ~0.2MB/s

• Cache kept near 1GB

• processing continuous

GC Test conditions: Singles• Single Queries

– 100 events/file

– 100-200 MB/file

– Varying• cache size

• processing time

• query size

• HPSS status– 1 tape drive

– fast purge policy (5 minutes)• to isolate GC capabilities

– pftp of files generally from 4-8 MB/s• though total rate closer to 1-3 MB/s

– HPSS savvy should improve by factor of 2-3!

Query Size of Query Cache Processing Tape drives Total time (s) RateGB # files (GB) time/event (s) (MB/s)

rndm<1.0 37.44 188 2.0 0.001 1 28,260 1.36rndm<0.2 7.58 37 1.0 1 1 4,772 1.59rndm<0.2 7.58 37 1.0 5 1 37,000 0.20rndm<0.2 7.58 37 1.0 0.1 1 2,045 3.70rndm<0.2 7.58 37 3.0 0.1 1 1,818 4.17rndm<0.2 7.58 37 3.0 0.1 1 1,437 5.27

Multiple queries

• single query – 0.0<rndm<0.2

• time ~ 2000s

• overlapping queries:– 0.0<rndm<0.2

– 0.1<rndm<0.3• time ~ 3000s

• overlapping files staged to disk first

Test conditions: Doubles

• Double Queries– identical queries

• at same time

• delayed

• with different processing times

– overlapping queries

Query 1 Query 2 Size of Query 1 Size of Query 2 Cache Processing Processing Total time (s)GB # files GB # files (GB) time/event (s) time/event (s)

Query 1 Query 2

rndm<0.2 rndm<0.2 7.58 37 7.58 37 3.0 0.1 0.1 2,076rndm<0.2 sleep 1000 rndm<0.2 7.58 37 7.58 37 3.0 0.1 0.1 2,178rndm<0.2 0.1<rndm<0.3 7.58 37 7.61 37 3.0 0.1 0.1 2,925

Robustness

• Start query– normal 8 GB/37 files– ctrl-C after first 2 files

• 3rd file staged to cache

• stops cleanly

– start identical query– ctrl-C at 14 files

• 15th file staged to cache

• stops cleanly

– different query– etc. etc.– Very robust!– Troubles only when Objectivity

lockserver fails!!

GC meeting highlights

• Post MDC2 the GC commandeered HPSS and performed some tests:– robustness, correctness, tape drive dependencies, 1 P.I.P. link to user

code, etc.

• CAS plans. How does this affect/change GC capabilities.

• Interface of GC with physics analyses.

• Scalability issues -- tests to commence in July– 1000’s of files, 10,000 events/file, 5 components, 2TB total

• Generic interfaces– quest to make the GCA as independent of our specific problem as

possible -- usability for other HEP expts, climate modeling, combustion studies, etc.

First year offline configuration

• As best I can tell fundamentally different configuration than STAR:– Day1 (many sub-detectors):

• ~10 Detector sub-groups with their own files for calibration purposes

– Year 1 (many physics analyses): • 60 X rate of STAR -> smaller DST events (~100 kB/event)

– no physical separation of events into components (maybe hits???)» single component at least for year 1 (caveat on later slides)

• couple thousand events/file

• since any physics analysis will in general cut no tighter than 1%, unless we filter events to separate files according to physics cuts, every major physics analysis query will want every file.

– Prefiltering adds excessive complications and the need to correct for biases, etc.» possible exception: centrality presorting -- 4-5 bins» according to trigger conditions, detector configuration, date, etc.

Projected use of G.C.

• Intimately related to expectations of CAS machines– Day1 (many sub-detectors):

• Separate calibration/special run files

• Separate machines

• Different instances of G.C./D.C.

• Separate cache, etc.

– Year 1 (Physics Analyses): • some small/separate analyses

– on CAS machines/server

– but usually on micro-DST’s

• A few large jobs: need all/most files

• Running on CAS/server?

• Possibility of cache over several disks?– Distributed processing

– 10’s GB each

– Data spread out -- send analysis code to machine

HPSS

CAS

CAS

CAS CAS

HPSS

MultiCPU

BIGDISK

Partial Query Biases

• Possible troubles with partial queries if they introduce a physics bias

• However, only a problem if we presort our data in files according to physics signals

• May presort according to centrality in day-1/year-1.

Components?

• A couple possible ways to separate events into components

• Problems?– each component corresponds

to a separate file on tape• too many tape mounts?

Event

Hits TracksRaw Global

Event

Raw mDST1DST mDST2

Objectivity Woes

• Surgically remove Objectivity• “We strongly recommend against using an ODBMS in those applications that are

handled perfectly well with relational database(s)” -- Choosing an Object Database, Objectivity

– Lockserver problems• restarting often

– Movement of disk to rnfs01• rebuilding of objy tagDB

– Possible alternatives?• Root?

– Robustness: multiple accesses » each node of the farm and CAS machines accessing same file» layer between reconstruction nodes and DB?

– Scalability:» 100 GB tagDB? -- chain of files?

Carousel

“ORNL”software

carouselserver

mySQLdatabase

filelist

HPSStape

HPSScache

pftp

rmds03carousel

client

pftp CAS

rmds01 rmds02

“stateless”

Data Carousel: “Strip Mining”

• Written and tested by Jerome Lauret and Morrison

– layer in front of ORNL batch transfer software

• mostly in perl• administration• organization• throttling• integrate multiple user requests

for specific files (not events)• maintain disk FIFO staging area• 5-7 MB/s integrated rate!!

– G.C. is near 1-3 MB/s» tape optimization should

clean that up

• Missing:– Robust disk cache

– Interface to physics analyses

– error checking

– etc.

Interface with Physics Analysis

• Offline code:– DST’s written in ROOT

– tagDB also in ROOT (probably)• not OBJY -- no better alternative

• Interface to GC– return

• file number/event number

– capability to run query at root prompt?

– Possibility to return list of events when a bundle is cached?• To keep continual interaction with GC to a minimum

Grand Challenge and PHENIX Report post-MDC2 studies of GC software –feasibility for day-1...

Documents

Transcript of Grand Challenge and PHENIX Report post-MDC2 studies of GC software –feasibility for day-1...