HDF5 at the European XFEL
Presenter: Dr. Gero Flucke, Control and Analysis Software Group, European XFEL GmbH
Main author: Djelloul Boukhelef, IT and Data Management Group, European XFEL GmbH
HDF5 Workshop @ ICALEPCS 2017, Barcelona, October 8th, 2017
2HDF5 at the European XFEL Dr. Gero Flucke for Djelloul Boukhelef, HDF5 Workshop @ ICALEPCS 2017
Outline
Introduction: The European XFEL
Data acquisition - integrated into Karabo control system
Data organisation: file naming and data structure
HDF5 hardware compression studies
European XFEL: the most brilliant X-ray Free Electron Laser
Linear electron accelerator: 2.1 km, starts at DESY
Magnetic structures (undulators): stimulate coherent photon emission
Photon optics: transport photon beam to six instruments
XFEL Instruments
FXE: Femtosecond X-ray Experiments *)
SCS: Spectroscopy & Coherent Scattering
SPB/SFX: Single Particles, Clusters and Biomolecules *)
SQS: Small Quantum Systems
HED: High Energy Density Science
MID: Materials Imaging and Dynamics
*) “User assisted commissioning” since Sept. 14, 2017
European XFEL: Photons come in trains with short pulses
Pattern:
Trains at 10 Hz: “storage unit” for data acquisition
Up to 2700 pulses per train, pulse duration < 100 fs, 220 ns spacing (4.5 MHz) => train length 600 µs
On average: up to 27 kHz of pulses
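The pulse pattern figures can be checked with a few lines of arithmetic. The sketch below is only an illustration; the constant and variable names are not taken from the slides.

```python
# Pulse pattern figures quoted on the slide.
TRAIN_RATE_HZ = 10            # trains per second
MAX_PULSES_PER_TRAIN = 2700   # upper limit per train
PULSE_SPACING_S = 220e-9      # 220 ns between pulses

# Intra-train repetition rate: 1 / 220 ns, i.e. roughly 4.5 MHz.
intra_train_rate_hz = 1 / PULSE_SPACING_S

# Train length: 2700 pulses x 220 ns = 594 µs, quoted as 600 µs.
train_length_s = MAX_PULSES_PER_TRAIN * PULSE_SPACING_S

# Average pulse rate: 2700 pulses x 10 trains per second = 27 kHz.
average_pulse_rate_hz = MAX_PULSES_PER_TRAIN * TRAIN_RATE_HZ

print(f"intra-train rate: {intra_train_rate_hz / 1e6:.2f} MHz")
print(f"train length:     {train_length_s * 1e6:.0f} µs")
print(f"average rate:     {average_pulse_rate_hz / 1e3:.0f} kHz")
```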
Karabo: Experiment Control, Data Acquisition and (Fast-Feedback) Analysis
Karabo Framework:
Tight integration of applications for
Experiment control
Data Acquisition
Data Management
Data Analysis
DAQ: data readout, online processing, quality monitoring (vetoing)
DA: processing pipelines, distributed and GPU computing, specific algorithms (e.g. reconstruction)
Control: drive hardware and complex experiments, monitor variables & trigger alarms
DM: storage of experiment & control data, data access, authentication, authorization etc.
set up computation & show scientific results
allow some control & show hardware status
show online data whilst running
[Diagram: Accelerator, Undulator, Beam Transport; Control, DAQ, DM, DA]
Message Broker
Karabo Communication and Data Flow
Karabo Framework:
Broker based communication
Point-to-point shortcut possible
Data pipelines: fast, not too big data
Analysis pipelines
[Diagram components]
Equipment control: e.g. motor, pump, valve, sensor
DAQ equipment: e.g. commercial camera
DAQ equipment: e.g. 2D detectors (AGIPD, LPD, DSSC)
Composite (middle layer) device: e.g. complex detector motion control
GUI server and other services, e.g. calibration manager, run configurator
Analysis node: e.g. image processing
Data storage node: e.g. storage of data from runs
GUI interface(s)
Data acquisition and management
Data acquisition, handling and processing
Collect and store data
Consolidating fast/slow, small/large data streams
► Data, tagged with train-id, is received by software devices at 10 Hz (train rate)
Data curation
Creation, storage, archiving
Reliable retrieval of data in future
Data policy
Data retention/custody
Access by users to scientific data
Metadata catalogue
User
Karabo: Near-Realtime Data Analysis using non-Karabo tools
See presentation: “Data Analysis Support in Karabo at European XFEL” by Hans Fangohr in parallel session on “Data Analytics”, Tuesday, Oct. 10, 14.00 h
Data Sources and DAQ system
Devices continuously push data to the PC layer
Image data: 2D detectors → big & fast → require multi-channel transfer from a single detector (non-Karabo protocol)
2D or pulse data: e.g. cameras, digitizers → fast → can aggregate several data sources (Karabo pipeline protocol)
Control data: e.g. sensors, motors → slow → can aggregate many data sources (Karabo protocol)
Data Aggregators
Storage
Data Sources
How data arrives at the PC layer
Control data (“slow”)
These are parameters of Karabo devices (e.g. motor positions)
► PC layer aggregator uses the standard Karabo mechanism to register for any update of the parameter value
► Available through the Karabo broker service
Instrument data
Data acquired and processed by electronic devices (“big and fast”)
► Available as a binary object described by XFEL Train Data Format (XTDF)
► Sent out to PC layer using application-level Train Transfer Protocol (TTP) over UDP network protocol
Data acquired and sent out to PC layer by software device controlling the hardware via dedicated DAQ network (“fast”)
► Available as Hash structure
► Sent out to PC layer using Karabo pipeline
Digitizer device
raw
processed
PC layer device
header
images
descriptor
detector
FEM Ctrl device
File naming convention
Raw data repository structure follows Metadata Catalogue model
File naming convention: type-rXXXX.h5 or type-rXXXX-infix-sXXXXX.h5
► r – run, s – sequence number, X – digit, h5 – HDF5 extension
► type – can be raw, cal, proc (to be defined)
► infix – based on the aggregator/PC layer node
► “-” – separator
General concept exists for splitting data across many files
► Certain data groups can be stored in separate files
► Train data from different files can be correlated without additional catalogue
File name is “computed” based on several configuration parameters (T0, Nc, N, infix) and is a function of T (train id)
Stub file: raw-r0001.h5
LPD detector data: raw-r0001-lpd-s00000.h5
raw-r0001-lpd-s00001.h5
raw-r0001-lpd-s00002.h5
…
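As an illustration of the naming convention, the sketch below assembles file names from their parts. The function and argument names are hypothetical; only the pattern itself (four-digit run number, five-digit sequence number, “-” as separator) is taken from the slide.

```python
from typing import Optional


def data_file_name(file_type: str, run: int,
                   infix: Optional[str] = None,
                   sequence: Optional[int] = None) -> str:
    """Build 'type-rXXXX.h5' (stub file) or 'type-rXXXX-infix-sXXXXX.h5'."""
    if infix is None or sequence is None:
        # Without infix and sequence number this is the per-run stub file.
        return f"{file_type}-r{run:04d}.h5"
    return f"{file_type}-r{run:04d}-{infix}-s{sequence:05d}.h5"


print(data_file_name("raw", 1))            # stub file of run 1
print(data_file_name("raw", 1, "lpd", 2))  # third LPD sequence file of run 1
```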
Facility  Instrument  Cycle   Proposal  Type  Run
XFEL      FXE         201701  p001234   raw   r0002
Run data files
Data files store datasets
Each file may contain data from one or multiple data sources or properties
XFEL component naming convention is used
Stub file
Unique per run
Points to the actual files that store datasets
Indicates formula parameters (e.g. HDF5 attributes) to calculate the actual set of files that store a given dataset
Datasets in different file groups might be indexed using different parameters
/detlab_det_lpd/2d/lpd1
/detlab_det_lpd/fem/1/
/detlab_det_alas1/vac/1
/detlab_det_alas1/motor/3
0001.h5
r0001-lpd-s0000.h5 r0001-lpd-s0001.h5 r0001-lpd-s0002.h5 r0001-lpd-s0003.h5
r0001-lpd-s0004.h5 r0001-lpd-s0005.h5
PCL0 PCL1 PCL2 PCL3
r0001-agg0-s0000.h5
AGG0
r0001-agg0-s0001.h5
r0001-agg1-s0000.h5
AGG1
r0001-agg1-s0001.h5
r0001-agg1-s0002.h5
r0001-agg1-s0003.h5
Stub file
Control data sources
Only a selection of control data is stored
One value stored for each train
Timestamp is preferably assigned at hardware level, otherwise as soon as the control system sees the data
If data does not change, the previous value is duplicated, but the timestamp is kept
Errors can be identified by a special timestamp value, i.e. zero
Datasets (one column each):
(1) /control/index/trainId
(2) /control/detlab_det_lpd/fem/1/femVoltage0/timestamp
(3) /control/detlab_det_lpd/fem/1/femVoltage0/value
(4) /control/detlab_det_lpd/fem/1/powerCardTemp0/timestamp
(5) /control/detlab_det_lpd/fem/1/powerCardTemp0/value

(1)     (2)               (3)      (4)               (5)
150234  1459681571710882  5.08301  1459681571710882  21.022
150235  1459681571710882  5.08301  1459681571710882  21.022
150236  1459681571897234  5.10103  1459681571710882  21.022
150237  1459681572012192  5.04543  1459681572012192  21.023
150238  1459681572100423  5.05232  1459681572100433  21.024
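The “one value per train” rule can be sketched as follows: when no update arrives for a train, the previous value and its timestamp are repeated. This is a minimal illustration, not the aggregator's actual code; the function name and the `updates` mapping are hypothetical, and treating “no value seen yet” as the timestamp-zero error case is an assumption.

```python
def fill_per_train(train_ids, updates):
    """Return one (train_id, timestamp, value) row per train.

    updates maps train_id -> (timestamp, value) for trains in which
    the property actually changed (hypothetical input format).
    """
    rows = []
    last = (0, None)  # assumption: timestamp 0 marks "no valid value"
    for train_id in train_ids:
        if train_id in updates:
            last = updates[train_id]
        # Unchanged data: previous value is duplicated, timestamp kept.
        rows.append((train_id, *last))
    return rows


# femVoltage0 values from the table above; train 150235 has no update.
updates = {
    150234: (1459681571710882, 5.08301),
    150236: (1459681571897234, 5.10103),
    150237: (1459681572012192, 5.04543),
    150238: (1459681572100423, 5.05232),
}
for row in fill_per_train(range(150234, 150239), updates):
    print(row)
```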
Instrument data characteristics
Train data: exists for each train
Variable train data: like train data but may not exist for all trains
Pulse data: received per train, pulses give extra dimension
Variable pulse data: like pulse data but some values can be omitted (e.g. vetoed)
Run data: data valid for entire run
Key to high-performance data writing: column-oriented HDF5 structure
Data comes as hierarchical key-value container (“Hash” in Karabo)
[Figure: a Hash with keys a, b, c, d, e, f, g, h, p, s, k, m, n, shown for three successive data updates]
Example Hash values: 1, true, 5, 3.14, [1,2,3,4,5], 2, “this is a string”, 12, 1.5, false
Record oriented: one write / data update / data source
Update 1: 1, true, 5, 3.14, [1,2,3,4,5], 2, “running”, 12, 1.5, false
Update 2: 4, true, 2, 3.14, [7,1,0,4,5], 2, “running”, 24, 1.75, true
Update 3: 3, false, 5, 4.24, [5,3,4,5,3], 3, “error”, 15, 1.85, false
Column oriented: one write / multiple data updates / data source
► Internal buffering of N updates
► Hash vectorization
Values after vectorization (one column per leaf):
1, 4, 3
true, true, false
5, 2, 5
3.14, 3.14, 4.24
[1,2,3,4,5,7,1,0,4,5,5,3,4,5,3]
2, 2, 3
“running”, “running”, “error”
12, 24, 15
1.5, 1.75, 1.85
false, true, false
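The buffering idea can be sketched in plain Python: nested key-value updates are flattened to paths, collected per path, and handed to a writer once N updates have accumulated. This is only an illustration of the principle; `ColumnBuffer`, `flatten` and the `write` callback are hypothetical names, nested dicts stand in for Karabo's Hash, and the callback stands in for appending to an HDF5 dataset.

```python
from collections import defaultdict


def flatten(hash_, prefix=""):
    """Flatten a nested dict into {"a/b/c": value} pairs."""
    flat = {}
    for key, value in hash_.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


class ColumnBuffer:
    """Collect N updates, then issue one write per column (path)."""

    def __init__(self, n_updates, write):
        self.n_updates = n_updates
        self.write = write              # callable(path, list_of_values)
        self.columns = defaultdict(list)
        self.count = 0

    def add(self, hash_):
        for path, value in flatten(hash_).items():
            self.columns[path].append(value)
        self.count += 1
        if self.count == self.n_updates:
            self.flush()

    def flush(self):
        for path, values in self.columns.items():
            self.write(path, values)    # one bulk write per column
        self.columns = defaultdict(list)
        self.count = 0


writes = []
buffer = ColumnBuffer(3, lambda path, values: writes.append((path, values)))
buffer.add({"a": {"b": 1, "c": True}, "s": "running"})
buffer.add({"a": {"b": 4, "c": True}, "s": "running"})
buffer.add({"a": {"b": 3, "c": False}, "s": "error"})
print(writes)
```

With three leaves and N = 3, the record-oriented approach would issue nine writes, while the buffer above issues three.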
FPGA-Accelerated Data Compression
Storage space is becoming an issue:
Current 2D detectors: 4 Mpixel, up to 500 frames per train => 80 GB/s (4 bytes per pixel)
Compression is CPU intensive
=> Study FPGA acceleration
IBM Power8 with FPGA-based accelerator
GenWQE/PCIe GZIP Accelerator
@p8.desy.de © Copyright IBM 2016
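The 80 GB/s figure follows from the detector parameters and the 10 Hz train rate quoted earlier in the talk. A short arithmetic check, with illustrative constant names:

```python
PIXELS_PER_FRAME = 4_000_000   # 4 Mpixel detector
FRAMES_PER_TRAIN = 500         # up to 500 frames per train
BYTES_PER_PIXEL = 4
TRAINS_PER_SECOND = 10         # train rate of the facility

bytes_per_second = (PIXELS_PER_FRAME * FRAMES_PER_TRAIN
                    * BYTES_PER_PIXEL * TRAINS_PER_SECOND)

print(f"{bytes_per_second / 1e9:.0f} GB/s")
```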
Test data files
Raw data files from LCLS detectors:
eXtended Tagged Container (.xtc) format
“Data files” / diffraction pattern (.cxi) format
HDF5, NeXus-inspired and ~compatible
Data available from http://cxidb.org
Sequential crystallography: idb22 ► beamline CXI @LCLS
Not-so-weakly scattering: idb30 ► beamline AMO @LCLS
Weakly scattering – TODO
CXIDB: Coherent X-ray Imaging Data Bank
LCLS: SLAC Linac Coherent Light Source
Compression with FPGA
Same executable runs seamlessly with software or hardware-accelerated compression (no recompile is needed)
Enable FPGA compression by setting environment variables:
ZLIB_ACCELERATOR=GENWQE
LD_PRELOAD=/usr/lib64/genwqe/libz.so.1 /path_to/prog
Comparison criteria between SW and HW compression:
Space saving = 1 – 1/comp_ratio
► comp_ratio = uncompressed_size / compressed_size
Data compression rate (speed)
► I/O from/to 3 alternatives: disk / memory / null
► Disk: [email protected]
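The comparison criteria can be computed for any buffer with Python's standard `zlib` module. The sketch below measures software compression only and keeps the data in memory; the function name and the sample data are hypothetical, and it does not reproduce the benchmark setup used for the study.

```python
import time
import zlib


def compression_metrics(data: bytes, level: int = 6):
    """Return (comp_ratio, space_saving, rate in MB/s) for one buffer."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero

    comp_ratio = len(data) / len(compressed)   # uncompressed / compressed
    space_saving = 1 - 1 / comp_ratio
    rate_mb_per_s = len(data) / elapsed / 1e6
    return comp_ratio, space_saving, rate_mb_per_s


# Highly repetitive sample data, so the ratio is far above real detector data.
sample = b"European XFEL " * 100_000
ratio, saving, rate = compression_metrics(sample)
print(f"comp_ratio={ratio:.1f}, space saving={saving:.1%}, rate={rate:.0f} MB/s")
```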
Compression rates with FPGA
Sequential crystallography, idb22, ~175 GB of 16 TB
Space saving [41-51%], depends on run #
Data rate (single thread):
► Disk: 0.95 GB/s, RAM: 1.05 GB/s, RAM/null O: 1.12 GB/s
Not-so-weakly scattering, idb30, ~210 GB of 4 TB
Space saving [32-42%], depends on run #
Data rate (single thread):
► Disk: 0.85 GB/s, RAM: 0.89 GB/s, RAM/null O: 1.00 GB/s
Comparison against software
Speed / data rate
FPGA: ~1 GB/s, ~100x faster than software:
► idb22: [8.6, 8.8] MB/s
► idb30: [8.7, 9.9] MB/s
Other software implementations (gzip, custom ‘gzip’) can be 6-7% faster; on different machines (exflpclXXnY), up to [19, 21]% faster.
Storage saved:
FPGA: [32, 51]% of raw data storage saved
Software could save more, [41, 57]%:
► idb22: [48, 57]% => ~[12, 17]% (relative) higher space saving
► idb30: [41, 50]% => ~[17, 27]% (relative) higher space saving
Conclusions and future work
European XFEL tightly integrates data acquisition into Karabo control system
Different data sources: “fast & big”, “fast”, and “slow/control” data
Column-oriented HDF5 file structure
Database-like tables correspond to datasets
Data buffering/bulk writing is faster than single writes
FPGA-accelerated compression on IBM Power8
API is well integrated with Karabo and HDF5 libraries
Storage saving: raw data: ~32-51%
FPGA compression rates “close to” 1 GB/s
FPGA is faster than software by a factor of [93, 128]
Next steps
► Use real data from last user operation
► Disk I/O setup using GPFS
► Test new generation GenWQE - CAPI
Acknowledgements
• Federico Montesino Pouzols (ex-member of ITDM)
• Control and Analysis Software group (CAS) for Karabo
• IBM for providing test hardware and support
DAQ and DM components
PC layer: data acquisition layer
Set of data aggregator devices running on a cluster of high-performance computers
Collect, monitor and store experiment data
Disseminate data to online analysis pipeline (fast-feedback)
Run management service
Coordinates PC layer activities related to data
Metadata catalogue (MDC)
Organize and keep track of experiment data in a homogeneous and consistent way
Data retrieval service
Common data interface with analysis algorithms (online, offline)
Data series types
Variable pulse data
Train control data
Train data
Pulse data
Variable train data
Run data
Train Event data/time series
Digitizer
2D pixel detector
e.g. Pulse digitizer, BPM, …
e.g. 2D pixel detector images
e.g. 2D pixel detector train info
e.g. slow camera
e.g. sensors, actuators, motors, …
e.g. 2D detector configuration
Deterministic push
Event driven
Data integrated over multiple trains
Correlation of instrument and control data
Sequence number can be calculated assuming statically configured distribution of train data over multiple channels
filename               File content (train header)
                       T    x    y
r0001.s0000.i.det1.h5  0    ...  ...
                       3    ...  ...
r0001.s0001.i.det1.h5  1    ...  ...
                       4    ...  ...
r0001.s0002.i.det1.h5  2    ...  ...
                       5    ...  ...
r0001.s0003.i.det1.h5  6    ...  ...
                       9    ...  ...
r0001.s0004.i.det1.h5  7    ...  ...
                       10   ...  ...
r0001.s0005.i.det1.h5  8    ...  ...
                       11   ...  ...
r0001.s0006.i.det1.h5  12   ...  ...
For the consecutive train ids:
$c = (T - T_0) \bmod N_c$
$m = \lfloor (T - T_0) / (N_c \cdot N) \rfloor$
$i = \lfloor (T - T_0) / N_c \rfloor \bmod N$
$s = m \cdot N_c + c$
Can be generalized for other train patterns
T Train Id
T0 First train Id within the run
Nc Number of data acquisition channels.
N Number of train blocks per file
c Channel id. Channels are ordered.
i Record id for train based data within a table {i=0,…,N-1}
m File sequence number within the channel
s File sequence number within the run
T0 = 0, Nc=3, N=2, infix=lpd
These parameters are stored as group attributes in the stub file
Example:
T0 = 0, Nc=3, N=2
Where is train T = 10?
c = 1, m = 1, i = 1, s = 4
Train T=10 is stored in file:
r0001-lpd-s0004.h5
at record i=1
r0001-lpd-s0000.h5
r0001-lpd-s0001.h5
r0001-lpd-s0002.h5
r0001-lpd-s0003.h5
r0001-lpd-s0004.h5
r0001-lpd-s0005.h5
r0001-lpd-s0006.h5
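The lookup of a train can be sketched as below. The mapping is inferred from the file table and the worked example on this slide (T0 = 0, Nc = 3, N = 2) and assumes consecutive train ids distributed round-robin over the channels; it is an illustration, not the facility's implementation, and the function name is hypothetical.

```python
def locate_train(T, T0, Nc, N):
    """Return (c, m, i, s) for train id T.

    c: channel id, m: file sequence number within the channel,
    i: record id within the file, s: file sequence number within the run.
    """
    offset = T - T0
    c = offset % Nc          # trains are dealt out to channels in turn
    block = offset // Nc     # position of the train within its channel
    m = block // N           # each file holds N train blocks
    i = block % N
    s = m * Nc + c
    return c, m, i, s


c, m, i, s = locate_train(T=10, T0=0, Nc=3, N=2)
print(f"c = {c}, m = {m}, i = {i}, s = {s}")
print(f"r0001-lpd-s{s:04d}.h5, record i={i}")
```

For T = 10 this reproduces the slide's answer: file r0001-lpd-s0004.h5, record i = 1.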
Instrument data
train_data (header, trailer, det. spec.)
Record Idx  Train Id  linkId  pulseCount  detectorDataBlock  TrainDataBlock  …
0           234       10      3           a@2w4347...\       Q;slwwd%
1           250       10      4           A%%ad8*8__         @#$8uuq

var_pulse_data (Images)
Record Idx  Idx pulse data  Train Id  Pulse Id  Image
0           1               234       150
1           2               234       843
2           4               250       45
3           6               250       91

pulse_data (descriptors)
Record Idx  Idx variable pulse data  Idx train data  Train Id  Pulse Id  cellId  status
0           -1                       0               234       0         0       0
1           0                        0               234       150       2       1
2           1                        0               234       843       34      0
3           -1                       1               250       14        14      0
4           2                        1               250       45        45      0
5           -1                       1               250       55        55      1
6           3                        1               250       91        32      0

► Optimize for size when necessary
► Create indexes to navigate between different groups of data
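Navigation between the data groups via index columns can be sketched with the example values from this slide. The record layouts and field names below are one reading of the flattened tables, and interpreting an index of -1 as “no image stored for this pulse” is an assumption.

```python
from typing import NamedTuple


class PulseRecord(NamedTuple):       # one row of pulse_data (descriptors)
    var_pulse_idx: int               # row in var_pulse_data, -1 if none (assumed)
    train_idx: int                   # row in train_data
    train_id: int
    pulse_id: int
    cell_id: int
    status: int


class VarPulseRecord(NamedTuple):    # one row of var_pulse_data (images)
    pulse_idx: int                   # row in pulse_data
    train_id: int
    pulse_id: int


pulse_data = [
    PulseRecord(-1, 0, 234, 0, 0, 0),
    PulseRecord(0, 0, 234, 150, 2, 1),
    PulseRecord(1, 0, 234, 843, 34, 0),
    PulseRecord(-1, 1, 250, 14, 14, 0),
    PulseRecord(2, 1, 250, 45, 45, 0),
    PulseRecord(-1, 1, 250, 55, 55, 1),
    PulseRecord(3, 1, 250, 91, 32, 0),
]
var_pulse_data = [
    VarPulseRecord(1, 234, 150),
    VarPulseRecord(2, 234, 843),
    VarPulseRecord(4, 250, 45),
    VarPulseRecord(6, 250, 91),
]


def stored_pulse_ids(train_id):
    """Pulse ids of a train for which an image record exists."""
    return [p.pulse_id for p in pulse_data
            if p.train_id == train_id and p.var_pulse_idx >= 0]


def indexes_consistent():
    """Check that the two index columns point at each other."""
    return all(pulse_data[v.pulse_idx].var_pulse_idx == row
               for row, v in enumerate(var_pulse_data))


print(stored_pulse_ids(250))
print(indexes_consistent())
```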