CHEP 2000: 7-11 February, 2000. I. Sfiligoi, Data Handling in KLOE


CHEP 2000

Data Handling in KLOE
I. Sfiligoi

INFN LNF, Frascati, Italy


The KLOE experiment

• at the DAΦNE φ-factory
• main goal:
  • CP violation study
• other interesting fields:
  • kaon form factors
  • kaon rare decays
  • radiative decays

[Decay diagram; Greek letters and arrows were lost in extraction. Surviving labels: "KS +- KL +- (CP not)" and "KS +- KL 306"]


KLOE Requirements

• Data acquisition (at full DAΦNE luminosity)
  • 10^11 events per year acquired
  • 50 MB/s sustained throughput
• Computing power
  • ALL the events need to be reconstructed
• Storage requirements
  • one petabyte of raw and reconstructed events
  • hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)


KLOE computing environment

• Based on a set of medium-sized servers
• Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet)
• Heterogeneous environment, several platforms:
  • IBM AIX on PowerPC
  • Sun Solaris on Sparc
  • Compaq Tru64 Unix on Alpha
  • HP-UX on PA-RISC


KLOE storage pool

• Different policies for different types of data:
  • raw and reconstructed events on tape libraries, with big disk pools for data caching
  • related data managed by a disk based database system
  • analysis output on disk pools


Disk pools

• Four categories of disk pools are present:
  • each data acquisition node in the farm has its own small disk pool
  • computing nodes write their output to centralized, NFS mounted disk pools
  • separate disk pools are used as a cache for the events on tape
  • analysis output is written to its own, central AFS mounted disk pool


Tape library

• Several automated tape libraries supported (at the moment the 5500 slot tape library is partitioned between two tape servers)
• Accessed using commercial software
  • IBM ADSM with the current tape library


KLOE software

• Three distinct categories
  • DAQ (or online): ANSI C
  • reconstruction and analysis (or offline): FORTRAN inside A_C
  • Monte Carlo: FORTRAN

The interface to the Data Handling System must be compatible with all of them


KLOE Data Handling System

• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)


KLOE Data Handling System

• A mix of commercial and custom software
  • the dependency on commercial software is minimized by the layers of custom software
  • commercial software carries out all the vital functions
  • custom software mostly extends and coordinates the functionality of the commercial software


KLOE Data Handling System

• Based on a set of multi-threaded non-privileged daemons and related libraries
• Distributed across several nodes
• Communication by means of TCP/IP sockets
  • on high ports
  • bypasses TCP/IP filtering
  • flexible, programming language and operating system independent
  • no configuration needed on the client side
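
As an illustration of the daemon model on this slide, here is a minimal sketch of a multi-threaded TCP service that runs without privileges on a high port. It is not the KLOE code: the one-line text protocol, the `EchoHandler` class and the helper names `start_daemon` and `ask` are assumptions made for the example.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Handle one client connection in its own thread."""

    def handle(self):
        # Read one newline-terminated request and answer with one line.
        request = self.rfile.readline().strip()
        self.wfile.write(b"ack " + request + b"\n")


def start_daemon(host="127.0.0.1", port=0):
    """Start a threaded TCP server; port 0 lets the OS pick a free unprivileged port."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, message):
    """Client side: only the daemon's host and port are needed."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message.encode() + b"\n")
        return conn.makefile("rb").readline().decode().strip()
```

Because the client only opens a socket and exchanges text, the same service could be reached from C or FORTRAN wrappers, which is the language independence the slide refers to.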


KLOE Data Handling System

• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)


Database System

• Two distinct database systems are used
  • offline database system
    • based on HepDB
    • data stored as ZEBRA banks
  • online database system
    • based on a Relational DBMS
    • data are structured in fields
    • extended for distributed environments


Online Database System

• data stored in a Relational DBMS
  • IBM DB2 Universal Database at the moment
• communication between the clients (user applications) and the RDBMS through a database daemon

[Diagram: applications (app) connect to the database daemon (DD), which connects to the RDBMS]


Database Daemon

• The database daemon is the only link between the applications and the RDBMS
  • if the RDBMS is changed in the future, only the database daemon will need to be changed
• Different kinds of commands are managed by the daemon
  • general SQL commands
  • KLOE specific commands


Database Daemon

• Different kinds of commands are managed by the daemon
  • general SQL commands
    • passed directly to the RDBMS
    • example: select run_nr from run_logger where status = 'OK'
  • KLOE specific commands
    • managed by the daemon itself
    • the RDBMS is used to retrieve and store data needed by the daemon itself
    • example: "log that I am starting processing file relative to run 3"
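
The two kinds of commands can be sketched as a small dispatcher. This is only an illustration under stated assumptions: `sqlite3` stands in for the real RDBMS, and the `log_start` command name, the `processing_log` table and the `DatabaseDaemon` class are hypothetical; only the `run_logger` query comes from the slide.

```python
import sqlite3


class DatabaseDaemon:
    """Toy stand-in for the database daemon; sqlite3 replaces the real RDBMS."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("create table run_logger (run_nr integer, status text)")
        self.db.execute("create table processing_log (run_nr integer, state text)")
        # Hypothetical registry of KLOE specific commands.
        self.specific = {"log_start": self._log_start}

    def _log_start(self, run_nr):
        # The daemon itself decides what to store; here with one extra check.
        known = self.db.execute(
            "select count(*) from run_logger where run_nr = ?", (run_nr,)
        ).fetchone()[0]
        if not known:
            raise ValueError(f"unknown run {run_nr}")
        self.db.execute(
            "insert into processing_log values (?, 'started')", (run_nr,)
        )
        return "ok"

    def handle(self, command, *args):
        """Run a KLOE specific command, or pass anything else to the RDBMS as SQL."""
        if command in self.specific:
            return self.specific[command](*args)
        return self.db.execute(command).fetchall()
```

A client would call `handle("select run_nr from run_logger where status = 'OK'")` for plain SQL and `handle("log_start", 3)` for the specific command; only this class would change if the RDBMS were replaced.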


Database Daemon

• The use of KLOE specific commands has several advantages
  • additional checks and restrictions are possible
  • data consistency management is centralized
  • fast central caches can be implemented
    • for example, the DAQ configuration cache reduces the typical access time from 4 s to 0.1 s
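
A central cache of this kind can be sketched as a read-through cache placed in front of a slow lookup. The `ConfigCache` class and its `fetch` callback are assumptions for illustration; the slide does not describe how the DAQ configuration cache is implemented.

```python
class ConfigCache:
    """Read-through cache in front of a slow lookup function."""

    def __init__(self, fetch):
        self._fetch = fetch      # slow call, e.g. a query to the RDBMS
        self._store = {}
        self.misses = 0

    def get(self, key):
        # Only the first request for a key pays the cost of the slow lookup.
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fetch(key)
        return self._store[key]

    def invalidate(self, key):
        """Drop one entry, for example after the configuration is updated."""
        self._store.pop(key, None)
```

Since all accesses go through one daemon, invalidation can be done in one place when the stored data change.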


A light version

• The RDBMS is used to ensure flexibility, reliability and performance
• Demanding in terms of computing resources and management effort
  • stand-alone environments often cannot afford it
• An RDBMS-independent version of the database daemon is under development


A light version

• An RDBMS-independent version of the database daemon is under development
  • limited to KLOE specific and the most frequently used SQL commands
  • based on use of flat files containing a small portion of the data
  • not suitable for a production environment, but enough for home use


KLOE Data Handling System

• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)


KLOE Archiving System

• Expected event data managed by KLOE
  • 1 PB
• Tape libraries needed
  • data storage and retrieval non trivial
  • random access to data very inefficient
• Disk-based intermediate buffers used


KLOE Archiving System

• Two types of intermediate buffers
  • DAQ, offline and Monte Carlo output are structured as YBOS files and written on their disk output areas
  • event data needed by offline as input are read from the archiving system disk-cache


KLOE Archiving System

• Data needs to be migrated
  • from output areas to the tape library
    • as soon as possible (taking into account also efficiency concerns)
  • from the tape library to the disk cache
    • when an application needs it (or even better, a bit earlier)
• Migration is totally automated and transparent to the applications


KLOE Archiving System

• The Archiving System is made of four components
  • storage managers
  • disk space managers
    • output areas
    • cache areas
  • archival director
  • cache manager
• Communication by means of TCP/IP sockets
• Coordinated by the online database

[Component labels on the slide: archADSM, spacekeeper, filekeeper, archiver, retrieve]


Storage Managers

• One for each logical tape library
• Allows
  • queries about tape library content
  • file archival
  • file retrieval
• Transaction oriented (if the underlying tape library software supports it)


Storage Managers

• The only link between the tape library and the rest of the system
  • interface independent of the underlying archiving software
  • IBM ADSM is used with the current tape library
  • if other products are used in the future, only a specific storage manager will need to be developed
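
The idea of a product-independent interface can be sketched with an abstract base class and one interchangeable backend. All names here (`StorageManager`, `InMemoryStorageManager`, the method names) are hypothetical; the in-memory backend merely stands in for a wrapper around the tape library software.

```python
from abc import ABC, abstractmethod


class StorageManager(ABC):
    """Product-independent interface; all names here are hypothetical."""

    @abstractmethod
    def list_files(self): ...

    @abstractmethod
    def archive(self, name, data): ...

    @abstractmethod
    def retrieve(self, name): ...


class InMemoryStorageManager(StorageManager):
    """Stand-in backend; a real one would wrap the tape library software."""

    def __init__(self):
        self._tape = {}

    def list_files(self):
        return sorted(self._tape)

    def archive(self, name, data):
        # Refuse to overwrite an archived file silently.
        if name in self._tape:
            raise FileExistsError(name)
        self._tape[name] = bytes(data)

    def retrieve(self, name):
        return self._tape[name]
```

The rest of the system would depend only on the three operations of the base class, so supporting another archiving product means writing one more subclass.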


Disk Space Managers

• One for each disk pool
• Create and delete files
  • unused files get deleted to make space for new ones
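
A minimal sketch of the space-reclaiming behaviour follows. It only does bookkeeping in memory, and the oldest-first eviction order is an assumption: the slide says that unused files are deleted, not which ones go first. The `DiskSpaceManager` class and its attributes are hypothetical.

```python
class DiskSpaceManager:
    """Tracks files in one pool and evicts unused ones when space runs out."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}        # name -> size, oldest entry first
        self.in_use = set()    # names currently opened by some application

    def used(self):
        return sum(self.files.values())

    def create(self, name, size):
        if name in self.files:
            raise FileExistsError(name)
        # Delete unused files, oldest first, until the new file fits.
        for victim in list(self.files):
            if self.used() + size <= self.capacity:
                break
            if victim not in self.in_use:
                del self.files[victim]
        if self.used() + size > self.capacity:
            raise OSError("not enough free space in the pool")
        self.files[name] = size
```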


Archival Director

• Fully automated
• Works in polling mode
  • from time to time looks for files ready to be archived
  • starts archiving only when enough data is available
• Files are ordered and grouped to minimize the expected retrieve time
• Several groups of files can be archived in parallel
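
One polling step of such a director might look like the sketch below. The tuple layout, the `min_bytes` threshold and the choice to group files by stream and order them by run number are assumptions made for illustration; the slide does not state the actual ordering criteria.

```python
from itertools import groupby


def plan_archival(ready_files, min_bytes):
    """One polling step: return groups of files to archive, or [] to wait.

    ready_files: list of (run_nr, stream, size) tuples (hypothetical layout).
    """
    if sum(size for _, _, size in ready_files) < min_bytes:
        return []  # not enough data yet; try again at the next poll
    # Keep files likely to be read together next to each other on tape.
    ordered = sorted(ready_files, key=lambda f: (f[1], f[0]))
    return [list(group) for _, group in groupby(ordered, key=lambda f: f[1])]
```

Each returned group is independent of the others, so the groups could be handed to separate archiving threads.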


Cache Manager

• User driven
  • when a file is needed, the application asks the cache manager where it is located
  • a retrieve is performed by the manager if needed
• Several requests can be issued at the same time
  • the manager reorders them internally to minimize the tape mounts
• Communication by means of TCP/IP sockets
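
The reordering idea can be sketched as grouping pending requests by the tape that holds each file. The `tape_of` catalogue and both function names are hypothetical, and the real manager may use a different policy.

```python
from collections import defaultdict


def order_retrieves(requests, tape_of):
    """Reorder pending file requests so each tape is mounted only once.

    requests: file names in arrival order.
    tape_of:  mapping file name -> tape label (hypothetical catalogue).
    """
    by_tape = defaultdict(list)
    for name in requests:
        by_tape[tape_of[name]].append(name)
    # Tapes are served in order of their first request; dicts keep insertion order.
    return [name for names in by_tape.values() for name in names]


def count_mounts(sequence, tape_of):
    """Number of tape mounts needed to serve the files in the given order."""
    mounts, current = 0, None
    for name in sequence:
        if tape_of[name] != current:
            mounts, current = mounts + 1, tape_of[name]
    return mounts
```

With four requests alternating between two tapes, serving them in arrival order needs four mounts, while the reordered sequence needs two.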


KLOE Archival System

[Architecture diagram. Components shown: archiver, retrieve, archADSM (several instances), spacekeeper and filekeeper (several instances), DB, Tape Library, Disk Pools. Link types in the legend: NFS mount, local file system, TCP/IP socket.]


KLOE Data Handling System

• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)


Spy System

• KLOE data acquisition software allows the event data to be read out before they get written to disk
• The mechanism that reads those data is called Spy
• Based on use of shared memory buffers
  • DAQ processes are piped using this mechanism
  • the spy system reads data from the buffers without interfering with the DAQ


KLOE Data Handling System

• Composed of four elements:
  • Database System
  • Archiving System
  • Spy System
  • KLOE Integrated Dataflow (KID)


KLOE Integrated Dataflow (KID)

• Integration library
  • database accesses and retrieve operations hidden
• Offers a single point of access to all the services
  • URI-based selection
    • spy:/buffer
      open a spy channel and pass the events to the application
    • datarec:(run_nr=5000) and (stream='ksl')
      read the list from DB, ask the cache manager for the files, pass the events from the files to the application
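
URI-based selection can be illustrated by splitting the URI at the first colon and dispatching on the protocol name. The `open_source` function and the `HANDLERS` table are hypothetical and return plain tuples; real handlers would talk to the spy buffers, the database and the cache manager. Only the two example URIs come from the slide.

```python
def open_source(uri, handlers):
    """Split a KID-style URI into protocol and selection, then dispatch."""
    protocol, sep, selection = uri.partition(":")
    if not sep or protocol not in handlers:
        raise ValueError(f"unsupported URI: {uri!r}")
    return handlers[protocol](selection)


# Hypothetical handlers; real ones would talk to the spy buffers, the
# database and the cache manager.
HANDLERS = {
    "spy": lambda sel: ("spy channel", sel),
    "datarec": lambda sel: ("file list query", sel),
}
```

The application passes one string and never needs to know whether the events come from a live buffer or from archived files.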


Management effort

• The entire system is managed by only a few people:
  • 3 people (2 full time) are engaged in KLOE computing system management (including storage)
  • 1 person is engaged in the development and management of the online database and the archiving system
  • 2 people spend a few percent of their time on the maintenance of the offline database

