AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov


Page 1: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.


Page 2: Outline

- AMS02 data volume
- AMS/CASPUR Technical Meeting, March 2002
- Projected characteristics for disks, processors and tapes
- AMS data storage issues

Page 3: AMS Data Volume (Tbytes)

Source: A.Klimentov, AMS/CASPUR Technical Meeting, Bologna, Mar 2002

Data/Year    1998  2001  2002  2003  2004  2005  2006  2007  2008  2009  Total
Raw          0.20  ---   ---   ---   ---   0.5   15    15    15    0.5   46.2
ESD          0.30  ---   ---   ---   ---   1.5   44    44    44    1.5   135.3
Tags         0.05  ---   ---   ---   ---   0.1   0.6   0.6   0.6   0.1   2.0
Data & ESD   0.55  ---   ---   ---   ---   2.1   59.6  59.6  59.6  2.1   183.5
MC           0.11  1.7   8.0   8.0   8.0   8.0   44    44    44    44    210.4
Grand Total  0.66  1.7   8.0   8.0   8.0   10.1  104   104   104   46.1  ~400

Slide annotations: STS91; AMS02 on ISS
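The yearly sums in the data volume table can be cross-checked with a few lines of Python. This is only a sketch: the numbers are copied from the table, years with no Raw/ESD/Tags entry ("---") are left out, and the dictionary and function names are invented for illustration.

```python
# Yearly AMS data volumes in Tbytes, copied from the data volume table.
# Only years that have Raw/ESD/Tags entries are listed.
raw = {1998: 0.20, 2005: 0.5, 2006: 15, 2007: 15, 2008: 15, 2009: 0.5}
esd = {1998: 0.30, 2005: 1.5, 2006: 44, 2007: 44, 2008: 44, 2009: 1.5}
tags = {1998: 0.05, 2005: 0.1, 2006: 0.6, 2007: 0.6, 2008: 0.6, 2009: 0.1}
mc = {1998: 0.11, 2005: 8.0, 2006: 44, 2007: 44, 2008: 44, 2009: 44}


def data_and_esd(year):
    """Raw + ESD + Tags for one year (the 'Data & ESD' row)."""
    return raw[year] + esd[year] + tags[year]


def grand_total(year):
    """Data & ESD plus MC for one year (the 'Grand Total' row)."""
    return data_and_esd(year) + mc[year]


for year in sorted(raw):
    print(year, round(data_and_esd(year), 2), round(grand_total(year), 2))
```

For 2006 to 2008 the sum comes out at 103.6 Tbytes per year, which the table rounds to 104.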

Page 4: AMS/CASPUR technical meeting, Bologna, March 2002

Participants: V.Bindi, M.Boschini, D.Casadei, A.Contin, V.Choutko, A.Klimentov, A.Maslennikov, F.Palmonari, PG.Rancoita, PP.Ricci, C.Sbara, P.Zuccon

Topic: Archiving and staging strategy, AMS02 data volume

To propose a coherent scheme for AMS data storage in the SOC and remote center(s).

Possible solutions:
- disk servers
- staging (tapes + disks)
- outsourcing (CASTOR)

Page 5: Staging

- A staging system is a generic name for a tape-to-disk migration tool. Files are migrated from tape to disk by the user before they are accessed. Migration of disk files to tape may be automatic or manual.
- Older staging implementations (old CERN staging) required users to keep track of their own tape files.
- The CASPUR flavour (in production since 1997) does the tape/file bookkeeping on behalf of the user. It uses NFS and features fairly easy installation and management.
- CASTOR (CERN, project started in 1999) gives the user the option to migrate files both manually and via specially modified I/O calls from within a program. It uses a fast data transfer protocol (RFIO). It has been installed and maintained by CERN IT since 2000 and is currently used by COMPASS to store raw data and ESD; ALICE and CMS have also made I/O tests. It is currently the primary option for LHC experiments.
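The bookkeeping idea behind a staging system can be illustrated with a toy model. This is a minimal sketch, not the CASPUR or CASTOR interface: the class `ToyStager`, its methods and the tape label are all hypothetical, and no real tape or disk I/O takes place.

```python
class ToyStager:
    """Toy model of a staging system: a tape catalogue plus a disk pool."""

    def __init__(self):
        self.catalogue = {}   # file name -> tape label (the bookkeeping)
        self.on_disk = set()  # files currently present in the disk pool

    def write(self, name):
        """A new file lands on disk first."""
        self.on_disk.add(name)

    def migrate(self, name, tape):
        """Disk -> tape: record which tape holds the file, free the disk copy."""
        if name not in self.on_disk:
            raise KeyError(f"{name} is not on disk")
        self.catalogue[name] = tape
        self.on_disk.discard(name)

    def stage_in(self, name):
        """Tape -> disk: look the tape up in the catalogue and recall the file."""
        tape = self.catalogue[name]  # the user does not have to remember this
        self.on_disk.add(name)
        return tape


stager = ToyStager()
stager.write("run001.raw")
stager.migrate("run001.raw", "TAPE0007")       # hypothetical tape label
recalled_from = stager.stage_in("run001.raw")  # file is back on disk
```

The point of the catalogue is that the user asks for a file by name only; which tape it sits on is the system's concern, which is the difference between the old CERN staging and the later flavours described above.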

Page 6: Projected characteristics for disks, processors and tapes

Intel/AMD PC
- 1998: Dual-CPU Intel PII rated at 450 MHz, 512 MB RAM; 7.5 kUS$
- 2002: Dual-CPU Intel rated at 2.2 GHz, 1 GB RAM, SCSI and IDE RAID controllers; 7 kUS$
- 2006: Dual-CPU rated at 8 GHz, 2 GB RAM, IDE RAID controller; 5 kUS$

Magnetic disk
- 1998: 18 GByte SCSI, 80 US$/GByte
- 2002: SG 180 GByte SCSI, 10 US$/GByte; WD 120 GByte IDE, 2 US$/GByte; IDE-FC, 5.5 US$/GByte
- 2006: SCSI 700 GByte, 2 US$/GByte; IDE 800 GByte, 0.6 US$/GByte; IDE-FC, 1.3 US$/GByte

Magnetic tape
- 1998: DLT, 40 GB compressed, 3 US$/GByte
- 2002: SDLT and LTO, 200 GB compressed, 0.8 US$/GByte
- 2006: ?, 600 GB compressed, 0.3 US$/GByte

Page 7: AMS staging and archiving system: requirements and considerations

- Storage strategy might be different for raw, ESD and MC data.
- All data must be archived. At least two copies of raw and ESD are required.
- I believe that data must be under the control of the AMS collaboration.
- The archiving system should be scalable and independent of the HW technology.
- Data volume: 2002 - 8 TB; 2008 - 500 TB
- Throughput: 2 TB/day (23 MB/sec)
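The throughput figure follows directly from the daily volume. A short sketch of the conversion, assuming decimal units (1 TB = 10^6 MB) and a rate averaged over a full 24 hours; the function name is illustrative only:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def daily_volume_to_rate_mb_s(tb_per_day):
    """Average rate in MB/s for a daily volume in TB (1 TB = 10**6 MB)."""
    return tb_per_day * 1e6 / SECONDS_PER_DAY


rate = daily_volume_to_rate_mb_s(2.0)
print(f"{rate:.1f} MB/s")
```

This is a sustained average; any system that moves data only part of the day would need a proportionally higher peak rate.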

Page 8: Cost estimation (I): Disk servers

2002-2005
- 8 TB/year
- RAID5, 2.1 TB/server
- 3-4 servers/year
- 23.3 kUS$/server (2002)
- 50% disk price drop/year, migration to IDE disk systems
- 197 kUS$ total, 6.2 US$/GByte

2006 and beyond
- 100 TB/year
- RAID5, 5.6 TB/server
- 10-15 servers/year
- 8.8 kUS$/server (2006)
- 50% disk price drop/year
- 411 kUS$ total, 1.4 US$/GByte

Page 9: Cost estimation (II): Staging

2002-2005
- 8 TB/year
- LTO library: 58 kUS$
- 2 servers: 10 kUS$
- 20-40 cartridges/year
- FC switch: 15 kUS$
- 0.8 TB disks/year
- 30% IDE-FC disk price drop/year
- 111 kUS$ total, 3.5 US$/GByte

2006 and beyond
- 100 TB/year
- LTO library: biennial
- 2 servers/year
- 150-250 cartridges/year
- FC switch: 15 kUS$
- 10 TB disks/year
- 30% IDE-FC disk price drop/year
- 300 kUS$ total, 1 US$/GByte
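The cartridge counts can be compared with the tape capacities given in the projected characteristics (200 GB compressed for 2002, 600 GB compressed for 2006). The sketch below is a consistency check only: the slides do not say how the cartridge ranges were derived, and the single-copy default and the function name are assumptions.

```python
import math


def cartridges_per_year(tb_per_year, cartridge_gb, copies=1):
    """Cartridges needed for one year of data (1 TB = 1000 GB)."""
    return math.ceil(tb_per_year * 1000 * copies / cartridge_gb)


early = cartridges_per_year(8, 200)    # 2002-2005: 8 TB/year on 200 GB cartridges
late = cartridges_per_year(100, 600)   # 2006 and beyond: 100 TB/year on 600 GB cartridges
print(early, late)
```

Under these assumptions both results fall inside the quoted ranges of 20-40 and 150-250 cartridges per year.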

Page 10: Cost estimation (III): CASTOR

2002-2005
- 8 TB/year
- 1.8 US$/GByte
- 57.6 kUS$ total

2006 and beyond
- 100 TB/year
- 0.8 US$/GByte
- 240 kUS$ total
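The cost per GByte quoted on the three cost estimation slides can be reproduced from the totals. The sketch assumes the totals cover 4 years at 8 TB/year (32 TB) and 3 years at 100 TB/year (300 TB); these durations are not stated explicitly but are implied by the CASTOR figures (1.8 US$/GByte x 32 TB = 57.6 kUS$, 0.8 US$/GByte x 300 TB = 240 kUS$). All names in the code are illustrative.

```python
# Assumed total volumes in TB: 4 years x 8 TB and 3 years x 100 TB.
VOLUME_TB = {"2002-2005": 4 * 8, "2006 and beyond": 3 * 100}

# Total cost in kUS$ per solution and period, from the cost estimation slides.
TOTAL_KUSD = {
    "disk servers": {"2002-2005": 197, "2006 and beyond": 411},
    "staging": {"2002-2005": 111, "2006 and beyond": 300},
    "CASTOR": {"2002-2005": 57.6, "2006 and beyond": 240},
}


def usd_per_gb(total_kusd, volume_tb):
    """kUS$ per TB is numerically equal to US$ per GB (both scale by 1000)."""
    return total_kusd / volume_tb


for solution, periods in TOTAL_KUSD.items():
    for period, total in periods.items():
        print(solution, period, round(usd_per_gb(total, VOLUME_TB[period]), 1))
```

With these volumes the rounded results match the per-GByte figures on the slides for all three solutions.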

Page 11: Storage Solution (Summary)

System complexity
- Disk servers: High (50 servers, 0.5 PB online)
- Staging: Medium (8 servers, 0.05 PB online)
- CASTOR: Low

Cost, kUS$ (2002/2006)
- Disk servers: 197/411
- Staging: 111/300
- CASTOR: 57.6/240

Data access
- Disk servers: real-time
- Staging: 5-10 mins delay
- CASTOR: 10-20 mins delay

Manpower
- Disk servers: 0.5 FTE
- Staging: 0.5 FTE
- CASTOR: 0.1 FTE

System availability (short/long term)
- Disk servers: fall 2002 / 2005
- Staging: fall 2002 / 2005 (R&D required)
- CASTOR: May 2002 / ?

Special issue
- Disk servers: AMS controlled
- Staging: AMS controlled
- CASTOR: CERN & AMS controlled

Page 12: Conclusion

- CASTOR might be the best solution for the short term and for MC data storage; CERN central maintenance is one of its advantages. I would not suggest using CASTOR for AMS critical applications. One should also note that, due to the CERN budget cut, the cost/GB may change for non-CERN experiments, and priority will always be given to LHC groups.
- The disk servers solution is still too expensive to store ALL data, and it also increases the complexity of the system (even if one assumes that the same servers will be used for data processing). For the raw data and selected ESD it might be the way we will proceed.
- The staging system represents the most cost-efficient solution for the case where AMS maintains full control of the data. Over the experiment lifetime the overall cost of the staging system will be only 25% higher than that of CASTOR. R&D is required to prove the scalability of the “CASPUR system” to hundreds of Tbytes of data and to multi-server/data-mover processes.