![Page 1: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/1.jpg)
AMS02 Data Volume, Staging and Archiving Issues
AMS Computing Meeting
CERN April 8, 2002
Alexei Klimentov
![Page 2: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/2.jpg)
A.Klimentov AMS Computing Meeting, CERN, Apr 2002
Outline
- AMS02 data volume
- AMS/CASPUR Technical Meeting, Mar 2002
- Projected characteristics for disks, processors and tapes
- AMS data storage issues
![Page 3: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/3.jpg)
A.Klimentov AMS/CASPUR Technical Meeting, Bologna, Mar 2002
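AMS Data Volume (Tbytes)

| Data / Year | 1998 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Raw | 0.20 | --- | --- | --- | --- | 0.5 | 15 | 15 | 15 | 0.5 | 46.2 |
| ESD | 0.30 | --- | --- | --- | --- | 1.5 | 44 | 44 | 44 | 1.5 | 135.3 |
| Tags | 0.05 | --- | --- | --- | --- | 0.1 | 0.6 | 0.6 | 0.6 | 0.1 | 2.0 |
| Data & ESD | 0.55 | --- | --- | --- | --- | 2.1 | 59.6 | 59.6 | 59.6 | 2.1 | 183.5 |
| MC | 0.11 | 1.7 | 8.0 | 8.0 | 8.0 | 8.0 | 44 | 44 | 44 | 44 | 210.4 |
| Grand total | 0.66 | 1.7 | 8.0 | 8.0 | 8.0 | 10.1 | 104 | 104 | 104 | 46.1 | ~400 |

Annotations on the slide: STS91; AMS02 on ISS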
![Page 4: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/4.jpg)
AMS/CASPUR technical meeting, Bologna, Mar 2002
Participants: V.Bindi, M.Boschini, D.Casadei, A.Contin, V.Choutko, A.Klimentov, A.Maslennikov, F.Palmonari, PG.Rancoita, PP.Ricci, C.Sbara, P.Zuccon
Topic: Archiving and staging strategy, AMS02 data volume
Goal: to propose a coherent scheme for AMS data storage in the SOC and remote center(s).
Possible solutions:
- disk servers
- staging (tapes + disks)
- outsourcing (CASTOR)
![Page 5: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/5.jpg)
Staging
- "Staging system" is a generic name for a tape-to-disk migration tool. Files are migrated to disk by the user before they are accessed. Migration of disk files to tape may be automatic or manual.
- Older staging implementations required the user to keep track of his/her tape files (old CERN staging).
- The CASPUR flavour (in production since 1997) does the tape/file bookkeeping on behalf of the user. It uses NFS and features fairly easy installation and management.
- CASTOR (CERN, project started in 1999) gives the user the option to migrate files both manually and via specially modified I/O calls from within a program. It uses a fast data transfer protocol (RFIO). It has been installed and maintained by CERN IT since 2000 and is currently used by COMPASS to store raw data and ESD; ALICE and CMS have also run I/O tests. It is currently the primary option for LHC experiments.
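The bookkeeping idea behind a staging system can be illustrated with a toy model. This is only a sketch, not CASPUR or CASTOR code: the class `StagingCatalog`, its methods, the file names and the tape labels are all hypothetical, and the least-recently-used eviction policy is an assumption made for the example.

```python
from collections import OrderedDict


class StagingCatalog:
    """Toy tape-to-disk staging catalog (hypothetical, for illustration only)."""

    def __init__(self, disk_capacity_gb):
        self.disk_capacity_gb = disk_capacity_gb
        self.tape = {}             # file name -> (tape label, size in GB)
        self.disk = OrderedDict()  # file name -> size in GB, least recently used first

    def archive(self, name, size_gb, tape_label):
        """Record on which tape a file lives (bookkeeping done for the user)."""
        self.tape[name] = (tape_label, size_gb)

    def stage_in(self, name):
        """Make a file available on disk, evicting old staged copies if needed."""
        if name in self.disk:
            self.disk.move_to_end(name)  # already staged: mark as recently used
            return "disk hit"
        tape_label, size_gb = self.tape[name]  # KeyError if never archived
        if size_gb > self.disk_capacity_gb:
            raise ValueError(f"{name} does not fit in the disk pool")
        # ASSUMPTION: least-recently-used eviction; the tape copy always remains.
        while sum(self.disk.values()) + size_gb > self.disk_capacity_gb:
            self.disk.popitem(last=False)
        self.disk[name] = size_gb
        return f"staged from {tape_label}"


catalog = StagingCatalog(disk_capacity_gb=2)
catalog.archive("run001.raw", 1, "LTO-0001")
catalog.archive("run002.raw", 1, "LTO-0001")
print(catalog.stage_in("run001.raw"))
print(catalog.stage_in("run001.raw"))
```

The user asks for a file by name only; the catalog decides whether a tape mount is needed, which is the difference between the older scheme and the CASPUR-style bookkeeping described above.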
![Page 6: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/6.jpg)
Projected characteristics for disks, processors and tapes

| Components | 1998 | 2002 | 2006 |
|---|---|---|---|
| Intel/AMD PC | Dual-CPU Intel PII, rated at 450 MHz, 512 MB RAM; 7.5 kUS$ | Dual-CPU Intel, rated at 2.2 GHz, 1 GB RAM, SCSI and IDE RAID controllers; 7 kUS$ | Dual-CPU, rated at 8 GHz, 2 GB RAM, IDE RAID controller; 5 kUS$ |
| Magnetic disk | 18 GByte SCSI: 80 US$/GByte | SG 180 GByte SCSI: 10 US$/GByte; WD 120 GByte IDE: 2 US$/GByte; IDE-FC: 5.5 US$/GByte | SCSI 700 GByte: 2 US$/GByte; IDE 800 GByte: 0.6 US$/GByte; IDE-FC: 1.3 US$/GByte |
| Magnetic tape | DLT, 40 GB compressed: 3 US$/GByte | SDLT and LTO, 200 GB compressed: 0.8 US$/GByte | ?, 600 GB compressed: 0.3 US$/GByte |
![Page 7: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/7.jpg)
AMS staging and archiving system: requirements and considerations
- Storage strategy might be different for raw, ESD and MC data.
- All data must be archived. At least two copies of raw and ESD are required.
- I believe that data must be under the control of the AMS collaboration.
- The archiving system should be scalable and independent of the HW technology.
- Data volume: 2002 - 8 TB; 2008 - 500 TB
- Throughput: 2 TB/day (23 MB/sec)
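The throughput requirement is a plain unit conversion, shown here as a short check. The function name is illustrative, and decimal units (1 TB = 10^6 MB) are assumed since the slides do not say which convention is used.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def tb_per_day_to_mb_per_sec(tb_per_day):
    """Convert a daily volume in TB to a sustained rate in MB/s (decimal units)."""
    return tb_per_day * 1e6 / SECONDS_PER_DAY


rate = tb_per_day_to_mb_per_sec(2.0)
print(f"{rate:.1f} MB/s")  # prints "23.1 MB/s"
```

This is the average sustained rate; it says nothing about peak rates during reprocessing or tape recalls.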
![Page 8: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/8.jpg)
Cost estimation (I): Disk servers
2002-2005
- 8 TB/year
- RAID5, 2.1 TB/server
- 3-4 servers/year
- 23.3 kUS$/server (2002)
- 50% disk price drop per year, migration to IDE disk systems
- 197 kUS$ total, 6.2 US$/GByte
2006 and beyond
- 100 TB/year
- RAID5, 5.6 TB/server
- 10-15 servers/year
- 8.8 kUS$/server (2006)
- 50% disk price drop per year
- 411 kUS$ total, 1.4 US$/GByte
![Page 9: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/9.jpg)
Cost estimation (II): Staging
2002-2005
- 8 TB/year
- LTO library: 58 kUS$
- 2 servers: 10 kUS$
- 20-40 cartridges/year
- FC switch: 15 kUS$
- 0.8 TB disks/year
- 30% IDE-FC disk price drop per year
- 111 kUS$ total, 3.5 US$/GByte
2006 and beyond
- 100 TB/year
- LTO library, biennial
- 2 servers/year
- 150-250 cartridges/year
- FC switch: 15 kUS$
- 10 TB disks/year
- 30% IDE-FC disk price drop per year
- 300 kUS$ total, 1 US$/GByte
![Page 10: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/10.jpg)
Cost estimation (III): CASTOR
2002-2005
- 8 TB/year
- 1.8 US$/GByte
- 57.6 kUS$ total
2006 and beyond
- 100 TB/year
- 0.8 US$/GByte
- 240 kUS$ total
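The per-GByte figures in the three cost estimates follow from dividing each total by the stored volume. A minimal check, assuming 32 TB for 2002-2005 (8 TB/year over four years) and about 300 TB for "2006 and beyond"; the latter is an assumption (three years at 100 TB/year) that is not stated on the slides but is consistent with the quoted totals. Function and variable names are illustrative.

```python
GB_PER_TB = 1000


def cost_per_gb(total_kusd, volume_tb):
    """Return US$/GByte for a total cost in kUS$ and a volume in TB."""
    return total_kusd * 1000 / (volume_tb * GB_PER_TB)


VOLUME_2002_2005_TB = 8 * 4  # 8 TB/year over 2002-2005
VOLUME_2006_ON_TB = 100 * 3  # ASSUMPTION: three years at 100 TB/year

# (total for 2002-2005, total for 2006 and beyond), in kUS$, from the slides
totals_kusd = {
    "disk servers": (197, 411),
    "staging": (111, 300),
    "CASTOR": (57.6, 240),
}

for name, (early, late) in totals_kusd.items():
    early_rate = cost_per_gb(early, VOLUME_2002_2005_TB)
    late_rate = cost_per_gb(late, VOLUME_2006_ON_TB)
    print(f"{name}: {early_rate:.2f} / {late_rate:.2f} US$/GByte")
```

Under these assumptions the results round to the quoted 6.2/1.4, 3.5/1 and 1.8/0.8 US$/GByte.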
![Page 11: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/11.jpg)
Storage Solution (Summary)

| | Disk servers | Staging | CASTOR |
|---|---|---|---|
| System complexity | High: 50 servers, 0.5 PB online | Medium: 8 servers, 0.05 PB online | Low |
| Cost, kUS$ (2002/2006) | 197/411 | 111/300 | 57.6/240 |
| Data access | Real-time | 5-10 min delay | 10-20 min delay |
| Manpower | 0.5 FTE | 0.5 FTE | 0.1 FTE |
| System availability (short/long term) | fall 2002 / 2005 | fall 2002 / 2005 (R&D req) | May 2002 / ? |
| Special issue | AMS controlled | AMS controlled | CERN & AMS controlled |
![Page 12: AMS02 Data Volume, Staging and Archiving Issues AMS Computing Meeting CERN April 8, 2002 Alexei Klimentov.](https://reader036.fdocuments.in/reader036/viewer/2022090107/5a4d1bf97f8b9ab0599eb06b/html5/thumbnails/12.jpg)
Conclusion
- CASTOR might be the best solution for the short term and for MC data storage; CERN central maintenance is one of its advantages. I would not suggest using CASTOR for AMS critical applications, and one should note that, due to the CERN budget cut, the cost/GB may change for non-CERN experiments and priority will always be given to LHC groups.
- The disk-server solution is still too expensive for storing ALL data, and it also increases the complexity of the system (even if one assumes that the same servers will be used for data processing). For the raw data and selected ESD it might be the way we proceed.
- The staging system is the most cost-efficient solution for the case where AMS maintains full control of the data. Over the experiment lifetime the overall cost of the staging system will be only 25% higher than that of CASTOR. R&D is required to prove the scalability of the "CASPUR system" to hundreds of Tbytes of data and to multi-server/data-mover processes.
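The staging-versus-CASTOR comparison can be checked against the totals quoted in the cost estimates (111 and 300 kUS$ for staging, 57.6 and 240 kUS$ for CASTOR). The slides do not say which period the 25% refers to, so this sketch computes both the ratio for "2006 and beyond" and the ratio of the summed totals; the function name is illustrative.

```python
def percent_higher(cost, reference):
    """Return by how many percent `cost` exceeds `reference`."""
    return (cost / reference - 1) * 100


staging_kusd = {"2002-2005": 111, "2006 and beyond": 300}
castor_kusd = {"2002-2005": 57.6, "2006 and beyond": 240}

late = percent_higher(staging_kusd["2006 and beyond"],
                      castor_kusd["2006 and beyond"])
overall = percent_higher(sum(staging_kusd.values()),
                         sum(castor_kusd.values()))

print(f"2006 and beyond: {late:.0f}% higher")  # prints "2006 and beyond: 25% higher"
print(f"both periods:    {overall:.0f}% higher")  # prints "both periods:    38% higher"
```

The 25% figure matches the "2006 and beyond" totals, which dominate the lifetime cost; summing both periods as quoted gives a somewhat larger difference.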