The DataTransfer status Experience on VSR2 A. Bozzi, L. Salconi – 27 Oct 2009.
The new software procedures 1/2
We implemented a simple, robust replica manager architecture.
An automatic system that:
- scans for new DAQ files and builds metadata for them (based on the FrDump output);
- keeps track of those files and orders them in multiple queues (one for each kind of file);
- prepares the data transfer sessions (built on static configuration parameters) and starts them, one session for each data flow;
- checks the output status of each session and acts on it (different actions are performed for a successful and for a failed data transfer);
- schedules a retry for a failed transfer session;
- keeps track of all scheduled operations (successful or failed);
- builds a metadata structure for each file (a raw ffl entry).
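The queue-and-retry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the actual procedure: the names `FileEntry`, `ReplicaQueues`, `run_once` and the `max_attempts` limit are hypothetical, and the metadata is taken from the file name alone rather than from the FrDump output.

```python
import re
from collections import defaultdict, deque
from dataclasses import dataclass

# Frame file names such as V-raw-932135940-180.gwf encode
# <site>-<kind>-<GPS start>-<duration in seconds>.gwf
NAME_RE = re.compile(r"^(?P<site>[A-Z]+)-(?P<kind>\w+)-(?P<gps>\d+)-(?P<dur>\d+)\.gwf$")


@dataclass
class FileEntry:
    """Hypothetical per-file metadata record."""
    name: str
    kind: str
    gps_start: int
    duration: int
    attempts: int = 0


def parse_name(name):
    """Build a FileEntry from the file name (stand-in for the FrDump step)."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name}")
    return FileEntry(name, m["kind"], int(m["gps"]), int(m["dur"]))


class ReplicaQueues:
    """One FIFO queue per kind of file, with a simple retry on failure."""

    def __init__(self, max_attempts=3):  # max_attempts is an assumption
        self.queues = defaultdict(deque)
        self.history = []  # every operation, successful or failed
        self.max_attempts = max_attempts

    def add(self, name):
        entry = parse_name(name)
        self.queues[entry.kind].append(entry)
        return entry

    def run_once(self, kind, send):
        """Send the oldest queued file of this kind; requeue it on failure."""
        if not self.queues[kind]:
            return None
        entry = self.queues[kind].popleft()
        entry.attempts += 1
        ok = bool(send(entry))
        self.history.append((entry.name, entry.attempts, ok))
        if not ok and entry.attempts < self.max_attempts:
            self.queues[kind].append(entry)  # schedule a retry
        return ok
```

Here `send` is any callable that returns a truthy value on success, so the same queue logic can serve every data flow.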
The new software procedures 2/2
… and it also:
- has the same architecture and a similar topology for each data flow: only the sendFile class changes, so we have a few primitives that wrap the bbftp and SRB commands (and, why not, gridFTP in the future);
- has a network “star configuration” (from Cascina to the CCs, with 8 independent flows);
- collects information on closed sessions by parsing only the local log and the output of the performed operations, in order to find the status of the transferred files;
- builds the remote ffl locally, combining the FrDump output obtained on the local file with the static information about the remote destination directory;
- organizes the data path in the same way in all repositories, so that the same script can be used to search for missing files or errors.
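The "only the sendFile class changes" idea can be sketched as a small class hierarchy. The class names, constructor arguments, host and directory names below are hypothetical, and the `bbftp` and `Sput` command lines are illustrative assumptions that should be checked against the installed clients.

```python
import os
import subprocess
from abc import ABC, abstractmethod


class SendFile(ABC):
    """Common part of every data flow; subclasses only build the command."""

    def __init__(self, flow, remote_dir):
        self.flow = flow              # e.g. "raw2bo" or "raw2ly"
        self.remote_dir = remote_dir  # static configuration parameter

    @abstractmethod
    def command(self, local_path):
        """Return the argument list of the transfer command."""

    def send(self, local_path, runner=subprocess.run):
        """Run the transfer and report success from the exit status."""
        result = runner(self.command(local_path))
        return result.returncode == 0


class BbftpSend(SendFile):
    def __init__(self, flow, remote_dir, host, user):
        super().__init__(flow, remote_dir)
        self.host = host
        self.user = user

    def command(self, local_path):
        # Assumed invocation: bbftp -u <user> -e "put <src> <dst>" <host>
        name = os.path.basename(local_path)
        put = f"put {local_path} {self.remote_dir}/{name}"
        return ["bbftp", "-u", self.user, "-e", put, self.host]


class SrbSend(SendFile):
    def command(self, local_path):
        # Assumed invocation: Sput <local file> <target collection>
        return ["Sput", local_path, self.remote_dir]
```

Passing `runner` as a parameter keeps the wrappers testable without a network, and a future gridFTP flow would only need one more subclass with its own `command` method.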
The Cascina – Bologna – Lyon star architecture
[Diagram labels: Lyon, Bologna; datagw.virgo.infn.it; Procdata vols; Rawdata circular buffer; SRB; bbftp]
The LIGO data interface (using LDR)
[Diagram labels: LIGO, Lyon, Bologna; dataldr.virgo.infn.it; datagw.virgo.infn.it; LIGO vols (RW); Procdata vols (RO); SRB; bbftp]
The achieved performance
2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:45:28,563 INFO SRBEngine: [raw2ly] sending file V-raw-932135940-180.gwf
2009-07-20 16:45:31,314 INFO BBEngine: [raw2bo] sending file V-raw-932135940-180.gwf
2009-07-20 16:46:32,715 INFO BBEngine: [raw2bo] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl ./ffl/raw2bo.ffl
2009-07-20 16:46:57,363 INFO SRBEngine: [raw2ly] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl ./ffl/raw2ly.ffl

fflGen.pl [Mon Jul 20 16:45:27 2009] -> file to insert V-raw-932135940-180.gwf on st4rear::v081
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataSend
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataBackup
fflGen.pl [Mon Jul 20 16:45:27 2009] -> generate a new ffl file...
fflGen.pl [Mon Jul 20 16:45:34 2009] -> ...public ffl file updated with 87962 records

An example with a VSR2 rawdata file (V-raw-932135940-180.gwf):
→ available in Cascina to users (circular buffer) at 16:45:27
→ published in the local ffl in Cascina at 16:45:34
→ available in Bologna (published with ffl) at 16:46:36 (1'09")
→ available in Lyon (published with ffl) at 16:47:00 (1'33")
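The two delays quoted above follow directly from the log timestamps. A minimal sketch of that arithmetic, using three of the log lines shown on this slide (the helper names `timestamp`, `delays` and `fmt` are hypothetical, and milliseconds are dropped because the slide quotes whole seconds):

```python
from datetime import datetime

LOG = """\
2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl ./ffl/raw2bo.ffl
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl ./ffl/raw2ly.ffl
"""


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' part of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def delays(log):
    """Seconds between the first log line and each 'sent updated ffl' line."""
    lines = log.splitlines()
    start = timestamp(lines[0])
    result = {}
    for line in lines[1:]:
        if "sent updated ffl" in line:
            flow = line.split("[")[1].split("]")[0]  # e.g. "raw2bo"
            result[flow] = int((timestamp(line) - start).total_seconds())
    return result


def fmt(seconds):
    """Format a delay as minutes'seconds", as on the slide."""
    return f"{seconds // 60}'{seconds % 60:02d}\""


for flow, seconds in delays(LOG).items():
    print(flow, fmt(seconds))  # raw2bo 1'09" and raw2ly 1'33"
```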
The amount of data sent to CCs
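(27 Oct 09 – 10:00am)

| Data set | Bologna: n. files | Bologna: space used (TB) | Lyon: n. files | Lyon: space used (TB) |
|---|---|---|---|---|
| raw (931-933) | 16605 | 30 | 16605 | 30 |
| raw (934-936) | 16667 | 31 | 16667 | 31 |
| raw (937-939) | 16666 | 32 | 16666 | 32 |
| raw (940-now) | 3720 | 6.6 | 3720 | 6.6 |
| proc | 2401 | 1.3 | 2401 | 1.3 |
| LIGO (S6/H1) | 528 | 1.1 | 528 | 1.1 |
| LIGO (S6/L1) | 466 | 0.9 | 466 | 0.9 |
| Total | 57053 | 102.9 | 57053 | 102.9 |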
Here is the amount of data sent to the remote CCs so far (27 Oct '09 at 10:00am):
- from the logs we see that we are in a “just in time” situation for about 93% of the data transfer activity (that is, a delay of about 2 minutes between the publication of a file in Cascina and the availability of its replica at the remote CCs);
- so far only 3 files (2 raw and 1 proc, out of a total of about 53000 files) were missed from the sent list, due to exceptions not handled by the procedure. The problems were fixed manually.
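Because the data path is organized identically in every repository, a missed file can be found by comparing file lists. The sketch below is only an illustration of that idea: the helper names are hypothetical, the example paths and the second file name are made up, and it assumes an ffl with one entry per line whose first whitespace-separated field is the file path.

```python
def ffl_names(ffl_text):
    """Return the set of file base names listed in an ffl text.

    Assumption: one entry per line, file path in the first field.
    """
    names = set()
    for line in ffl_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0].rsplit("/", 1)[-1])
    return names


def missing_files(local_ffl, remote_ffl):
    """Files published locally that do not appear in the remote list."""
    return sorted(ffl_names(local_ffl) - ffl_names(remote_ffl))


# Hypothetical example lists (paths and trailing fields are illustrative).
local = """\
/local/raw/V-raw-932135940-180.gwf 932135940 180 0 0
/local/raw/V-raw-932136120-180.gwf 932136120 180 0 0
"""
remote = """\
/remote/raw/V-raw-932135940-180.gwf 932135940 180 0 0
"""

print(missing_files(local, remote))
```

Comparing base names rather than full paths keeps the check independent of the site-specific directory prefix.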
Conclusions
We achieved a good level of performance for all 8 active independent data flows (rawdata, hreconline, LIGO H1, LIGO L1, each from Cascina to Bologna and Lyon).
No particular problems were detected in Bologna: only two files are missing from the list.
Some problems were detected in Lyon: one is a missing file, all the others are related to the SRB interface:
- the Sput command sometimes locks up (a manual procedure is needed to unlock it);
- good FFL files were transferred to Lyon, but they turned out to have zero length at the destination;
- sometimes we lose the synchronization between the SRB/xrootd layer and the HPSS layer (e.g. with the Smv command).
We received good support from the Lyon SRB service team on these problems.