[email protected] US CMS Data Preservation Discussion, 16 February 2012
CMS Data Preservation Policy
On behalf of the Data preservation working group
Active Members: Paoti Chang, David Colling, Andreas Heiss, Kati Lassila-Perini (chair), Sudhir Malik, Patty McBride, Jesus Marco, Liz Sexton-Kennedy, Lucas Taylor, Roberto Tenchini (CMS)
Sunje Dallmeier-Tiessen, Salvatore Mele (CERN INSPIRE/Open Access)
Why are we here today?
• The CMS Collaboration Board in February 2011 set the following mandate: “To produce the CMS data preservation and access policy and an implementation plan, in coordination with CERN and other LHC experiments, to be approved by the Collaboration”
• Subsequently a task force was created for this purpose
• The task force has held 10 meetings to prepare the above policy and plan, and has made 2 presentations to CMS
• Many of you have provided excellent input and raised genuine concerns
• After several iterations taking this input and these concerns into account, we have a latest draft of the policy, which is planned to be presented at the CMS CB meeting on March 2, 2012
• Today we are here to discuss this policy as a USCMS community and seek agreement
Why is this important to (US)CMS?
• Requests from our funding agencies for open access to publicly funded data are becoming frequent
• We are not “special” - our hard-earned funding is in competition with other fields which already offer open access to their data
• Climate, astrophysics, molecular biology, ...
• Funding agencies may want more value for their money and ask for more than we may be ready to give
• We now have a unique opportunity to define a policy before one is imposed on us
• Review our modus operandi and adapt our practices while there is still time
• Put ourselves in a position to provide a unique response to present and future requests
Who benefits the most?
• The main beneficiary of the data preservation effort is the CMS collaboration itself (although this is outside the scope of the policy)
• Use the opportunity of providing open access as the driving force to preserve data for our own possible future needs, and to preserve know-how
• Data preservation activities would be a natural long-term extension to the current CMS way of operation
• In other sciences, where open access has been provided, additional resources have been attracted
What is being proposed?
• We are proposing a policy for approval
• This policy is a commitment from the CMS collaboration to preserve the data and to provide access to a part of it, as defined in the policy
• Only after the policy is approved can we have an appropriate structure in place to address the technical issues
Current CMS practices
• In general, the CMS way of operation is already aligned with long-term data preservation and re-use
• CMS is already doing a lot - AOD reprocessing, raw data compatibility for new software versions, public open-access results, software and documentation publicly available, outreach...
• These practices need to be extended from the current short time-scale to ensure the long-term usability of the data
• There are many details which need to be taken care of with appropriate resources, and some new habits need to be adopted now to ensure the preservation of the low-energy and low-occupancy data.
Salient features of the policy
Level 1 - publications in open access journals, supporting documents and additional numerical data
• made available at the time of publication
Level 2 - simplified data formats for theory interpretations, limited analysis, education, outreach
• samples released promptly as determined by the CB
Level 3 - reconstructed data and simulations, together with the software, analysis workflows and documentation
• samples released yearly during the long LHC machine shutdown periods, and at best effort during LHC running
• after an embargo period of 3 years, limited to max 50% of the amount of data (in integrated luminosity) which is available to the collaboration
• a first stress-test exercise with a public release of a part of 2010 data in 2013, after which the experience is reviewed. In the absence of unexpected overhead, the public data release can be accepted as a standard procedure
Level 4 - raw data and the software and documentation needed to access, reconstruct and analyze them
• CMS will decide whether to extend the public access to Level-4 data after the experience of the first public release of the Level-3 data has been reviewed and evaluated.
CMS data use outside the CMS collaboration
By Whom
• CMS Associates - As defined in the CMS Constitution
• Experimental physicists, Theorists, Other scientists, Education, Outreach, Citizen scientists
How
• Cite public CMS data
• Re-use the CMS data (responsibility of the final user)
• Data released under Creative Commons CC0 waiver
Useful Links
• Links to past meetings and presentations
• Links to the preparatory meetings from here:
https://indico.cern.ch/search.py?categId=0&p=data+preservation+EVO&f=&collections=&startDate=&endDate=&sortField=&sortOrder=d
• Links to presentations to the CMS
• At the CMS week (Brussels, Sep 2011)
https://indico.cern.ch/conferenceDisplay.py?confId=152422 (see under “Data preservation - Progress report”)
• At the CMS CB (open to all CMS) (30 Nov, 2011)
https://indico.cern.ch/conferenceDisplay.py?confId=162331
• The latest draft of the policy is attached to the agenda of this meeting
• Answers to some pertinent questions are also attached to the agenda of this meeting
Over to the discussion