Transcript of PPT Slides

  • 1. Tomorrow, and tomorrow, and tomorrow: the players on the curation stage. Chris Rusbridge. Presentation at OCLC

2.

  • "To-morrow, and to-morrow, and to-morrow,
  • Creeps in this petty pace from day to day,
  • To the last syllable of recorded time;
  • And all our yesterdays have lighted fools
  • The way to dusty death.
  • Out, out, brief candle!
  • Life's but a walking shadow; a poor player,
  • That struts and frets his hour upon the stage,
  • And then is heard no more: it is a tale
  • Told by an idiot, full of sound and fury,
  • Signifying nothing."
  • Shakespeare: Macbeth

3.

  • Dunsinane Hill
  • Photo by Fabrice

4. 5. 6. Contents

  • Curation and the Digital Curation Centre
  • Science and Data Citations
  • The poor players of data curation
  • Sustainability of curated data
  • Macbeth again

7. Curation

  • Data increasingly important as evidence
    • Experimental verifiability (the basis of science)
    • Unrepeatable observations & experiments (particularly environmental in broadest sense)
    • Legal, compliance & transactions
    • Cultural resources
  • Preservation view vs Publishing view

8. Lynch remarks

  • Closing the Curation Conference
  • 3 views of digital curation
    • Finite process, handover to preservation
    • Whole life process, evolving object(s)
    • Collection as a living thing

9. Digital curation?

  • Digital preservation
  • Static
  • For later use

10. Digital curation?

  • Digital preservation
  • Digital curation
  • Static
  • Dynamic
  • Long-term
  • For later use
  • In use now (and the future)

11. Digital curation

  • Digital curation & preservation
  • Static
  • Dynamic
  • Long-term
  • For later use
  • In use now (and the future)
  • maintaining and adding value to a trusted body of digital information for current and future use

12. Mission

  • The over-riding purpose of the DCC is to support and promote continuing improvement in the quality of data curation, and of associated digital preservation

13. Organisation to Engage & Collaborate

  • Industry
  • research collaborators
  • standards bodies
  • testbeds & tools
  • communities of practice: users
  • community support & outreach
  • research
  • development
  • co-ordination
  • service definition & delivery
  • management & admin support
  • Associates Network
  • curation organisations, eg DPC

14. Organisation to Engage & Collaborate: Leads

  • Industry
  • research collaborators
  • standards bodies
  • testbeds & tools
  • communities of practice: users
  • Bath
  • Edinburgh
  • CCLRC
  • Glasgow
  • Edinburgh
  • Associates Network
  • curation organisations, eg DPC

15. Associated work

  • DCC LOCKSS Technical Support Service
    • (Lots of Copies Keep Stuff Safe)
  • DCC SCARP Project
    • Disciplinary approaches to sharing, curation, re-use and preservation
  • EU projects associated
    • CASPAR
    • Digital Preservation Europe
    • PLANETS

16. Phase 2

  • Externally-moderated, reflective self-evaluation completed
  • Phase 2 proposal (2007/10) to JISC
    • Accepted: focus on science data, reduced scale
  • EPSRC-funded Research continues until 2007/8

17. 2nd International Digital Curation Conference

  • Research & invited presentations
  • Glasgow, 21/22 November, 2006
  • Please register at: http://www.dcc.ac.uk/events/dcc-2006/

18. 19. Data resource stages

  • Curated data is created
    • Observations? Fixed!
  • Or Acquired
    • Data brought/bought from outside
    • Ingest
  • Development
    • Derived, refined, combined, processed data
    • Potentially many stages

20.

  • SDSS (Visual)
  • TWOMASS (Infrared)
  • Slide from Rajendra Bose

21.

  • Slide from Rajendra Bose

22. New discovery

  • National Virtual Observatory
    • Johns Hopkins press release: Scientists working to create the NVO, an online portal for astronomical research unifying dozens of large astronomical databases, confirmed discovery of [a] new brown dwarf recently. The star emerged from a computerized search of information on millions of astronomical objects in two separate astronomical databases. Thanks to an NVO prototype, that search, formerly an endeavor requiring weeks or months of human attention, took approximately two minutes.

23. Context

  • Data meaningless without context
    • Linkage
    • Metadata of many kinds
    • Workflow!
  • Provenance
    • Computational lineage
    • Authenticity

24.

  • NASA
  • University research group1
  • research group3
  • local decision-making body
  • University research group2
  • Slide from Rajendra Bose

25. Access and re-use

  • Ethics and rights control access
    • Weak in expressing this long-term
  • Collaboration tools
    • Annotation, discussion, review
    • Re-use leading to change and development
  • Publication
    • Not just in print
    • Underlying data should be published, too
  • Citation

26. CLADDIER citation investigation

  • My last example was an MST data set held at the BADC, and I was suggesting something like this (for a citation):
  • <Citation><Author>Natural Environment Research Council</Author>
  • <Title>Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth</Title>
  • <Medium>Internet</Medium>
  • <Publisher>British Atmospheric Data Centre (BADC)</Publisher>
  • <PublicationDate status="ongoing">1990</PublicationDate>
  • <Identifier>badc.nerc.ac.uk/data/mst/v3/upd15032006</Identifier>
  • <Feature><FeatureType>http://featuretype.registry/verticalProfile</FeatureType><LocalID>200409031205</LocalID></Feature>
  • <AccessDate>Sep 21 2006</AccessDate>
  • <AvailableAt><url>http://badc.nerc.ac.uk/data/mst/v3/</url></AvailableAt>
  • </Citation>
  • (Made up tags!)
  • Bryan Lawrence Weblog
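
The citation record above can be assembled and read back programmatically. What follows is only a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The element names are the made-up tags from the slide, not a published schema, and the function name `build_citation` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_citation() -> ET.Element:
    """Build the example data citation using the slide's made-up tags."""
    citation = ET.Element("Citation")
    ET.SubElement(citation, "Author").text = "Natural Environment Research Council"
    ET.SubElement(citation, "Title").text = (
        "Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth"
    )
    ET.SubElement(citation, "Medium").text = "Internet"
    ET.SubElement(citation, "Publisher").text = (
        "British Atmospheric Data Centre (BADC)"
    )
    # The data set is still growing, hence the "ongoing" status attribute.
    ET.SubElement(citation, "PublicationDate", status="ongoing").text = "1990"
    ET.SubElement(citation, "Identifier").text = (
        "badc.nerc.ac.uk/data/mst/v3/upd15032006"
    )
    # The Feature element points at one item inside the data set.
    feature = ET.SubElement(citation, "Feature")
    ET.SubElement(feature, "FeatureType").text = (
        "http://featuretype.registry/verticalProfile"
    )
    ET.SubElement(feature, "LocalID").text = "200409031205"
    ET.SubElement(citation, "AccessDate").text = "Sep 21 2006"
    available_at = ET.SubElement(citation, "AvailableAt")
    ET.SubElement(available_at, "url").text = "http://badc.nerc.ac.uk/data/mst/v3/"
    return citation


citation = build_citation()
# Serialise the record to a string, then read individual fields back out.
xml_text = ET.tostring(citation, encoding="unicode")
identifier = citation.find("Identifier").text
local_id = citation.find("Feature/LocalID").text
print(xml_text)
```

Keeping the `Identifier` (which version of the data set) separate from the `LocalID` (which feature within it) mirrors the split the slide makes between the resource and the item cited inside it.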

27. CLADDIER 2: Version of record

  • Role of Publisher: add value
    • provision of catalogue metadata
    • some commitment to maintenance of the resource at the AvailableAt url
    • some commitment to the resource being conformant to the description of the Feature
    • some commitment to the maintenance of the mapping between the identifier [LocalID] and the resource.
  • Bryan Lawrence Weblog

28. CLADDIER 3: persistence

  • Wayback Machine
    • Only snapshots (eg only 2004 version of Bryan's home page!)
  • WebCite
    • allows the creator of content to submit URLs for [archiving], thus ensuring when one writes an academic document, the material will be archived, and the citation will be persistent
    • But no real help for data
  • only allow [data citation] when we believe in the persistence of the organisation making the data available
  • Bryan Lawrence Weblog

29. 30. Citation

  • Needs a stable resource to cite

  • OWL Web Ontology Language Reference, W3C Proposed Recommendation 15 December 2003
    • This version: http://www.w3.org/TR/2003/PR-owl-ref-20031215/
    • Latest version: http://www.w3.org/TR/owl-ref/
    • Previous version: http://www.w3.org/TR/2003/CR-owl-ref-2003081

  • (FRBR works & expressions?)

31. Citation

  • The date alone (as in common web citation approaches) is not enough!
    • Cited object likely to have changed
    • Citation should link to the cited object as it was!
  • [6] The CIA World Factbook.
  • www.cia.gov/cia/publications/factbook/.
  • Retrieved on 8 Jan 2006.
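
The problem with a date-only citation can be made concrete. The sketch below is illustrative only and is not a method proposed in the slides: it assumes a citation also stores a SHA-256 digest of the bytes that were retrieved, so a later reader can at least detect that the cited object has changed. The class `WebCitation`, the field `content_sha256`, the helper functions, and the sample page contents are all hypothetical placeholders.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WebCitation:
    title: str
    url: str
    retrieved: date
    content_sha256: str  # hypothetical extra field: digest of the cited bytes


def fingerprint(content: bytes) -> str:
    """Return a hex SHA-256 digest identifying one exact state of the object."""
    return hashlib.sha256(content).hexdigest()


def cite(title: str, url: str, retrieved: date, content: bytes) -> WebCitation:
    """Create a citation that records what was retrieved, not just when."""
    return WebCitation(title, url, retrieved, fingerprint(content))


def still_matches(citation: WebCitation, current_content: bytes) -> bool:
    """True only if the object served today is byte-identical to the cited one."""
    return citation.content_sha256 == fingerprint(current_content)


# Placeholder contents standing in for two states of the same page.
content_when_cited = b"placeholder page content, state A"
content_later = b"placeholder page content, state B"

factbook = cite(
    "The CIA World Factbook",
    "www.cia.gov/cia/publications/factbook/",
    date(2006, 1, 8),
    content_when_cited,
)

print(still_matches(factbook, content_when_cited))  # True
print(still_matches(factbook, content_later))       # False
```

A digest only reveals that the object changed; it does not give access to the object as it was, which still requires an archived copy or a versioned identifier.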

32. Citation needs

  • An efficient way to reference and access archived past states of a changing dataset (work in progress, Buneman et al)
  • Not importan