UT Research Data Repository Chris Jordan UT Research Cyberinfrastructure Storage Committee Chair.
The Digital Preservation Network at UT Austin Chris Jordan Texas Advanced Computing Center.
The Digital Preservation Network at UT Austin
Chris Jordan
Texas Advanced Computing Center
What Is DPN?
[Diagram: DPN Members and DPN Member Repositories]
57 member organizations cooperatively investing in long-term, scalable, digital preservation
What Is DPN?
[Diagram: DPN Members and DPN Member Repositories connected to Preservation Systems]
technical staff and systems from 5 large-scale preservation repositories
What Is DPN?
[Diagram: DPN Members and DPN Member Repositories connected to Preservation Systems]
…working groups of experts in succession rights, business services, communications and research data…
What Is DPN?
[Diagram: DPN Nodes]
All building a digital preservation backbone for the academy
What Does DPN Do?
1. Establishes a network of heterogeneous, interoperable, trustworthy preservation repositories (Nodes)
2. Replicates content across the network, to multiple nodes
3. Enables restoration of preserved content to any node in the event of data loss, corruption or disaster
4. Ensures the ongoing preservation of digital information from depositors in the event of dissolution or divestment of depositors or an individual repository
5. Provides the option to (technically and legally) "brighten content" preserved in the network over time
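As a rough illustration of points 2 and 3, the sketch below replicates a piece of content to several nodes and restores it from whichever replica still passes a checksum test. It is not DPN's implementation: the in-memory dictionaries standing in for nodes, the node names, and the use of a SHA-256 digest as the object identifier are all assumptions made for the example.

```python
import hashlib
from typing import Dict

# Hypothetical stand-in for a set of nodes: node name -> {object_id: bytes}.
Nodes = Dict[str, Dict[str, bytes]]


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def replicate(data: bytes, nodes: Nodes, copies: int = 3) -> str:
    """Store data on the first `copies` nodes and return its digest as the object ID."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    object_id = sha256_hex(data)
    for name in list(nodes)[:copies]:
        nodes[name][object_id] = data
    return object_id


def restore(object_id: str, nodes: Nodes) -> bytes:
    """Return the first replica whose checksum still matches the object ID."""
    for store in nodes.values():
        candidate = store.get(object_id)
        if candidate is not None and sha256_hex(candidate) == object_id:
            return candidate
    raise LookupError(f"no intact replica found for {object_id}")


# Illustrative node names only; they do not refer to real DPN nodes.
nodes: Nodes = {"node-a": {}, "node-b": {}, "node-c": {}}
oid = replicate(b"example bag contents", nodes, copies=3)

# Simulate corruption on one node, then restore from a surviving replica.
nodes["node-a"][oid] = b"corrupted"
restored = restore(oid, nodes)
```

Using the digest as the identifier keeps the example short, because any replica can be checked without consulting a separate record. A real network would more plausibly keep identifiers and checksums in a registry.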
Initial DPN technical partners
Initial DPN launch will feature five nodes:
• Academic Preservation Trust (APTrust)
• Chronopolis
• HathiTrust
• Stanford Digital Repository (SDR)
• University of Texas Data Repository (UTDR)
And a participating partner:
• DuraSpace
DPN, UT and TDL
• TACC & TDL have an established partnership
• TACC also collaborates with UT Library on:
– Data Management Planning
– Local research support
– HPC for Digital Libraries
• DPN extends these efforts to include design and implementation of a replicating node
What is UTDR?
• UT Research Cyberinfrastructure Initiative
• Supports all 15 UT System Schools with:
– High Performance Computing
– 10Gb Research Network
– 5PB Replicated Data Repository
• Deployed in early 2012, now over 100 investigators, 100s of users, over 1PB allocated
TACC Capabilities
• Corral UTDR System – 5PB, geographically replicated online storage
• iRODS Data Management, Databases, Web applications
• Ranch – 100PB+ Tape Archive capacity
• Additional data-intensive systems this year
• Stampede/Lonestar/Longhorn – World-Class Supercomputing and Visualization
DPN Network Concepts
• “First Nodes” submit data packages
• “Replicating Nodes” hold copies of data
• Messaging framework and Registry track data submissions and replicas
• “Bags” are used to package data for preservation – contents are opaque to DPN
• Each node provides its own interfaces
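To make the "bag" concept concrete, here is a minimal sketch of packaging files in a BagIt-style layout: a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-sha256.txt` of checksums. It uses only the Python standard library. The function names, the flat payload, the BagIt version string, and the choice of SHA-256 are assumptions for illustration, not the DPN bag profile.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def make_bag(bag_dir: Path, payload: Dict[str, bytes]) -> Path:
    """Write payload files under data/ plus a bag declaration and SHA-256 manifest.

    Payload keys are assumed to be plain file names (no subdirectories).
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    # Version string is an assumption; use whatever the target profile requires.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "example-bag", {"report.txt": b"hello"})
    bag_ok = verify_bag(bag)
    # Tamper with the payload to show that verification catches the change.
    (bag / "data" / "report.txt").write_bytes(b"tampered")
    tampered_ok = verify_bag(bag)
```

Because verification needs only the manifest and the payload bytes, a replicating node could check a bag's integrity without interpreting its contents, which fits the statement that bag contents are opaque to DPN.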
DPN Design Principles
• Nodes should be as independent as possible
• Content owners should have control over format of data
• Network should be flexible – easy to add and remove nodes
• Diversity of implementation is crucial to successful long-term preservation
Components in Technical Architecture
• Messaging infrastructure to support federated services
• Registry to track objects within the federation, including copies, version, rights, brightening information
• Transfer mechanisms (rsync, https, gridFTP, etc.)
• Private PKI for securing transport layers
• Logging and reporting
• Other components we implement separately, but may be common, for example a secure transfer area
• DPN objects that hold administrative content such as DPN framework agreements, DPN bagit profiles, versioned Brightening information for a collection/repository
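The registry component can be pictured as a small record per object. The sketch below is a hypothetical in-memory model, not the DPN registry schema: the class names, field names, and the rule that the first node's own copy is not counted among the replicas are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class RegistryEntry:
    """One tracked object: where it came from and which nodes hold copies."""
    object_id: str
    first_node: str
    version: int = 1
    rights: str = ""
    brightening: str = ""
    replicas: Set[str] = field(default_factory=set)


class Registry:
    """Hypothetical in-memory registry keyed by object ID."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def record_submission(self, object_id: str, first_node: str,
                          rights: str = "") -> RegistryEntry:
        entry = RegistryEntry(object_id, first_node, rights=rights)
        self._entries[object_id] = entry
        return entry

    def record_replica(self, object_id: str, node: str) -> None:
        self._entries[object_id].replicas.add(node)

    def under_replicated(self, min_copies: int) -> List[str]:
        """Return IDs of objects held by fewer than min_copies replicating nodes."""
        return sorted(
            oid for oid, entry in self._entries.items()
            if len(entry.replicas) < min_copies
        )


# Illustrative identifiers and node names only.
registry = Registry()
registry.record_submission("bag-001", "first-node", rights="dark")
registry.record_replica("bag-001", "node-a")
registry.record_replica("bag-001", "node-b")
registry.record_submission("bag-002", "first-node")
registry.record_replica("bag-002", "node-a")

needs_copies = registry.under_replicated(min_copies=2)
```

A query like `under_replicated` is one way a registry could drive further replication. In a federated setting the same records would presumably be updated through the messaging infrastructure rather than by direct method calls.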
TDL and DPN
• In DPN terms, TDL is a content provider and “first node”
• TDL retains primary responsibility for data
• DPN provides a backup function for institutional, technical, or other failures
• TACC provides storage for both TDL and DPN
– Data packages will be separate
– Content packaging will be different
UT DPN Implementation
• UT Library, TACC have significant presence in DPN leadership teams
• Participation in technical, sustainability, other DPN working groups
• Library will provide interfaces to TDL and other local repositories
• TACC will provide back-end storage and other implementation components
Other Repositories and DPN
• DPN is effectively a “dark archive”
• Repositories still must have their own solutions for access/data management/etc.
• But DPN can provide preservation functions
• If you are a DPN member and can generate “bags” you can deposit data into DPN
• Many institutions are already DPN members
• Membership is open but fee-based
The DPN Technical Team
APTrust: Scott Turnbull, Tim Sigmon, Adam Soroka
Chronopolis: David Minor, Mike Smorul, Don Sutton
DuraSpace: Andrew Woods
HathiTrust: Sebastien Korner, Bryan Hockey
Stanford: Tom Cramer, James Simon
Texas Data Repository: Ladd Hanson, Christopher Jordan