Two little talks CrossRef Membership Meeting November, 2004.
Two little talks
CrossRef Membership Meeting, November 2004
* Appropriate copy issue
* Some ruminations on digital preservation
Appropriate copy issue…
Talk One
A reminder
The “appropriate copy” problem is about which copy a user is directed to
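DOI resolution
[Diagram: Step 1: a search response in any old system shows a citation with a DOI. Step 2: the user clicks, and the DOI goes to the DOI resolver, which returns a URL. Step 3: the URL leads to the cited article in the repository.]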
But – what if more than 1 copy exists?
• Elsevier journals, for example, are online at:
  – Elsevier ScienceDirect
  – OhioLink
  – University of Toronto
Which URL?
[Diagram: the DOI resolver receives a DOI and must pick a URL: sciencedirect.com? ohiolink.edu? utoronto.ca?]
The APPROPRIATE copy
When more than 1 copy exists, specific populations frequently have the right to access specific copies
DOI localization
• Architecture created by CrossRef, CNRI, some publishers, and a group of digital librarians
• Implemented in 2002
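Localization architecture
[Diagram: Step 1: a search response in any old system shows a citation with a DOI. Step 2: the user clicks, and the DOI goes to the DOI proxy, which asks: does the user have localization? Y: redirect resolution to the local link server for local decision making. N: the DOI goes to the DOI resolver.]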
Local link servers
• Directs user based on local business arrangements
• Can provide rich services
  – the right digital copy, a paper copy, other works by the author…
• Also provides a place in the architecture to insert proxies for off-campus users
• Now widely implemented and heavily used
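The Y/N decision in the localization architecture can be sketched in a few lines. This is only an illustration under stated assumptions: the `User` type, the `DEFAULT_URLS` table, the example DOI, the example hostnames, and the `id=doi:...` query parameter are all hypothetical, and the real DOI proxy and link servers may work quite differently.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Hypothetical registry: DOI -> the single URL registered for it.
DEFAULT_URLS = {"10.1000/example.1": "https://publisher.example/article/1"}


@dataclass
class User:
    # Base URL of the user's local link server, if the proxy knows one
    # (hypothetical; how the proxy learns this is not covered here).
    link_server: Optional[str] = None


def resolve(doi: str, user: User) -> str:
    """Return the URL the proxy would redirect this user to."""
    if user.link_server:
        # Y branch: redirect to the local link server, which makes the
        # local decision about the appropriate copy.
        return f"{user.link_server}?{urlencode({'id': 'doi:' + doi})}"
    # N branch: ordinary DOI resolution to the registered URL.
    return DEFAULT_URLS[doi]


if __name__ == "__main__":
    doi = "10.1000/example.1"
    print(resolve(doi, User()))
    print(resolve(doi, User("https://resolver.library.example/sfx")))
```

The point of the design is that the proxy does not need to know about local business arrangements; it only needs to know whether to hand the decision off.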
Local link serving is VERY popular
[Chart: SFX requests per month, 2004. X-axis: months 1 to 10. Y-axis: requests, 0 to 60,000.]
A new concern
(and CrossSearch…)
Google – what happened to the DOI?
Most journal article links look like this!
Viewed 40 CrossSearch results pages to find a DOI…
The problem…
I clicked this
and got…
But Harvard subscribes!
Frustration!
Just as we’ve gotten local linking to work with A&I services, journal references, and the DOI in general…
publishers are filling Google with direct links to their copies!!
Talk 2
Some ruminations on digital preservation
Role of publishers in digital preservation?
After years of talk, this remains murky, very murky…
but it is certain that “none” is not the answer!!
1. My most important point
Cost and effectiveness of preservation are determined at or near the point of creation
Think up front
* about format
* about metadata
* about quality
Format
Formats vary significantly in “preservability”
Format
• Some criteria (from Library of Congress)
  – disclosure (how well documented?)
  – adoption (how widely used?)
  – transparency (is compression used?)
  – self documenting (good!)
  – external dependencies (self sufficiency is good)
  – patents (could limit preservation actions)
  – encryption (what if decryption key is not available?)
Different formats for different purposes
* archival master
* production master
* use copy
Metadata
• The basis of decision-making for preservation
  – technical metadata
    • what format is this in?
    • what format options are used?
  – structural
    • if I change this, what else is affected?
  – administrative
    • who has the right to make decisions about this?
Metadata
  – relationships
    • are there other versions of this object?
    • how do these affect my preservation strategy?
  – provenance
    • where did this come from?
    • what changes has it already undergone?
Key difference between preservation repositories and content management systems
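The five kinds of metadata on these slides can be pictured as one record per object. The sketch below is illustrative only: the `PreservationRecord` class, its field names, and the two helper functions are hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PreservationRecord:
    object_id: str
    # technical: what format is this in, and with what options?
    format_name: str = ""
    format_options: Dict[str, str] = field(default_factory=dict)
    # structural: which other objects does this one depend on?
    depends_on: List[str] = field(default_factory=list)
    # administrative: who has the right to make decisions about this?
    decision_maker: str = ""
    # relationships: are there other versions of this object?
    other_versions: List[str] = field(default_factory=list)
    # provenance: where did it come from, and what has been done to it?
    source: str = ""
    changes: List[str] = field(default_factory=list)


def affected_by_change(records: List[PreservationRecord], object_id: str) -> List[str]:
    """Structural question: if I change this object, what else is affected?"""
    return [r.object_id for r in records if object_id in r.depends_on]


def unanswered(rec: PreservationRecord) -> List[str]:
    """List the metadata categories for which the record holds no answer."""
    gaps = []
    if not rec.format_name:
        gaps.append("technical")
    if not rec.decision_maker:
        gaps.append("administrative")
    if not rec.source:
        gaps.append("provenance")
    return gaps


if __name__ == "__main__":
    figure = PreservationRecord("figure-1")
    article = PreservationRecord(
        "article-1", format_name="PDF", depends_on=["figure-1"],
        decision_maker="library", source="publisher feed",
    )
    print(affected_by_change([article, figure], "figure-1"))
    print(unanswered(figure))
```

A record with many unanswered categories is one about which few preservation decisions can safely be made.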
Quality
If that archival version is bad when you put it on the shelf, it will still be bad 10 years later when you need it…
and it will be hard to go back to the creator at that point!
2. There is a LOT happening in the domain
…are you watching?
Preservation initiatives
• OAIS “Open Archival Information System” reference model
  – Formal, structured model for designing digital preservation archives
  – ISO standard
• PREMIS (PREservation Metadata: Implementation Strategies)
  – Define core metadata by end of year
  – Survey of current practices just published
Initiatives…
• Format registry
  – Definitive sources of description for technical formats
  – community effort to share effort of documenting digital formats
• RLG/NARA Digital Repository Certification Task Force
  – recommend structure and metrics of an international process for certifying preservation repositories
Initiatives…
• JHOVE (JSTOR/Harvard Object Validation Environment)
  – Open source tool to identify format of an object, generate technical metadata from an object, test to see if object is well-formed
• Library of Congress NDIIPP
  – Define a shared national program of digital preservation
  – Well funded: $100M from Congress, $75M matching contributions
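The three jobs listed for JHOVE (identify the format, generate technical metadata, test well-formedness) can be illustrated with a toy example. This is not JHOVE and does not use its API; the signature table, function names, and the deliberately crude well-formedness checks below are assumptions made for illustration.

```python
from typing import Dict, Optional

# Leading byte signatures for a few common formats (a tiny sample table).
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def identify(data: bytes) -> Optional[str]:
    """Identify the format from the object's leading bytes, if recognized."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def looks_well_formed(fmt: Optional[str], data: bytes) -> bool:
    """Crude end-of-file checks only; a real validator parses the whole object."""
    if fmt == "PDF":
        return b"%%EOF" in data[-1024:]   # end-of-file marker near the end
    if fmt == "PNG":
        return data[-8:-4] == b"IEND"     # final chunk type before its CRC
    if fmt == "GIF":
        return data.endswith(b";")        # trailer byte 0x3B
    return False


def describe(data: bytes) -> Dict[str, object]:
    """Generate a minimal technical metadata record for an object."""
    fmt = identify(data)
    return {
        "format": fmt,
        "size_bytes": len(data),
        "well_formed": looks_well_formed(fmt, data),
    }


if __name__ == "__main__":
    print(describe(b"%PDF-1.4\n(body omitted)\n%%EOF\n"))
    print(describe(b"%PDF-1.4 truncated"))
```

Even this toy version shows why such checks belong at ingest: a truncated file is identified correctly by its header but fails the well-formedness test.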
NDIIPP national preservation grants
• Web archiving (California Digital Library)
• Geographic information
  – UC Santa Barbara
  – North Carolina State
• Digital television (Educational Broadcasting Corporation)
• Digital archives (Emory)
• Selection for preservation (U Illinois)
• Business history (U Maryland)
• Social science data sets (Inter-university Consortium for Political and Social Research)
Other NDIIPP grants
• Repository interoperation (Stanford, Johns Hopkins, Harvard, Old Dominion)
• Architecture and tools (Los Alamos National Laboratory)
• Research in digital preservation (together with National Science Foundation)
Major programs abroad
• National Library of Australia
• British Library
  – and a UK national Digital Preservation Coalition
• Koninklijke Bibliotheek (National Library of the Netherlands)
  – major digital preservation research program
3. Think of 50 years, not 5 years
The questions are different:
* discontinuous technological change
* loss of “common knowledge”
* very antique formats
Thus the need for deep documentation and metadata…
4. So many things to preserve
• GIS, survey and economic data, visual resources, research datasets, web stuff, institutional records, faculty papers, audio & video, visualizations, blogs, newsletters, etc.
• Setting priorities
  – fleeting things demand immediate attention
    • “the web”…
  – attend to your own house first
    • faculty output, library digitization, institutional records
A lot to do….
Where does the formal literature fit in setting priorities?
What will be the role of digital copyright deposit?
5. Paying for a common good
• Only one or a few institutions need to archive a given resource
• Two related questions
  – motivation: why would you not wait until the other fellow does it?
  – if I do it, can I get others to share the cost?
• Digital is different from paper
  – Costs of preservation more apparent
  – Possibility of remote access means you don’t have to do it locally
• Fundamental question, now topic of research
  – NSF digital preservation grant program
  – OCLC research paper:
    • Brian Lavoie, The Incentives to Preserve Digital Materials
6. LOCKSS is not preservation
• LOCKSS ignores most of the key issues
  – format
  – metadata
  – management
  – reformatting
  – repository…
• LOCKSS is great technology for distributed replication, but does not truly address preservation
7. “Hand-off” is a critical component
• What happens if there is one archival copy, and the repository gives up responsibility?
  – priorities change, institutions come and go…
• Handing off responsibility is a repository’s final preservation action
• How does this relate to publishers?
Lastly…what about preserving e-journals?
• Well, we have the KB, maybe the JSTOR archive, and LOCKSS(?)…
• Some movement on national digital copyright deposit
• The library/publisher dialog of a few years ago needs to be re-invigorated!
• In the meantime, publishers are hopefully paying attention…