OAI-PMH harvesting metadata and virtual datastream
G. Birello, I. Fucile, V. Giovanetti, A. Perin
Open Repositories 2013 - Charlottetown, Prince Edward Island (Canada) July 8 - 12
The DIGIBESS repository (economic and social science books from the Piedmont area, Italy) is indexed in the international portals OpenDOAR, BASE, IESR, and ROAR. Its books are indexed in the WorldCat Catalogue, PLEIADI, Cultura ITALIA, and soon in Europeana.
OAI-PMH
OAI-PMH Data Provider
Metadata Formats: OAI_DC and PICO
# proai.properties
...
driver.fedora.md.formats = oai_dc pico
...
driver.fedora.md.format.oai_dc.dissType = info:fedora/*/DC
driver.fedora.md.format.pico.dissType = info:fedora/*/openbess:dc2picoSdef/dc2pico
BOOK Item
The repository exposes an OAI-PMH interface for external metadata harvesting. Metadata can be disseminated in two formats: OAI_DC and PICO. OAI_DC records are extracted from the object's DC datastream. PICO records are generated on the fly by an object service.
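From a harvester's point of view, the two formats differ only in the metadataPrefix argument of the request. The following is a minimal sketch, not the DIGIBESS setup itself: the endpoint URL and the sample response are hypothetical, and only the verb and metadataPrefix arguments and the response namespace come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix):
    """Return an OAI-PMH ListRecords request URL for one metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def record_identifiers(response_xml):
    """Extract the header identifiers from a ListRecords response document."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical endpoint: the real base URL is not given in the poster.
BASE_URL = "http://repository.example.org/oaiprovider/"

# Hypothetical, heavily shortened response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:book-1</identifier></header></record>
    <record><header><identifier>oai:example.org:book-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

# One request per metadata format named in proai.properties.
urls = [build_list_records_url(BASE_URL, prefix) for prefix in ("oai_dc", "pico")]
identifiers = record_identifiers(SAMPLE_RESPONSE)
```

A real harvester would also follow resumptionToken elements to page through large result sets, which this sketch leaves out.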
SERVICE DEPLOYMENT: openbess:dc2picoSdep-bookCModel
BOOK MODEL: islandora:bookCModel
getViewer
DC
RELS-EXT
hasService → openbess:dc2picoSdef
INDEX
TN
Dc2pico
RELS-EXT
hasModel → fedora-system:ServiceDeployment-3.0
isDeploymentOf → openbess:dc2picoSdef
isContractorOf → islandora:bookCModel
WSDL
address location="http://fc1.to.cnr.it:8080/saxon/"
binding verb="GET"
operation location="SaxonServlet?source=(DC)&style=(XSL)&clear-stylesheet-cache=yes"
METHODMAP
DatastreamInputParm parmName="DC"
DatastreamInputParm parmName="XSL"
MethodReturnType wsdlMsgName="response" wsdlMsgTOMIME="text/xml"
XSL: XSLT for DC to PICO transformation
REFERENCE
web site: http://www.digibess.it web development: http://dev.digibess.it
An object method is made up of two objects: a Service Definition and a Service Deployment. Model and Service objects are connected by semantic relationships. The Service Definition describes the method. The Service Deployment describes how the method is executed: its input and output parameters, the web service location, and a static XSLT datastream.
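The WSDL operation of the Service Deployment is a URL template whose (DC) and (XSL) placeholders stand for the two datastream inputs. The sketch below illustrates that binding step under stated assumptions: the address and operation strings are copied from the poster, while the datastream locations, the URL-encoding of the substituted values, and the bind_operation helper are hypothetical and not taken from Fedora's own code.

```python
from urllib.parse import quote

# Values copied from the WSDL lines of the Service Deployment.
ADDRESS = "http://fc1.to.cnr.it:8080/saxon/"
OPERATION = "SaxonServlet?source=(DC)&style=(XSL)&clear-stylesheet-cache=yes"


def bind_operation(address, operation, datastream_urls):
    """Replace each (NAME) placeholder with the URL-encoded datastream location."""
    bound = operation
    for name, url in datastream_urls.items():
        bound = bound.replace(f"({name})", quote(url, safe=""))
    return address + bound


# Hypothetical datastream locations: DC comes from the book object,
# XSL from the Service Deployment object that stores the stylesheet.
inputs = {
    "DC": "http://repository.example.org/objects/book:1/datastreams/DC/content",
    "XSL": "http://repository.example.org/objects/openbess:dc2picoSdep-bookCModel/datastreams/XSL/content",
}
request_url = bind_operation(ADDRESS, OPERATION, inputs)
```

The resulting URL asks the Saxon servlet to apply the stylesheet to the DC record, which is how a PICO record can be produced at request time without being stored.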
SERVICE DEFINITION: openbess:dc2picoSdef
DC
RELS-EXT
METHOD MAP
hasModel → fedora-system:ServiceDefinition-3.0
Method operationName="dc2pico"
DC
DSInputLabel → DC
DSInputLabel → XSL (pid="openbess:dc2picoSdep-bookCModel")
DSINPUTSPEC
Is this your hardware plan?
1. Buy expensive hardware.
2. Pay expensive annual support fees.
3. Worry that you’ll outgrow your hardware.
4. Save up to buy more hardware in 4 years.
You need a new plan. (So did we!)
Digital Repository Infrastructure: Rent or Buy?
Robin Dean & Ed Fugikawa
Alliance Digital Repository
Colorado Alliance of Research Libraries
Save Money
Improve Performance
Prepare for Growth
Clever Crosswalking
Starting point
• Repository: mainly theses
• Research management system
Challenges
• Different metadata granularity
• Good repo workflows but RMS now primary data source
Y. Zhao, K. Shepherd, L. Hayes – The University of Auckland, New Zealand
A. Schweer – Library Consortium of New Zealand
Integrate repository & RMS (once-off sync, then ongoing)
Bending the rules without breaking the repo: Using free RDF description in Fedora Commons repositories
Adam Soroka
University of Virginia
A mash-up of a Japanese Open Repository and a Researcher CV Platform
[Diagram labels: Backend; Repository Cloud Service in Japan; CRUD: SWORD 2.0; Search: OpenSearch; Society Copyright Policy KB; Email alert or SWORD deposit; Researcher CV Platform; Open Repository; 200000+ Users]
Institutional Repositories
Crowdsourcing HCI for the institutional repository
Stephanie Taylor, Critical Eye Comms
Emma Tonkin, University of Bristol
Learn about the NEW hosted service from DuraSpace!
And how you can save yourself time by letting DuraSpace handle managing your DSpace repository software.
DSpaceDirect includes:
Repository quick-start
You-pick features
No-cost upgrades
Content safeguards
Anytime data access
There will be aliens! And roses, too!
We propose that new metadata and annotations can be collected from the access logs of the HTTP server of an institutional repository (IR). Access-log entries for visits that arrive from search engines contain search queries that relate to the contents of the IR. A logger program sends the obtained metadata to the IR using SWORD.
Automatic reproduce metadata from the log of HTTP server
[Diagram labels: HTTP Server Log; Log analysis (query, resource id); New metadata; SWORD; Repository System; Resource metadata]
Toshihiro Aoyama, Yuta Suzuki, Kazu Yamaji
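The log-analysis step of this proposal can be sketched as follows. This is an illustration under assumptions, not the authors' logger program: it assumes the server writes the common "combined" log format, that search queries arrive in the referrer URL under a parameter named q, and it uses made-up sample lines. The SWORD deposit that would follow is left out.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Request path and referrer fields of a combined-format access log line.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def queries_by_resource(log_lines, query_param="q"):
    """Map each requested resource path to the search queries that led to it."""
    found = defaultdict(set)
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue
        params = parse_qs(urlparse(match.group("referrer")).query)
        for query in params.get(query_param, []):
            found[match.group("path")].add(query)
    return {path: sorted(queries) for path, queries in found.items()}


# Hypothetical log lines; hosts, paths, and queries are invented for the example.
SAMPLE_LOG = [
    '192.0.2.1 - - [08/Jul/2013:10:00:00 +0000] "GET /handle/123/45 HTTP/1.1" 200 5120 '
    '"http://search.example.com/search?q=upper+atmosphere+data" "Mozilla/5.0"',
    '192.0.2.2 - - [08/Jul/2013:10:05:00 +0000] "GET /handle/123/45 HTTP/1.1" 200 5120 '
    '"http://search.example.com/search?q=geomagnetic+index" "Mozilla/5.0"',
    '192.0.2.3 - - [08/Jul/2013:10:06:00 +0000] "GET /handle/123/99 HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0"',
]
new_metadata = queries_by_resource(SAMPLE_LOG)
```

Each (resource id, query list) pair in new_metadata is the kind of record the logger could then send to the repository as additional keywords.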
P-CUBE Major Modules
Object Relations
What is P-CUBE? P-CUBE Data Model Architecture
P-CUBE Lifecycle Actors’ Role
Suntae Kim ([email protected])
Redirecting Web service for ORCID to scholarly systems via the Researcher Name Resolver
Kei Kurakawa and Hideaki Takeda
National Institute of Informatics, Japan
National Researcher identifier
International Researcher identifier
National scholarly systems
Campus Directories
Challenge to Data-intensive science: cooperation of metadata database for upper atmospheric research and author ID
Yuki KOYAMA* et al.
*World Data Center for Geomag., Kyoto & Graduate School of Sci. Kyoto Univ.
DOI
ORCID
with Role (e.g., PI, Archive Specialist)
DataCite
Japan Link Center (JaLC)
Granule
Client based interface and proxy server for content re-use framework based on OAI-PMH
Takao Namiki
Hokkaido University
FEDORA COMMONS BASED FRAMEWORK FOR AGGREGATION, REUSE AND DISSEMINATION OF THE DIGITAL CONTENT
Martin Lhoták
Library of the Academy of Sciences, Czech Republic
Open Repositories 2013
- open source system for a digital library
- digitization workflow monitoring system
- digital document production and archiving system
http://www.czechdigitallibrary.cz
Collaborative repository to support food and feed safety risk assessment in Europe
Jane Richardson, Lara Congiu, Cristiano Morganti, Patrizia Pirro, Elisa Aiassa, Sadia Noorani, Diane Lefebrve, Didier Verloo
European Food Safety Authority
Defiant Objects
Managing non-standard research outputs in institutional repositories
www.sherpa-leap.ac.uk
Giving them what they want: Using Data Curation Profiles to guide Datastar development
Sarah Wright (1), Dianne Dietrich (1), Huda Khan (1), Wendy Kozlowski (1), Leslie McIntosh (2), Gail Steinhart (1)
1: Cornell University, 2: Washington University in Saint Louis
The ability to apply standardized metadata from your field or discipline to the dataset.
The ability of the general public to easily find the data set.
Documentation of any and all changes that were made to the dataset over time.
A requirement that others cite the data set if they were to use it in their research.
The ability to enable version control for the data set.
The ability to track data citations.*
The ability to cite the dataset in my publications.
The ability for people to easily discover the dataset using Internet search engines.
The ability to create a basic, public description of (and provide a link to) my data.*
[Chart legend: High priority; Medium priority; Low priority; Not a priority; I Don't Know or N/A. Axis scale: 0 to 8]
What did they want?
AN INVESTIGATION INTO JOURNAL RESEARCH DATA POLICIES
jord
The Findings of the JoRD Project
Azhar Hussain, Marianne Bamkin, Paul Sturges*, Jane H Smith, Bill Hubbard
Centre for Research Communications, University of Nottingham
* Loughborough University
OR2013
What is the JoRD project?
• Scholarly journals are increasingly recommending or requiring as a condition of publication that research data should be made available in an appropriate repository
• Different journals, different requirements and recommendations
• JoRD was a 6-month feasibility study (July-Dec 2012) commissioned by JISC
• Tasked to scope the shape of a potential service to collate and summarise journal research data policies to provide an easy source of reference to understand requirements and recommendations made by journal editorial boards with regard to data sharing
Wendy Watkins
Ernie Boyko
Carleton University
CANADA
Introducing new technology is 20% tech and 80% culture.
Customizing STEM Instruction with Educational Digital Libraries
Open Repositories 2013
[Diagram labels: District; Teacher; Students; Student; Group of Students; Publisher Materials (Purchased by Districts); Teachers' private materials; Shared materials among teachers; Physics and Astronomy (comPADRE); Digital Library for Earth System Education (DLESE); National Science Digital Library (NSDL); Other Resources]
The Curriculum Customization Service
Link it or Don't Use It
Transitioning Metadata to Linked Data in Hydra
Karen Estlund
Head, Digital Scholarship Center
University of Oregon Libraries
Tom JohnsonDigital Applications Librarian
Oregon State University Libraries
The Repository as Data (Re) User: Hand Curating for Replication
Yale University
Institution for Social and Policy Studies
Limor Peer
How does the ISPS Data Archive re-use data?
How does replication drive curation at the ISPS Data Archive?
http://livelymorgue.tumblr.com
http://reusesymbol.maker.good.is/projects/CedricCummings
From this…
To this…
Phase One of CED2AR Comprehensive Extensible Data Documentation and Access Repository
Block, William; Lagoze, Carl; Brown, Warren; Williams, Jeremy; Vilhuber, Lars; Abowd, John; Arguillas, Florio
Open Repositories 2013, Repository Island, Charlottetown, PEI, Canada, July 8-11, 2013
Cornell Institute for Social and Economic Research
A LEADER IN SOCIAL SCIENCE AND DATA COMPUTING
• $3 million spread over 5 years
• CED2AR is a metadata repository that integrates metadata of multiple versions and derivatives of datasets produced or managed by the U.S. Census Bureau that reside in public and/or restricted spheres.
• Phase one addresses the challenges faced by the NSF-Census Research Network (NCRN) in integrating metadata from disparate sources, such as format disparity, schema disparity, and sparseness of metadata
• See our solution for syncing metadata records from restricted and public-use datasets
• See how we standardized disparate metadata resources (such as metadata from SSB, IPUMS, and ACS)
• See how we switch on and off confidential metadata at the variable and value label
• See our user interface for searching variables across datasets
Development of the Health Research Data Repository and the TREC Longitudinal Monitoring System
James Doiron, Manager, HRDR , University of Alberta, Canada
Open Repositories 2013
Charlottetown, PEI
Health Research Data Repository (HRDR)
Based in the Faculty of Nursing, University of Alberta
Secure virtual environment for housing and managing health research data and metadata throughout their lifecycle
Supports and promotes health research and multi-disciplinary collaboration
Provides secure remote access to data and a regular suite of analytic software
Clearly operationalized policies and procedures, with user orientation process
Operates on a ‘minimal cost recovery’ basis
Promotes and provides educational opportunities regarding data management based on best practices
Promotes secondary use and re-purposing of research data
Holistically Preserving and Presenting Complex Research Data
Uploading
Modeling
Storage
Accessing
Using RDF
Researchers submit their research data in whatever form it is captured, recorded, or saved. The file structure can be complex and deeply nested. All file formats are accepted.
The file structure is captured and stored in a METS Structural Map. The Structural Map stores the directory names, the system identifiers of the files, and the file hierarchy to any depth.
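As an illustration of how a nested directory tree can be recorded this way, the sketch below builds a METS structMap with one div per directory or file and an fptr carrying each file's identifier. It is a minimal example under assumptions, not the RUcore implementation: the TYPE values, the sample tree, and the FILE-n identifiers are hypothetical, while structMap, div, fptr, LABEL, and FILEID are standard METS names.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def add_directory(parent, name, contents):
    """Add a <div> for a directory, then recurse into subdirectories and files."""
    div = ET.SubElement(parent, f"{{{METS_NS}}}div", {"TYPE": "directory", "LABEL": name})
    for child_name, child in sorted(contents.items()):
        if isinstance(child, dict):
            # A nested dict is a subdirectory, so the hierarchy can go to any depth.
            add_directory(div, child_name, child)
        else:
            # Anything else is a file, represented by its system identifier.
            file_div = ET.SubElement(
                div, f"{{{METS_NS}}}div", {"TYPE": "file", "LABEL": child_name}
            )
            ET.SubElement(file_div, f"{{{METS_NS}}}fptr", {"FILEID": child})
    return div


def build_struct_map(root_name, tree):
    """Return a <structMap> element describing the whole submitted tree."""
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    add_directory(struct_map, root_name, tree)
    return struct_map


# Hypothetical submission: directories are dicts, files map to system identifiers.
submission = {
    "readme.txt": "FILE-1",
    "raw": {"run1": {"data.csv": "FILE-2"}, "notes.doc": "FILE-3"},
}
struct_map = build_struct_map("project", submission)
file_ids = [fptr.get("FILEID") for fptr in struct_map.iter(f"{{{METS_NS}}}fptr")]
```

Because the hierarchy lives entirely in the map, the files themselves can be kept in flat storage and the original layout rebuilt later from the identifiers.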
At this point the files can be stored simply, and the underlying storage mechanism does not need to preserve the original resource's hierarchy or directory structure.
Using Fedora, the files are stored as separate datastreams and are managed as a single object.
Managing the resource as a single object reduces the burden on the researcher of providing item-level cataloging.
If the resource does require item-level cataloging, the RUcore metadata model supports source, technical, and event-based metadata.
Chad Mills - [email protected] Document
When the resource is accessed through a public interface, the existence of the Structural Map indicates that it is a complex resource. Using the Structural Map, JSON, and JavaScript, an interface is rendered that allows users to select all or parts of the resource for downloading.
Once selections are made, a package file (TAR or ZIP) is generated and streamed back to the user. The package file follows the BagIt file packaging format published through the Internet Engineering Task Force.
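The packaging step can be sketched as follows: the selected files go under data/, a manifest lists a checksum for each, and bagit.txt declares the format. This is a minimal sketch under assumptions, not the RUcore code. It follows the layout of BagIt 1.0 (RFC 8493), which postdates the poster, and the bag name, file selection, and build_bag helper are hypothetical.

```python
import hashlib
import io
import tarfile


def build_bag(bag_name, selected_files):
    """Return the bytes of a TAR archive laid out as a BagIt bag."""
    manifest_lines = []
    members = {}
    for path, content in sorted(selected_files.items()):
        # Payload files live under data/, keeping their original relative paths.
        payload_path = f"data/{path}"
        members[payload_path] = content
        manifest_lines.append(f"{hashlib.sha256(content).hexdigest()}  {payload_path}")
    # Tag files: the bag declaration and the payload manifest.
    members["bagit.txt"] = b"BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
    members["manifest-sha256.txt"] = ("\n".join(manifest_lines) + "\n").encode("utf-8")

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, content in members.items():
            info = tarfile.TarInfo(name=f"{bag_name}/{name}")
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


# Hypothetical user selection: relative path -> file content.
selection = {
    "raw/run1/data.csv": b"t,value\n0,1.5\n",
    "readme.txt": b"Sample project\n",
}
bag_bytes = build_bag("project-bag", selection)

# Read the archive back to list what the user would receive.
with tarfile.open(fileobj=io.BytesIO(bag_bytes)) as archive:
    bag_members = archive.getnames()
```

Building the archive in memory keeps the example short; a service streaming large datasets would more likely write the archive incrementally to the response.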
Research project resources are related to the project using RDF statements. Using similar concepts, an entire project can be packaged and downloaded based on the project's RDF statements. When a Structural Map is detected, it is used on that portion of the package.
The example screenshot in the middle is the user interface when downloading an entire research project from RUcore.
Addressing Impediments to Reuse – The Open Folklore Portal
James Halliday (Programmer/Analyst)
Julie Hardesty (Metadata Analyst)
Jennifer Laherty (Digital Publishing Librarian)
Garett Montanez (Lead Web Architect)
Open Repositories 2013
Nikos Kasioumis, CERN, Digital Library Services