UKOLN is supported by:
eBank UK : linking research data, scholarly communications and learning.
Dr Liz Lyon, UKOLN, University of Bath, UK
JISC CNI Conference
July 2004, Brighton.
www.bath.ac.uk
a centre of expertise in digital information management
www.ukoln.ac.uk
JISC CNI Conference 2004 2
Overview
• Setting the scene: e-Research
• The scholarly knowledge cycle
– Data, information and workflows
– Provenance
• eBank UK Project
– The experience so far
– Issues arising
• Challenges for the future
Setting the scene: e-Research
e-Research trends summary
• Increasingly data-intensive, quantitative
• Implementing new science
• Inter-disciplinary
• New disciplines e.g. Astro-informatics
• New skills requirements
– IT + statistics + domain
• Collaborative
• Highly distributed resources
– Knowledge discovery / extraction
• Open access to data and information
– OECD Declaration January 2004
• A changing landscape of scholarly communications
The scholarly knowledge cycle
Research & e-Science workflows
Aggregator services: national, commercial
Repositories : institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Searching, harvesting, embedding
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
The scholarly knowledge cycle.
Liz Lyon, eBankUK article. Ariadne, July 2003.
Learning & Teaching workflows
Aggregator services: national, commercial
Repositories : institutional, e-prints, subject, data, learning objects
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Harvesting metadata
Resource discovery, linking, embedding
Peer-reviewed publications: journals, conference proceedings
Validation
Resource discovery, linking, embedding
Deposit / self-archiving
Learning object creation, re-use
Searching, harvesting, embedding
Quality assurance bodies
Validation
Presentation services: subject, media-specific, data, commercial portals
Learning & Teaching workflows
Research & e-Science workflows
Aggregator services: national, commercial
Repositories : institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Resource discovery, linking, embedding
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Resource discovery, linking, embedding
Deposit / self-archiving
Learning object creation, re-use
Searching, harvesting, embedding
Quality assurance bodies
Validation
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
Learning & Teaching workflows
Research & e-Science workflows
Aggregator services: eBank UK
Repositories : institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Resource discovery, linking, embedding
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Resource discovery, linking, embedding
Deposit / self-archiving
Learning object creation, re-use
Searching, harvesting, embedding
Quality assurance bodies
Validation
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
The eBank UK Project
eBank UK project
• JISC-funded for 1 year from September 2003
• UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
• “Building the links between research data, scholarly communication and learning”
• e-Science testbed Combechem
– Grid-enabled combinatorial chemistry
– Crystallography, laser and surface chemistry
– Development of an e-Lab using pervasive computing technology
– National Crystallography Service
• Resource Discovery Network PSIgate physical sciences portal
• http://www.ukoln.ac.uk/projects/ebank-uk/
The project team
• UKOLN
– Michael Day
– Monica Duke
– Rachel Heery
– Liz Lyon
– + Andy Powell
• Southampton
– Les Carr
– Simon Coles
– Jeremy Frey
– Chris Gutteridge
– Mike Hursthouse
• Manchester
– John Blunden-Ellis
Comb-e-Chem Project
X-Ray e-Lab
Analysis
Properties
Properties e-Lab
Simulation
Video
Diffractometer
Grid Middleware
Structures Database
Crystallography workflow
• Initialisation: mount new sample on diffractometer & set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File format)
• Report: generate Crystal Structure Report
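The stages above form a strictly ordered pipeline. The following is a minimal sketch of that ordering, assuming each stage simply hands on to the next; the `Stage` enum and the `next_stage` helper are illustrative names and are not taken from the eBank UK or Comb-e-Chem software.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Crystallography workflow stages, in the order listed on the slide."""
    INITIALISATION = "mount new sample on diffractometer and set up data collection"
    COLLECTION = "collect data"
    PROCESSING = "process and correct images"
    SOLUTION = "solve structures"
    REFINEMENT = "refine structure"
    CIF = "produce CIF (Crystallographic Information File format)"
    REPORT = "generate Crystal Structure Report"


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the final one."""
    stages = list(Stage)  # Enum members iterate in definition order
    position = stages.index(stage)
    return stages[position + 1] if position + 1 < len(stages) else None


if __name__ == "__main__":
    for stage in Stage:
        print(f"{stage.name.title()}: {stage.value}")
```

Treating the stages as an ordered enumeration makes it easy to record which stage produced a given dataset, which is the kind of provenance information the metadata schema would need to capture.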
First steps: establishing common ground…
• Understand the data creation process
• Terminology and definitions
– Data
– Metadata
– Datafile
– Dataset
– Data holding
• Different views
– Digital library researchers, computer scientists, chemists
– Generic vs specific
– Modeller vs practitioner
• Aim for a common ontology
• Modelling the domain
• Creating a metadata schema
Progress update
• Version 2.0 eBank metadata schema
• Enhanced ePrints.org software
• Pilot institutional e-data repository for harvesting (raw, derived, results data)
• Exports records as ebank_dc and oai_dc
• Validation of schema
• Pilot eBank UK aggregator service
• Develop search interface Version 1.0
• Testing with PSIgate physical sciences portal
– embedding eBank UK
Some metadata issues
• Using simple and qualified Dublin Core
• Additional chemical information in schema for harvesting e.g. empirical formula
• Schema contains International Chemical Identifier (InChI)
• Links to all datasets associated with an experiment
• Links to individual datasets within an experiment
• Links to eprints (and other published literature) derived from the data
• Using vocabularies specific to crystallography
• Will substitute when standards emerge
ebank_dc record (XML)
Crystal structure (data holding)
Crystal structure report (HTML)
Dataset
Dataset
Institutional repository
Deposit
Dataset
dc:identifier
dcterms:references
Linking
dc:type=“CrystalStructure” and/or “Collection”
Model input Andy Powell, UKOLN.
Eprint oai_dc record (XML)
dcterms:isReferencedBy
dc:type=“Eprint” and/or “Text”
Data flow in eBank
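The linking in this model can be sketched as two small XML records. This is an illustration under stated assumptions, not the eBank UK schema: the `record` wrapper element, the `make_record` and `links` helpers, and every `example.org` identifier are hypothetical. The flattened diagram also does not fix the direction of the links, so the sketch assumes the data-holding record carries `dcterms:references` and the eprint record carries `dcterms:isReferencedBy`. Only the Dublin Core namespace URIs are standard.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set
DCTERMS = "http://purl.org/dc/terms/"     # Dublin Core terms
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def make_record(identifier, dc_types, link_term, targets):
    """Build a flat record: one dc:identifier, dc:type values and link elements."""
    record = ET.Element("record")  # placeholder wrapper, not the real ebank_dc root
    ET.SubElement(record, f"{{{DC}}}identifier").text = identifier
    for dc_type in dc_types:
        ET.SubElement(record, f"{{{DC}}}type").text = dc_type
    for target in targets:
        ET.SubElement(record, f"{{{DCTERMS}}}{link_term}").text = target
    return record


def links(record, link_term):
    """Return the target URIs of every dcterms:<link_term> element in a record."""
    return [el.text for el in record.findall(f"{{{DCTERMS}}}{link_term}")]


# Hypothetical identifiers, for illustration only.
HOLDING = "http://example.org/data/holding-1"
DATASETS = [
    "http://example.org/data/holding-1/raw",
    "http://example.org/data/holding-1/derived",
]
EPRINT = "http://example.org/eprints/42"

# Assumed direction: the data holding references its datasets and the eprint,
# and the eprint records the inverse relation.
holding_record = make_record(
    HOLDING, ["CrystalStructure", "Collection"], "references", DATASETS + [EPRINT]
)
eprint_record = make_record(EPRINT, ["Eprint", "Text"], "isReferencedBy", [HOLDING])

if __name__ == "__main__":
    print(ET.tostring(holding_record, encoding="unicode"))
    print(ET.tostring(eprint_record, encoding="unicode"))
```

Keeping the two relations as inverses of each other means a service that harvests either record can follow the link to the other.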
ebank_dc record (XML)
Crystal structure (data holding)
Crystal structure report (HTML)
Dataset
Dataset
Institutional repository
eBank UK aggregator service
ePrint UK aggregator service
Subject service
Deposit
Harvesting OAI-PMH ebank_dc
Harvesting OAI-PMH oai_dc
Harvesting OAI-PMH oai_dc
Dataset
dc:identifier
dcterms:references
Linking
dc:type=“CrystalStructure” and/or “Collection”
Model input Andy Powell, UKOLN.
Eprint oai_dc record (XML)
dcterms:isReferencedBy
dc:type=“Eprint” and/or “Text”
Data flow in eBank
ebank_dc record (XML)
Crystal structure (data holding)
Crystal structure report (HTML)
Dataset
Dataset
Institutional repository
eBank UK aggregator service
ePrint UK aggregator service
Subject service
Deposit
Harvesting OAI-PMH ebank_dc
Harvesting OAI-PMH oai_dc
Harvesting OAI-PMH oai_dc
Searching, linking and embedding
Searching, linking and embedding
Searching, linking and embedding
Dataset
dc:identifier
dcterms:references
Linking
dc:type=“CrystalStructure” and/or “Collection”
Model input Andy Powell, UKOLN.
PSIgate portal
Eprint oai_dc record (XML)
dcterms:isReferencedBy
dc:type=“Eprint” and/or “Text”
Data flow in eBank
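The harvesting arrows in this diagram are OAI-PMH requests. The sketch below shows how a harvester might form a `ListRecords` request for each metadata format. The `verb`, `metadataPrefix` and `set` parameters come from the OAI-PMH protocol; the base URL and the `list_records_url` helper are hypothetical, and no request is actually sent.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
BASE_URL = "http://repository.example.org/oai"

if __name__ == "__main__":
    # The eBank UK aggregator would ask for the richer ebank_dc records,
    # while a general eprint aggregator would ask for plain oai_dc.
    print(list_records_url(BASE_URL, "ebank_dc"))
    print(list_records_url(BASE_URL, "oai_dc"))
```

Because the same repository answers both requests, the chemistry-specific detail is available to services that understand it without preventing general aggregators from harvesting simple Dublin Core.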
Currently we are…
• Planning Consultation Workshop – August
• Developing a demonstrator
• Promoting Open Access and Open eData Archives to international crystallographic organisations, publishers, learned societies
• e-Science All Hands Meeting, Nottingham September 2004.
• Phase 2 proposal funding sought for further 12 months
Challenges for the future
Phase 2 plan… (1)
• Continue to progress generic data models and metadata schemas
• Validation against other schema
– CLRC Scientific Metadata Model version 1.0, 2001 (under revision) http://www-dienst.rl.ac.uk/library/2002/tr/dltr-2002001.pdf
• Complex digital objects
• Investigate packaging options
– METS
– MPEG 21 DIDL
– ??
• Metadata enhancement - subject keyword additions to datasets based on knowledge of keywords in related publications
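The metadata enhancement idea in the last bullet can be sketched as a simple merge. This is a minimal illustration, assuming keywords are plain strings compared case-insensitively; the `enhance_keywords` function and the sample keywords are hypothetical and do not describe how the project would implement it.

```python
from typing import Iterable, List


def enhance_keywords(dataset_keywords: Iterable[str],
                     related_publication_keywords: Iterable[Iterable[str]]) -> List[str]:
    """Return the dataset's keywords plus any new ones from related publications."""
    enhanced = list(dataset_keywords)
    seen = {keyword.lower() for keyword in enhanced}
    for keywords in related_publication_keywords:
        for keyword in keywords:
            if keyword.lower() not in seen:  # skip case-insensitive duplicates
                enhanced.append(keyword)
                seen.add(keyword.lower())
    return enhanced


if __name__ == "__main__":
    # Hypothetical keywords, for illustration only.
    print(enhance_keywords(
        ["crystallography"],
        [["Crystallography", "X-ray diffraction"],
         ["X-ray diffraction", "organometallics"]],
    ))
```

The dataset's own keywords stay first and unchanged, so the added terms can always be distinguished from those supplied by the depositor.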
Phase 2… (2)
• Investigate identifiers e.g. International Chemical Identifier (InChI code)
– Access to scientific (climate) data using DOIs (German National Library of Science & Technology)
• Explore context sensitive linking: find me
– Datasets by this person
– Journal articles by this person
– Datasets related to this subject
– Journal articles on this subject
– Learning objects by this person
– Learning objects on this subject
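The "find me" queries above all filter the same pool of records by resource type plus either a person or a subject. A minimal sketch, assuming harvested records have been reduced to a title, a type, creators and subjects; the `Record` class, the `find_me` function and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """A harvested metadata record, reduced to the fields needed for linking."""
    title: str
    kind: str                    # "dataset", "journal article" or "learning object"
    creators: Tuple[str, ...]
    subjects: Tuple[str, ...]


def find_me(records: List[Record], kind: str,
            creator: Optional[str] = None,
            subject: Optional[str] = None) -> List[Record]:
    """Return records of one kind, optionally limited to a person or a subject."""
    matches = []
    for record in records:
        if record.kind != kind:
            continue
        if creator is not None and creator not in record.creators:
            continue
        if subject is not None and subject not in record.subjects:
            continue
        matches.append(record)
    return matches


# Hypothetical records, for illustration only.
RECORDS = [
    Record("Crystal structure data holding", "dataset",
           ("A. Chemist",), ("crystallography",)),
    Record("Structure report article", "journal article",
           ("A. Chemist", "B. Author"), ("crystallography",)),
    Record("Introductory diffraction tutorial", "learning object",
           ("B. Author",), ("crystallography", "diffraction")),
]

if __name__ == "__main__":
    for record in find_me(RECORDS, "dataset", creator="A. Chemist"):
        print("Datasets by this person:", record.title)
    for record in find_me(RECORDS, "learning object", subject="diffraction"):
        print("Learning objects on this subject:", record.title)
```

Each of the six queries on the slide is one call to `find_me` with a different `kind` and either `creator` or `subject` set.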
Phase 2… (3)
• Workflow embedding
– Expand to include SMART e-Lab metadata e.g. sample preparation
• e-Learning embedding and pedagogic evaluation
– MChem course
– Chemical informatics course
• Expand into other physical sciences
• Feasibility study in a related domain - biosciences
Potential longer term impact
1. Track data, information and workflows in e-research and scholarly communications – knowledge audit??
2. Validate the accuracy and authenticity of derived works – ideas audit??
3. Facilitate explicit referencing and acknowledgment of original contributors – intellectual integrity??
4. Raise standards associated with publication of research outputs – academic publishing rigour??
5. Implement open access to and dissemination of data and information – enhance the research process??
6. Give students links to original data underpinning published works – enhance the learning process??
Thank you.
Questions?