Post on 10-Jan-2016
Digital Author Identification
UKSG, 17 – 18 April 2007
Daniel van Spanje
2
DAI in DARE
• DARE: Digital Academic REpositories
– Universities + KNAW + NWO + KB
– Infrastructure for linking the IR
– Stimulate production of digital scientific output
– 2003 – 2006
• 2007 – 2010: SURFshare
3
Main issues in DAI
• Unique identifying number for researchers / authors
• National scale
• Benefits:
– Improve searching for electronic publications
– Integrate searching for electronic and non-electronic publications
– Link Library (Catalogue) and research environment (Metis)
4
Two projects
• Pilot in 2005 – 2006
– one university: Groningen
• Roll-out 2006 – 2007
– 13 academic research organizations
• Project leader: Anneloes Degenaar
• DAI website at University of Groningen:
– http://dai.weblog.ub.rug.nl/
– http://dai-uitrol.ub.rug.nl/
5
Organizations involved in DAI
• 13 universities + CWI + KNAW
• SURF
• UCI
• OCLC PICA
6
Systems involved
• Institutional repository / DAREnet
• Metis
• Dutch Union Catalogue (NCC/PiCarta)
7
Institutional Repositories / DAREnet
8
Institutional Repositories / DAREnet
9
METIS
10
METIS
11
METIS
12
National Union Catalogue
13
Shared Cataloguing System (GGC)
14
Shared Cataloguing System (GGC)
15
Names and other issues
• Authors with the same name
• Use of one or more initials
• Changing names
• Spelling variants
• Diacritics
• Pseudonyms
• Name in religion
• Nicknames
• Collective names
• Different structure of names in other languages and cultures
• …
• Discussions on standardization and unification started in the Netherlands in the Orion project (2003-2004)
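A minimal sketch of why several of these issues (diacritics, spelling variants, one or more initials) complicate matching, and why a key alone is not enough. The function names and the example name are hypothetical and not part of the DAI project; the key only groups candidate variants, and authors who share a name still collide, which is the case a unique identifying number addresses.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop combining marks (e.g. 'ü' -> 'u')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_key(surname: str, initials: str) -> str:
    """Hypothetical matching key: folded surname plus the first initial only."""
    folded = strip_diacritics(surname).casefold()
    letters = [ch for ch in strip_diacritics(initials).casefold() if ch.isalpha()]
    first_initial = letters[0] if letters else ""
    return f"{folded}|{first_initial}"


# Three spellings of one (invented) author name collapse to a single key.
variants = [("Müller", "J.P."), ("Muller", "J."), ("MULLER", "JP")]
keys = {match_key(surname, initials) for surname, initials in variants}
print(keys)
```

Two different people called "J. Muller" would receive the same key, so such a key can only propose candidates for manual or identifier-based resolution.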
16
Proposed solution
• Need established
• “External” requirements:
– use existing mechanisms
– local management
– national function
• Solution: use “collocation” mechanism of libraries and Metis as source
17
Cataloguing and Metis
Diagram labels: Cataloguing, Metis, GGC, Repository, NTA
18
Use authority records (NTA) in Metis
Diagram labels: Metis, GGC, Repository, NTA, Cataloguing, CWI
19
How did we link
• Mechanisms
– Initial load per organization
– Online input buttons (web templates)
– XML output
– Synchronization mechanisms
• Requirements
– No overwrite of library data!
– Deduplication (matching/merging)
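The "no overwrite" requirement can be illustrated with a small sketch. It assumes records are flat dictionaries, which is a simplification for illustration only; the field names and values are hypothetical and do not reflect the actual GGC or Metis formats.

```python
from typing import Dict, List, Tuple


def add_metis_fields(
    library_record: Dict[str, str], metis_fields: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Add Metis fields to a copy of the library record.

    Fields already present in the library record are never overwritten;
    their names are returned so that staff can review the differences.
    """
    merged = dict(library_record)  # work on a copy, leave the input untouched
    skipped = []
    for name, value in metis_fields.items():
        if name in merged:
            skipped.append(name)  # library data wins
        else:
            merged[name] = value
    return merged, skipped


# Hypothetical example data.
library = {"name": "Jansen, P.", "date_of_birth": "1960"}
metis = {"name": "Jansen, Piet", "metis_name": "P. Jansen", "organisation_code": "X1"}

merged, skipped = add_metis_fields(library, metis)
print(merged["name"], skipped)
```

Keeping the Metis name in its own field, rather than replacing the library heading, lets both forms coexist in one record.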
20
Data model developed
• Data model copied from bibliographic model: three levels
• Metis name-information added to library data; no overwrite
• Affiliations and other fields added
21
Structure of bibliographic data
• General level: bibliographic metadata (YoP / LoP / Title / Author / Imprint / LCSH / DDC), with linked authority record
• Local level: Groningen bibdat (subject headings), Amsterdam bibdat (subject headings)
• Copy level: location, holding, shelf number
22
Structure of authority data
• Library record level: thesaurus record (name of author, variant names), with linked authority record
• Metis level: Groningen data (Metis name), Amsterdam data (Metis name)
• Affiliation level: affiliation (begin, end)
23
Example authority record + added fields
Library data
Metis Researcher Name
Affiliation data
24
Example authority record + added fields
25
Data model: fields
Authority file
• Nationality
• Language
• Name (best known)
• Name (most complete)
• Maiden name
• Name variants
• Date of birth
• Date of death
• Profession / subject
• Link to pseudonyms
• Notes
• Entry date
• Update date
• Note: proper name field includes subfields for first name, middle name, last name, prefix, suffix
Added fields
• Local researcher number
• Metis name (preferred)
• Metis name
• Sex
• Code organisation
• Name organisation
• Start date employment
• End date employment
• Code function
• Description of function
• Code of employment
• Notes
• Entry date
• Update date
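The three levels (library record, Metis data per organization, affiliations) and a selection of the fields above could be modelled roughly as follows. This is an illustrative sketch only: the class names, attribute names, and all example values (including the DAI) are hypothetical and do not reproduce the actual record format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Affiliation:
    """Affiliation level: one period of employment."""
    organisation_code: str
    organisation_name: str
    start_date: str
    end_date: Optional[str] = None
    function_code: Optional[str] = None
    function_description: Optional[str] = None


@dataclass
class MetisData:
    """Metis level: data supplied by one organization."""
    local_researcher_number: str
    metis_name_preferred: str
    sex: Optional[str] = None
    affiliations: List[Affiliation] = field(default_factory=list)


@dataclass
class AuthorityRecord:
    """Library record level: the authority record that carries the DAI."""
    dai: str
    name_best_known: str
    name_variants: List[str] = field(default_factory=list)
    date_of_birth: Optional[str] = None
    metis_data: List[MetisData] = field(default_factory=list)


# Hypothetical example: one author with Metis data from one organization.
record = AuthorityRecord(
    dai="000000000",  # placeholder, not a real DAI
    name_best_known="Example, A.",
    name_variants=["Example, Anna"],
    metis_data=[
        MetisData(
            local_researcher_number="00001",
            metis_name_preferred="A. Example",
            affiliations=[Affiliation("ORG1", "Example Department", "01-01-2005")],
        )
    ],
)
print(record.metis_data[0].affiliations[0].organisation_name)
```

Because the Metis data hangs below the library record as a separate list, a second organization can add its own entry without touching the library fields.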
26
Initial load
Metis makes list of names
Manual dedup of list
Dedup in Metis
Make Metis export
Format conversion
Load B-records (? Duplicates?)
Export DAIs to Metis
Manual dedup by library staff
Load DAI in Metis
Load new names(not found)
Merge names with names found
Match names with auth file
27
Initial load
• Data enrichment in Metis
• Export from Metis
• Conversion to cataloguing system
• Matching
• Merging: merge / new / B-record
• Results depend on the quality of the metadata
– 95% automatic / 5% manual
– 70% automatic / 30% manual
– 50% automatic / 50% manual
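The matching step and the automatic/manual split can be sketched as below. The rule used here (exactly one candidate means merge, none means new, several means manual review) is an assumption for illustration; the slides do not state the actual matching rules or how B-records are assigned, so ambiguous cases are simply routed to manual handling. All keys and identifiers are invented.

```python
from typing import Dict, List


def classify(name_key: str, authority_index: Dict[str, List[str]]) -> str:
    """Classify one exported Metis name against the authority file index."""
    candidates = authority_index.get(name_key, [])
    if len(candidates) == 1:
        return "merge"   # unambiguous match: merge with the record found
    if not candidates:
        return "new"     # not found: load as a new name
    return "manual"      # several candidates: needs library staff


# Hypothetical index: matching key -> identifiers of authority records.
authority_index = {
    "jansen|p": ["auth-001"],
    "vries|a": ["auth-002", "auth-003"],
    "smit|k": ["auth-004"],
}
metis_names = ["jansen|p", "bakker|m", "vries|a", "smit|k"]

outcomes = [classify(name, authority_index) for name in metis_names]
automatic_share = 100 * sum(o != "manual" for o in outcomes) / len(outcomes)
print(outcomes, automatic_share)
```

The more ambiguous or incomplete the source names are, the more cases fall into the manual branch, which is one way to read the spread between the 95/5 and 50/50 results.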
28
Online process
• DAI-button in Metis to create DAI-number
• Export DAI-button in NTA/cataloguing system to Metis
• DAI-button in IR to create DAI-number
• Separate DAI-http-request for online input
• Online input via current cataloguing tool
• + Offline synchronization mechanisms between Metis and NTA
29
DAI-button in Metis
30
URL link instead of button
• http://www.pica.nl/dai/dai_redirect.php?action=maak_dai&user=<usernumber>&metis_export_url=http://oras.service.rug.nl:1111/metisdad&p_onderzoekernummer=00033&p_naam_medewerker=Rotteveel&p_voorletter=R&p_voorvoegsel=&p_titulatuur=&p_voorkeur=J&p_geslacht=M&p_geboortedatum=01-07-1974&p_code_functie=20&p_functie=Universitair%20hoofddocent&p_code_organisatie=22020200&p_organisatie_a=Medical%20Microbiology&p_begin_aanstelling=01-01-2005&p_einde_aanstelling=01-01-2006
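To make the request easier to read, the query string of the URL above can be split into its parameters with Python's standard library. The English glosses of the Dutch parameter names are my own translation, not taken from the slides, and the sketch only parses the URL; it does not send a request.

```python
from urllib.parse import urlsplit, parse_qs

# Request URL as shown on the slide; <usernumber> is the slide's own placeholder.
DAI_URL = (
    "http://www.pica.nl/dai/dai_redirect.php?action=maak_dai&user=<usernumber>"
    "&metis_export_url=http://oras.service.rug.nl:1111/metisdad"
    "&p_onderzoekernummer=00033&p_naam_medewerker=Rotteveel&p_voorletter=R"
    "&p_voorvoegsel=&p_titulatuur=&p_voorkeur=J&p_geslacht=M"
    "&p_geboortedatum=01-07-1974&p_code_functie=20"
    "&p_functie=Universitair%20hoofddocent&p_code_organisatie=22020200"
    "&p_organisatie_a=Medical%20Microbiology&p_begin_aanstelling=01-01-2005"
    "&p_einde_aanstelling=01-01-2006"
)

# English glosses (translator's assumption, not from the slides).
GLOSS = {
    "p_onderzoekernummer": "researcher number",
    "p_naam_medewerker": "employee name",
    "p_voorletter": "initial",
    "p_voorvoegsel": "name prefix",
    "p_titulatuur": "title",
    "p_voorkeur": "preferred",
    "p_geslacht": "sex",
    "p_geboortedatum": "date of birth",
    "p_code_functie": "function code",
    "p_functie": "function",
    "p_code_organisatie": "organisation code",
    "p_organisatie_a": "organisation",
    "p_begin_aanstelling": "start of appointment",
    "p_einde_aanstelling": "end of appointment",
}

query = urlsplit(DAI_URL).query
# keep_blank_values=True preserves empty parameters such as p_voorvoegsel.
fields = {key: values[0] for key, values in parse_qs(query, keep_blank_values=True).items()}

for key, value in fields.items():
    print(f"{key} ({GLOSS.get(key, key)}): {value!r}")
```

The `action=maak_dai` parameter ("make DAI") requests a new DAI, and the `p_` parameters carry the Metis fields listed in the data model.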
31
Input form for Metis fields
32
Results of the DAI project
• Now:
– 50% of the researchers have a DAI
– Procedure for initial load in place
– Start with online procedure
– Privacy statement
• Autumn 2007
– Online procedure in place
– Procedure for synchronization in place
– 100% of the researchers will have a DAI in 2007 (ca. 40,000)
33
Things to do
• Finalize the roll-out, develop services (passport …) and implement a user group
• Add DAI in metadata standards (DCX, MODS)
• International standardisation: ISPI
• Involve authors for control and updating
34
Concluding remark
35
• Thanks