Oliver Spits Out a Finding Aid Using CONTENTdm with a Database Susan Hamburger, Ph.D.
1
Oliver Spits Out a Finding Aid
Using CONTENTdm with a Database
Susan Hamburger, Ph.D.
Penn State University Libraries
Society of American Archivists, August 30, 2007
2
Background
“Oliver” homegrown Oracle platform database
  FileMaker Pro → MS Access → SQL → Oracle
  Merging of five databases into one
  Rudimentary export of container list
  No biog/hist, scope and content narratives
  Working on exporting EAD-tagged data
Finding aids created individually with XMetaL
  From scratch
  Stitch together database export of container info plus MS Word for narrative
3
Questions for Discussion
How can we automate generating EADs from Oliver?
How can we provide a federated search tool for finding aids?
What software is out there to use?
4
Considerations
Special Collections staff doesn’t know EAD
  New Processing Coordinator knows EAD
  Manuscripts Cataloger creates EAD finding aids
Library’s Information Technology (I-Tech) staff doesn’t know EAD, barely literate with XML and XSLT
Library Dean doesn’t want I-Tech to do development → find out-of-the-box solution
5
Ease of input of finding aid: Deal breaker, Priority 1 (when we go online), Priority 2 (within two years)
Search multiple or single EAD fields including ALL of the following: unittitle, persname, corpname, genreform, famname, subject, scopecontent, bioghist, unit within Special Collections Deal breaker
Search results will display the EAD fields: unittitle, unitdate, extent, biography, abstract Deal breaker
link to an outline view Deal breaker
link to full/print view Deal breaker
highlight the keyword in context in that display Deal breaker
Display finding aids using XSLT in outline and full text view with keyword highlights Deal breaker
Single and batch input Priority 1a
Search, results display, and finding aid display should be customized/customizable to suit Special Collections Priority 1a
Moderate to high performance in speed, usability and site navigation equal to The CAT Priority 1b
Full text searching Priority 1b
Mark results list for bookbag Priority 1c
Mark results list for email Priority 1c
Mark results list for print Priority 1c
User can sort results by: frequency, author, title, date of collections Priority 1c
Refine search—means performing an additional search on the current set of records Priority 2a
Paraprofessional input Priority 2a
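The deal-breaker criteria above (search selected EAD fields, show a brief display, highlight the keyword in context) can be illustrated with a minimal Python sketch. It is not how any of the evaluated products work. The sample record, the `**…**` highlight markers, and the function names are assumptions made for illustration, and only the standard library is used.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated EAD 2002 record (illustration only).
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example family papers</unittitle>
      <unitdate>1900-1950</unitdate>
      <physdesc><extent>3 cubic feet</extent></physdesc>
      <abstract>Letters and diaries of an example family.</abstract>
    </did>
    <bioghist><p>The example family farmed in Pennsylvania.</p></bioghist>
    <scopecontent><p>Contains letters, diaries, and photographs.</p></scopecontent>
  </archdesc>
</ead>"""

# EAD elements to search, and elements shown in the brief results display.
SEARCH_FIELDS = ("unittitle", "persname", "corpname", "genreform",
                 "famname", "subject", "scopecontent", "bioghist")
DISPLAY_FIELDS = ("unittitle", "unitdate", "extent", "abstract")


def field_text(root, tag):
    """Join the text of every element named `tag`, normalizing whitespace."""
    parts = ["".join(el.itertext()) for el in root.iter(tag)]
    return " ".join(" ".join(parts).split())


def search(root, keyword, fields=SEARCH_FIELDS):
    """Return {field: text with keyword highlighted} for each matching field."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = {}
    for tag in fields:
        text = field_text(root, tag)
        if pattern.search(text):
            # Wrap each match in ** ** as a stand-in for real highlighting.
            hits[tag] = pattern.sub(lambda m: f"**{m.group(0)}**", text)
    return hits


def brief_display(root):
    """Collect the fields a results list would show for one finding aid."""
    return {tag: field_text(root, tag) for tag in DISPLAY_FIELDS}


root = ET.fromstring(SAMPLE_EAD)
hits = search(root, "diaries")
print(hits)
print(brief_display(root))
```

A production system would index the fields rather than scan each file per query; the sketch only shows which EAD elements feed the search and which feed the results display.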
6
Search results will:
highlight all keywords in context on a separate frame Priority 2b
Display large result sets by alphabetical chunks Priority 2b
display the finding aid’s file size Priority 2c
Search the date field separately Priority 2c
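The "alphabetical chunks" criterion above can be sketched in a few lines of Python. The collection titles and the `alpha_chunks` helper are hypothetical; the point is only to show grouping a large result set by initial letter instead of by number ranges.

```python
from itertools import groupby


def alpha_chunks(titles):
    """Group titles by their first letter, ignoring case and empty titles."""
    ordered = sorted((t for t in titles if t), key=str.casefold)
    return {
        letter: list(group)
        for letter, group in groupby(ordered, key=lambda t: t[0].upper())
    }


# Hypothetical result set (not real collection names).
results = ["Baker family papers", "Adams letters", "allen diaries", "Brown collection"]
chunks = alpha_chunks(results)
navigation = "|".join(chunks)  # intermediate navigation bar, e.g. A|B
print(navigation)
print(chunks)
```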
7
Task Force
Finding Aid Platform working group formed
3 from I-Tech, 2 from Special Collections, Manuscripts Cataloger, 1 from Digital Library Technologies (DLT)
Charge: To find and evaluate existing products that meet our criteria and make a recommendation for implementation
8
Methodology
Survey marketplace
  Informal queries
    Society of American Archivists annual meeting
    RLG conference
    Posting the question on the archives listserv
    Searched academic libraries’ websites
  The survey results consisted of five potential products:
    Archon
    Archivists’ Toolkit
    CONTENTdm v. 4.2
    DLXS v. 12
    XTF
9
Methodology
Assess and evaluate the products and determine costs
  Create a comprehensive set of prioritized criteria for search and display and compatibility with the Library computing environment
  Two-member groups evaluated products against the criteria
  Populate evaluation matrix
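One way to picture the evaluation matrix is the Python sketch below. It is an illustration under stated assumptions, not the working group's actual matrix. The product names and pass/fail values are made-up placeholders, and the weighting rule (priority 1 counts six points, priority 6 counts one) is hypothetical. The criteria names and the 1 = Required to 6 = Desired scale come from the prioritized criteria list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    priority: int  # 1 = required ... 6 = desired


def weight(priority):
    """Hypothetical weighting: priority 1 scores 6 points, priority 6 scores 1."""
    return 7 - priority


def evaluate(criteria, met):
    """Return (unmet required criteria, weighted score) for one product.

    `met` maps a criterion name to True/False; missing names count as unmet.
    """
    missing_required = [
        c.name for c in criteria if c.priority == 1 and not met.get(c.name, False)
    ]
    score = sum(weight(c.priority) for c in criteria if met.get(c.name, False))
    return missing_required, score


criteria = [
    Criterion("Product support", 1),
    Criterion("Output includes both outline view and full view", 1),
    Criterion("Supports Unicode", 2),
    Criterion("Search by date", 6),
]

# Placeholder results for two hypothetical products (not real findings).
matrix = {
    "Product A": {
        "Product support": True,
        "Output includes both outline view and full view": True,
        "Supports Unicode": False,
        "Search by date": True,
    },
    "Product B": {
        "Product support": False,
        "Output includes both outline view and full view": True,
        "Supports Unicode": True,
        "Search by date": True,
    },
}

summary = {name: evaluate(criteria, met) for name, met in matrix.items()}
for name, (missing, score) in summary.items():
    status = "fails required: " + ", ".join(missing) if missing else "meets all required"
    print(f"{name}: score {score}, {status}")
```

Keeping the required criteria as a separate pass/fail check, rather than folding them into the score, mirrors the deal-breaker idea: a product that misses a priority 1 item is out regardless of its total.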
10
Prioritized Criteria List
Graduated criteria: 1 = Required to 6 = Desired
Back End
2 Supports Unicode
2 Ability to load full and minimal finding aids
2 Easily customizable end user output/display
Specifications
5 Back end user tools for data load, maintenance
1 Product support
Licensing issues?
11
Prioritized Criteria List
Graduated criteria: 1 = Required to 6 = Desired
Rights Management
3 Authorization at collection level and field level
Search Functionality
3 Full text searchable across finding aids as a whole
3 Keyword searchable across multiple, selected fields
3 Search across all collections in system or across pre-determined subsets of collections
2 Browse collections
3 Search format and index terms
6 Search by date
3 Persistent navigation (prefer static outline view while scrolling through finding aid)
4 PURLs to individual finding aids
12
Prioritized Criteria List
Graduated criteria: 1 = Required to 6 = Desired
End User Output
Export output/download METS/MODS/Dublin Core
1 Output includes both outline view and full view
1 Search term highlighted in results list (brief) and full finding aid view
1 Search results display 4 EAD fields: unittitle, unitdate, extent, abstract
6 Large result set, represented in alpha list as intermediate navigation rather than number ranges (e.g., A|B|C …) vs. (1-300, 301-500, etc.)
4 Results sorted by relevance, and author, title
6 Results sorted by date
4 Save marked list from result set
4 Print, review, email, etc. from marked list
3 Refine search from results list
6 Display finding aid file size
6 ADA Compliant (AD54)
13
Prioritized Criteria List
Other desirables, not prioritized
Discovery/Sharing
  OAI Harvesting
  Findable/crawlable by RLG spiders, etc.
  Findable by Google, etc.
  Compatible within Course Management tools
  Supports inter-institutional sharing of collections/items, etc.
  Individual contributions of material to library collections (p2p-like)
  Federated search support
  Ability to add link to CAT record from Finding Aid metadata
14
15
Evaluation of Software
Archon and Archivists’ Toolkit did not meet critical search and display criteria
XTF did not meet criteria for technical support
DLXS met all criteria established for search and display, but would require significant local development to meet criteria for back-end dispersed processing of finding aids (e.g., non-technical staff at any location can process material)
The current version of CONTENTdm, v.4.2, would require significant local development to accommodate the large Special Collections finding aids
Discussion with the developer at CONTENTdm revealed that an improved version of CONTENTdm, which fully supports XML ingest, indexing, output and large field sizes, is in development and will be announced this summer
16
Recommendations
Continue to develop the export function to generate valid EAD finding aids from Oliver
Participate in the development of CONTENTdm v.n βeta
Evaluate CONTENTdm v.n βeta at its production release against defined criteria
  If the production release meets our criteria
    Implement CONTENTdm as our production system by January 2008
  If the production release does not meet our criteria
    Recommend a revised investigation of existing/new products
17
Projected Plan
Work with CONTENTdm in the βeta trial of CONTENTdm v.n and launch digital finding aids in January 2008 at the latest
Timeline:
  February – July 2007
    Create Best Practices Guidelines
    Clean up data in Oliver
    Develop export tools to generate EAD finding aids
    Develop XSLT stylesheets
  July – November 2007
    Work with CONTENTdm on development and βeta testing of new release
    Submit our list of criteria to CONTENTdm as they initiate their development
  January 2008
    Launch next release of CONTENTdm and put finding aids into production or revert to backup plan
If βeta version fails to meet expected timeline or criteria, especially for ingest, XML mapping, and large field size, platform project evaluation team will confirm these circumstances with OCLC and reevaluate available platform products against existing criteria and recommend to IT Priorities
18
Ongoing Development
Usability testing
Continued modification of output style sheets in response to usability testing recommendations
Regular scan of marketplace to monitor new products
19
Resources Needed
Staffing
  Implementation team with representatives from I-Tech and Special Collections to move this plan forward
  Digital Library Technologies support will be required if the βeta release is available to be installed locally on a development server
20
Resources Needed
Training
  Oracle 10g: XML Fundamentals training to support the work involved in extracting EAD2002 XML finding aids out of the Oliver database
  XSLT refresher training may be needed for I-Tech personnel
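To give a sense of the extraction work this training supports, here is a minimal Python sketch that turns container-list rows into an EAD 2002 `<dsc>` fragment. It is not the Oliver export itself. The row layout (`box`, `folder`, `title`, `date`) is an assumed schema, the rows are hardcoded rather than read from Oracle, and the output is only a fragment that would still need the header and narrative elements before it could validate as a full finding aid.

```python
import xml.etree.ElementTree as ET

# Hypothetical container-list rows; a real export would query the database.
rows = [
    {"box": "1", "folder": "1", "title": "Correspondence", "date": "1901-1910"},
    {"box": "1", "folder": "2", "title": "Diaries", "date": "1911"},
]


def build_dsc(rows):
    """Build an EAD 2002 <dsc> element with one file-level <c01> per row."""
    dsc = ET.Element("dsc", {"type": "in-depth"})
    for row in rows:
        c01 = ET.SubElement(dsc, "c01", {"level": "file"})
        did = ET.SubElement(c01, "did")
        ET.SubElement(did, "container", {"type": "box"}).text = row["box"]
        ET.SubElement(did, "container", {"type": "folder"}).text = row["folder"]
        ET.SubElement(did, "unittitle").text = row["title"]
        ET.SubElement(did, "unitdate").text = row["date"]
    return dsc


dsc = build_dsc(rows)
xml_text = ET.tostring(dsc, encoding="unicode")
print(xml_text)
```

Building the tree with an XML library rather than concatenating strings means special characters in titles (such as `&`) are escaped automatically.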
21
Conclusion
Because of infrastructure and policies, we had to select the product with the least amount of customization and programming
  DLXS is hard to ingest, but looks good and functions well
  CONTENTdm is easy to ingest, but doesn’t have functionality required for finding aids
  Open source software requires dedicated staff with expertise we don’t have
22
Recommendations
Determine your needs
Systematically evaluate products
Have a timeline goal for decision making
Know your technical limitations
Include key personnel in planning
23
Contact
Susan Hamburger, Ph.D.
The Pennsylvania State University
Paterno Library
Cataloging and Metadata Services
University Park, PA
[email protected]
814-863-7293
http://www.personal.psu.edu/sxh36/
24
Here are Oliver and some documents
Oliver database
Special Collections finding aids Web pages
http://www.lias.psu.edu/speccolls/FindingAids/findaids.htm
http://www.lias.psu.edu/speccolls/FindingAids/subjectlist.html
http://www.lias.psu.edu/speccolls/FindingAids/american.html
http://www.lias.psu.edu/speccolls/FindingAids/ohara.frame.html
Finding Aids Platform Product Details [MS Word → HTML document]
25
26
27
28
29
30
Finding Aids Platform Product Details
Appendix A: http://www.personal.psu.edu/sxh36/appendixa.htm