EXtensible Catalog Software Portfolio Ben Anderson, Software Engineer, XCO.

Click here to load reader

  • date post

    18-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    4

Embed Size (px)

Transcript of EXtensible Catalog Software Portfolio Ben Anderson, Software Engineer, XCO.

  • Slide 1
  • eXtensible Catalog Software Portfolio Ben Anderson, Software Engineer, XCO
  • Slide 2
  • eXtensible Catalog is 2 eXtensible Catalog (XC) is open source, user-centered, next generation software for libraries. XC provides a discovery interface and a set of tools for libraries to manage metadata and build applications.
  • Slide 3
  • Software Overview User Interface: Faceted, FRBRized, customizable search interface Web application framework for libraries Metadata Tools: Automated metadata processing: Enable libraries to aggregate metadata and run services on it XC Schema New XML schema with Dublin Core terms, RDA elements and roles, MARC vocabularies, and XC elements FRBR levels +: Work, Expression, Manifestation, Holdings, Item Connectivity Tools: Harvest and synchronize metadata with OAI-PMH Circulation and authentication with NCIP 3 XC Metadata Services Toolkit Metadata Tools Drupal CMS XC Drupal Toolkit User Interface ILS XC OAI Toolkit XC NCIP Toolkit Connectivity Tools Metadata Application Profile XC Schema
  • Slide 4
  • Partners and Contributors University of Rochester The Andrew W. Mellon Foundation Consortium of Academic and Research Libraries in Illinois (CARLI) University of Notre Dame Rochester Institute of Technology Kyushu University working with NTT-Data University of North Carolina at Charlotte Serials Solutions OCLC University at Buffalo Cornell University Yale University Ohio State University Nylink 4
  • Slide 5
  • Partners and Contributors 5
  • Slide 6
  • Note from Takanori HAYASHI Hi, folks, As all know, great earthquake and tsunami attack East Japan Area (Also my library). Many libraries has earthquake destruct. Now, Code4Lib Japan members and other librarians manage and distribute of destruct information about any libraries. http://www45.atwiki.jp/savelibrary/pages/1.html NHK is broadcasting Japanese information by Ustream http://www.ustream.tv/channel/nhk-world-tv Google provide Crisis response service, if your friend in Japan status is unknown, I recommend use Person Finder and Disaster message boards provide by telephone careers. http://www.google.co.jp/intl/en/crisisresponse/japanquake2011.html And send response to Japanese librarian by twitter, using hachtag #jishinlib Thank you for very kindness message from all member of c4l by twitter or other, and sending rescue team form USA and other countries. Please don't worry, hacker is never die :-) -- Takanori HAYASHI Agriculture, Forestry and Fisheries Research Information Technology Center http://www45.atwiki.jp/savelibrary/pages/1.html http://www.ustream.tv/channel/nhk-world-tv http://www.google.co.jp/intl/en/crisisresponse/japanquake2011.html 6
  • Slide 7
  • eXtensible Catalog User Interface XC Drupal Toolkit
  • Slide 8
  • Discovery interface and library web application platform in one Faceted, FRBRized search of XC Schema metadata Extensive, easy customization Established open source community Data-driven web applicatons with web forms 8 Drupal CMS XC Drupal Toolkit User Interface
  • Slide 9
  • 9
  • Slide 10
  • 10
  • Slide 11
  • 11
  • Slide 12
  • 12
  • Slide 13
  • 13
  • Slide 14
  • 14
  • Slide 15
  • 15
  • Slide 16
  • 16 The XC Schema combines metadata fields from multiple standard schemas (RDA and DC) plus adds new XC schema elements.
  • Slide 17
  • 17
  • Slide 18
  • XC Schema Fields New XML metadata schema Dublin Core terms RDA - 22 elements and 11 roles designators XC elements (contain MARC vocabularies, linking fields, etc) Subset of RDA chosen to retain the granularity in current MARC data: Frequency Numbering of Serials Coordinates of Cartographic Content Plate number (music) 18 DCMI RDA XC
  • Slide 19
  • XC Schema Record Types FRBR Group 1 Entities + Work Expression Manifestation Holdings Item 19
  • Slide 20
  • Creating XC Schema data from MARC 20 MARCXML Bibliographic XC Work XC Expression XC Manifestation XC Holdings Parse MARCXML records into linked FRBR-based records MARC Holdings records produce XC Holdings records (to preserve MARC granularity) All XC records have globally unique identifiers, and a permanent host repository Uplinks created MARCXML Holdings OO4 Uplink Manifestation Held Expression Manifested Work Expressed
  • Slide 21
  • 21
  • Slide 22
  • 22
  • Slide 23
  • 23
  • Slide 24
  • 24
  • Slide 25
  • 25
  • Slide 26
  • 26
  • Slide 27
  • 27
  • Slide 28
  • 28
  • Slide 29
  • 29
  • Slide 30
  • 30
  • Slide 31
  • 31
  • Slide 32
  • 32
  • Slide 33
  • 33
  • Slide 34
  • 34
  • Slide 35
  • 35
  • Slide 36
  • 36
  • Slide 37
  • 37
  • Slide 38
  • 38
  • Slide 39
  • 39
  • Slide 40
  • 40
  • Slide 41
  • 41
  • Slide 42
  • 42
  • Slide 43
  • 43
  • Slide 44
  • 44
  • Slide 45
  • 45
  • Slide 46
  • 46
  • Slide 47
  • 47
  • Slide 48
  • 48
  • Slide 49
  • 49
  • Slide 50
  • 50
  • Slide 51
  • 51
  • Slide 52
  • 52
  • Slide 53
  • 53
  • Slide 54
  • 54
  • Slide 55
  • 55
  • Slide 56
  • 56
  • Slide 57
  • 57
  • Slide 58
  • 58
  • Slide 59
  • 59
  • Slide 60
  • 60
  • Slide 61
  • 61
  • Slide 62
  • 62
  • Slide 63
  • 63
  • Slide 64
  • 64
  • Slide 65
  • 65
  • Slide 66
  • 66
  • Slide 67
  • 67
  • Slide 68
  • 68
  • Slide 69
  • 69
  • Slide 70
  • 70
  • Slide 71
  • 71
  • Slide 72
  • 72
  • Slide 73
  • 73
  • Slide 74
  • 74
  • Slide 75
  • 75
  • Slide 76
  • 76
  • Slide 77
  • 77
  • Slide 78
  • 78
  • Slide 79
  • 79
  • Slide 80
  • 80
  • Slide 81
  • 81
  • Slide 82
  • 82
  • Slide 83
  • 83
  • Slide 84
  • 84
  • Slide 85
  • 85
  • Slide 86
  • 86
  • Slide 87
  • 87
  • Slide 88
  • 88
  • Slide 89
  • 89
  • Slide 90
  • 90
  • Slide 91
  • 91
  • Slide 92
  • 92
  • Slide 93
  • 93
  • Slide 94
  • 94
  • Slide 95
  • 95
  • Slide 96
  • 96
  • Slide 97
  • 97
  • Slide 98
  • New Default Theme 98
  • Slide 99
  • 99
  • Slide 100
  • 100
  • Slide 101
  • 101
  • Slide 102
  • Themes: Kyushu University Library 102
  • Slide 103
  • Kyushu - Search results in Japanese Reasons why these items are shown Query : America Japan Translated : Faceted navigation 103
  • Slide 104
  • Anyone in WNYLRC using Drupal? 104
  • Slide 105
  • YES! Cattaraugus- Allegany-Erie-Wyoming BOCES, SLS Jamestown Community College (Jamestown) Jamestown Community College (Olean) Roswell Park Cancer Institute Corp. +others that are cloaking or using drupal elsewhere besides the main web site 105
  • Slide 106
  • Metadata Metadata Issues and XC Metadata Management Tools
  • Slide 107
  • Metadata Issues: Silos Users have many starting points for search because all the data is not available in a single system: Integrated Library Systems Institutional Repositories Webpages Subscription Databases Libraries dont have good options for searching across all of these sources Usability Silo Quality Format Silo
  • Slide 108
  • Metadata Issues: Quality ILS MARC export issues Cataloging errors and variant practices End-user generated metadata Lack of authority control Libraries dont have good options for making use of data at a range of quality levels Usability Silo Quality Format Quality
  • Slide 109
  • Metadata Issues: Format MARC format is everywhere but does not support current metadata needs Multiple formats are useful to describe a range of resources, but difficult to search across consistently Libraries dont have good options to try out new standards like RDA Usability Silo Quality Format Format
  • Slide 110
  • Metadata Issues: Usability ILS OPAC interfaces are deficient in: Ease of learning / ease of use Precision and recall Finding similar and related resources Usability Silo Quality Format Usability
  • Slide 111
  • Other XC records M M W M M E M M M 5. Index4. Aggregate3. Transform Following one MARC record through XC Steps: 1.Convert from raw MARC to MARCXML (minor cleanup) 2.Normalize MARCXML (major cleanup) 3.Transform from MARCXML to XC (FRBRize) 4.Aggregate at each FRBR level (match and merge) 5.Index records / create WEMs (one for each unique Manifestation) 111 MARC MARCXML (dirty) MARCXML (clean) W E M XC 2. Normalize1. Convert WEM Index Data is ready for search and faceted browse XC merge W E M match ? ? ? 5. Index4. Aggregate3. Transform2. Normalize1. Convert
  • Slide 112
  • Drupal CMS XC Metadata Services Toolkit XC Software Components ILS 112 XC OAI Toolkit 5. Index4. Aggregate3. Transform2. Normalize1. Convert MARCXML Normalization MARCXML Normalization MARCXML to XC Transformation MARCXML to XC Transformation XC Aggregation XC Drupal Toolkit Usability Silo Quality Format Usability Silo Quality Format Usability Silo Quality Format Quality Format Usability 5. Index4. Aggregate3. Transform2. Normalize1. Convert Format Silo Quality Metadata Issue Handling Usability Silo
  • Slide 113
  • XC Software Components ILS 113 1. Convert Usability Silo Quality Format 1. Convert Format Silo Quality XC OAI Toolkit Expose ILS metadata to XCs next generation catalog interface and metadata tools Synchronize ongoing changes in ILS records with XC software automatically Convert raw MARC into MARCXML Address data and identifier issues Compatible with most ILSs XC OAI Toolkit
  • Slide 114
  • XC Metadata Services Toolkit New type of staff client for processing large batches of metadata through an orchestrated set of services. Harvest from multiple sources (silos) to address format and quality issues. Aggregate and de-dupe metadata. Automatic synchronization propagates changes in source metadata through services and on to discovery interface. XC Metadata Services Toolkit XC Software Components 114 4. Aggregate3. Transform2. Normalize MARCXML Normalization MARCXML Normalization MARCXML to XC Transformation MARCXML to XC Transformation XC Aggregation
  • Slide 115
  • XC Metadata Services Toolkit XC Software Components 115 4. Aggregate3. Transform2. Normalize MARCXML Normalization MARCXML Normalization Usability Silo Quality Format Usability Quality MARCXML Normalization Service Transform language codes to spelled-out languages: e.g. fre becomes French Normalize forms of OCLC numbers so that they are all the same Substitute vocabulary terms for format/type of material codes in the MARC record (Leader, 006, 007, 008) to enable building facet values Substitute codes for audience level (juvenile, etc.) and type of material (fiction, non-fiction; identifies dissertation/thesis) Deconstructs LC Subject headings so we can map parts of them to different facets: geographic, genre, topic, etc.
  • Slide 116
  • XC Metadata Services Toolkit XC Software Components 116 4. Aggregate3. Transform2. Normalize MARCXML to XC Transformation MARCXML to XC Transformation Usability Silo Quality Format Usability 3. Transform MARCXML to XC Transformation Service Parse flat MARC records to create linked FRBR-based records (work, expression, etc.) in XC Schema One input record results in several output records Manage relationships between records, including one to many relationships Creates multiple work and expression records for analytics Handles bound-withs (e.g. two books bound together)
  • Slide 117
  • XC Metadata Services Toolkit XC Software Components 117 4. Aggregate3. Transform2. Normalize XC Aggregation Usability Silo Quality Format Usability 4. Aggregate XC Aggregation Service Aggregate records that represent the same resource at: Manifestation-level Work-level (depends on Authority service) Manage relationships between records (FRBR entities, etc.) Enable automated synchronization of updates for records at each FRBR level Sets stage for future non-MARC RDA implementation
  • Slide 118
  • XC Metadata Services Toolkit Drupal CMS eXtensible Metadata Services ILS 118 XC OAI Toolkit 5. Index4. Aggregate3. Transform2. Normalize1. Convert MARCXML Normalization MARCXML Normalization MARCXML to XC Transformation MARCXML to XC Transformation XC Aggregation XC Drupal Toolkit 5. Index4. Aggregate3. Transform2. Normalize1. Convert DC / Qualified DC Normalization DC / Qualified DC Normalization DC to XC Transformation DC to XC Transformation MARCXML / XC Authority DSpace Normalization to Transformation XC to RDF (Linked data out) XC to RDF (Linked data out)
  • Slide 119
  • XC Drupal Toolkit Adds support for library metadata into Drupal (DC and XC schemas) Apache SOLR index of WEMs to enable faceted, FRBRized results navigation Single search interface across: Library catalog Digital repository Website resources Extensive customization Integration with ILS circulation system (via XC NCIP Toolkit) Drupal CMS XC Software Components 119 5. Index XC Drupal Toolkit Usability Silo Quality Format Usability 5. Index
  • Slide 120
  • eXtensible Catalog Software Portfolio Metadata Services Toolkit
  • Slide 121
  • Metadata Services Toolkit (MST) Tasks Get metadata into the MST Add Repositories Schedule Harvests Tell MST what to do with metadata Install metadata services Add processing rules Verify results / troubleshoot processing Browse records View error logs 121 XC Metadata Services Toolkit Metadata Tools
  • Slide 122
  • MST: Add Repository 122 Telling the MST about a repository is easy. Assign a name of your choice and enter the URL. After adding a repository, the MST will automatically do a handshake with it and provide Success or Error messages for each step in the handshake. When successful, the MST reports on what formats and sets are available in the remote database. The MST supports all XML schemas, but individual services are schema-specific. When successful, the MST reports on what formats and sets are available in the remote database. The MST supports all XML schemas, but individual services are schema-specific.
  • Slide 123
  • MST: Schedule Harvest 123 The next step is to schedule the harvesting of metadata from the remote repository. Options - Set the schedule - Choose start and end dates - Select sets and formats Options - Set the schedule - Choose start and end dates - Select sets and formats
  • Slide 124
  • MST: Install A Service 124 The next step is to install a metadata service. A service is a separate program, written in Java, that is managed by the MST. Services can be downloaded from the XC website or you can write your own by following the developers manual. The next step is to install a metadata service. A service is a separate program, written in Java, that is managed by the MST. Services can be downloaded from the XC website or you can write your own by following the developers manual. In order to use a service, you place the downloaded file in a directory by following the MST manual. This screen can then be used to install the service in the MST. In order to use a service, you place the downloaded file in a directory by following the MST manual. This screen can then be used to install the service in the MST.
  • Slide 125
  • MST: Add A Processing Rule 125 This example shows two services already installed in this Metadata Services Toolkit (MST): MARC Normalization and MARC-to-XC-Transformation. Now we need to tell the MST which metadata records we want proccessed through which services, and in what order. This is called service orchestration. We will now add a Processing Rule Now we need to tell the MST which metadata records we want proccessed through which services, and in what order. This is called service orchestration. We will now add a Processing Rule
  • Slide 126
  • MST: Browse Records 126 Browse Records is a feature of the MST that includes faceted browse and full-text search. The MST has a local copy of all harvested metadata and all metadata produced by each installed service. Browse Records is a feature of the MST that includes faceted browse and full-text search. The MST has a local copy of all harvested metadata and all metadata produced by each installed service.
  • Slide 127
  • MST: Browse Records 127 Library staff use Browse Records to verify that services are functioning properly and to debug any issues.
  • Slide 128
  • MST: Browse Records 128 Navigation to full record display (MST handles display of any XML schema) Whenever a record is processed by a service, the original record is preserved and one or more new records may be produced. These records are called successors. Navigation links take you to predecessor and successor records. In this case, links connect MARC records to their normalized successor. In another case, links connect a normalized MARC record to its successor Work, Expression and Manifestation records. Whenever a record is processed by a service, the original record is preserved and one or more new records may be produced. These records are called successors. Navigation links take you to predecessor and successor records. In this case, links connect MARC records to their normalized successor. In another case, links connect a normalized MARC record to its successor Work, Expression and Manifestation records.
  • Slide 129
  • MST: Browse Records 129 Each service can register error messages with the MST upon installation. In this example the MARC Normalization service has attached errors to specific records. Errors are facets in the MST. The i icon links to a customizable webpage with instructions for staff to address the error.
  • Slide 130
  • MST: View Full Record 130 Full Record Display: MARC Holdings Record Administrative metadata managed by the MST Predecessor and successor links XML viewer (supports any XML schema) Full Record Display: MARC Holdings Record Administrative metadata managed by the MST Predecessor and successor links XML viewer (supports any XML schema)
  • Slide 131
  • MST: View Error Logs 131 Log file management system with navigation. This page shows MST system log files. Each installed service as well as harvest-in and harvest-out logs are available.
  • Slide 132
  • eXtensible Catalog Whats Next Vision for Linked Data
  • Slide 133
  • Semantic Web and Linked Data The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web Linked data is a mechanism for exposing, sharing and connecting data on the web 133
  • Slide 134
  • Semantic Web and Linked Data If everything has a unique identifier, then information from one website can be related to information from another via a computer program Everything includes people, places, things, vocabularies, metadata elements, web documents, 134
  • Slide 135
  • Semantic Web and Linked Data A Uniform Resource Identifier (URI) is a string of characters used to identify a name (URN) or an resource on the internet (URL). Two kinds of resources information resources traditional web things like documents, images, etc. non-information resources these are real world objects like people, physical products, places, concepts, proteins, etc 135
  • Slide 136
  • Turning information into Linked Data 136 Predicate (URI from a defined vocabulary) Subject (URI)Object (URI or literal) RDF Triple defined: Information that might be on a webpage, but cannot be readily understood by a computer: David Lindahl is 40 years old. Example #1: Describe something foaf:age David Lindahl40 Step 1: Parse it into a Subject, Predicate, and Object: http://xc.org/resource/dlindahl 40 http://xmlns.com/foaf/spec/#term_age Step 2: Convert to URIs:
  • Slide 137
  • Turning information into Linked Data 137 Predicate (URI from a defined vocabulary) Subject (URI)Object (URI or literal) RDF Triple defined: Information that might be on a webpage, but cannot be readily understood by a computer: David Lindahl knows Jennifer Bowen. Example #2: Define a relationship foaf:knows David LindahlJennifer Bowen Step 1: Parse it into a Subject, Predicate, and Object: http://xc.org/resource/dlindahlhttp://xc.org/resource/jbowen http://xmlns.com/foaf/spec/#term_knows Step 2: Convert to URIs:
  • Slide 138
  • Linking Open Data Cloud 138
  • Slide 139
  • Linked Data on XC XC Metadata Services Toolkit (MST): Converts multiple formats into XC Schema XC Schema is linked data ready XC Schema uses defined vocabularies (rda, dcterms, xc) Persistent OAI-PMH (web services) data repository Plug-in service architecture can be extended support RDF technologies 139
  • Slide 140
  • Download XC software at eXtensibleCatalog.org