Deep-Indexing the OPAC: Integrating Contents Information into Search Results

Mary M. Strouse, CUA DuFour Law Library
7th MAIUG Annual Conference, October 2005


Functions of Contents Information (TOC Data)

• Evaluation: Does this resource suit my purpose?

• Navigation: Which volume(s), pages do I need?

• Identification/Collocation:
  What does the library have about…?
  …Written by…?
  …Containing… [known title]?

Local Priorities for TOC Inclusion

• Edited collections on broad themes

• Diverse geographic treatments

• Conferences, symposia, anthologies

• Local interest (our faculty, etc.)

Unifying themes: multiple authorship and/or non-predictable content

Where, When and How

• Choice of vendors
  • Blackwell
  • Syndetic Solutions
  • Marcive (using Syndetic’s data)
  • Scanning/local input

• Loading mechanisms
  • III loading services (Blackwell)
  • Matching on demand

• Choice of formats

505 Enhanced Contents Note

• Standards-compliant

• Existing tools (macros, spell-check)

• Includes volume and chapter, but not page #

• Keyword access to titles and authors

• Titles indexed (all or nothing)

• Can’t index authors (not in inverted form)

• Can be difficult to read
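As a rough illustration of why the 505 form is compact but hard to read, the sketch below assembles one enhanced contents note. It assumes the standard MARC 21 enhanced-contents subfields (g for miscellaneous information such as a chapter number, t for title, r for statement of responsibility) and borrows the pipe delimiter used elsewhere in these slides. The function name and the input layout are hypothetical, and the sample titles are reused from the 970 examples later in the deck.

```python
def build_enhanced_505(chapters):
    """Assemble one enhanced 505 contents note (indicators 00).

    Each chapter is a dict with a "title" and optional "label" and "author".
    This input layout is an assumption made for the example.
    """
    parts = []
    for ch in chapters:
        piece = ""
        if ch.get("label"):
            piece += f"|g{ch['label']} "      # chapter or volume number
        piece += f"|t{ch['title']}"           # title, keyword/title indexable
        if ch.get("author"):
            piece += f" / |r{ch['author']}"   # author as transcribed, direct order
        parts.append(piece)
    # All chapters run together in a single field, separated by " -- ".
    return "505 00 " + " -- ".join(parts) + "."


note = build_enhanced_505([
    {"label": "1.", "title": "Excerpts from Antigone", "author": "Sophocles"},
    {"title": "Reith Lecture 2000", "author": "The Prince of Wales"},
])
print(note)
```

Everything lands in one run-on field with no slot for page numbers or inverted names, which matches the limitations listed above.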

Vendor TOC Format (97x)

• Displays as a table of contents

• Includes page numbers

• Indexing flexibility: authors and titles as well as keyword

• Can exclude generic titles (“Introduction”, “Preface”, etc.) from indexing

• Space for both transcribed and authorized forms

97x TOC Format Detail

970 field (one per chapter or section title)

Indicator 1: title indexing
  0 Non-distinctive title (don’t index*)
  1 General chapter title or heading
  2 Citable title
  No longer used

Indicator 2: hierarchy, degree of indentation

  |l: Section or chapter label
  |t: Section or chapter title
  |c: Personal author
  |f: Personal author in inverted form*
  |d: Non-personal author
  |e: Editor
  |p: Starting page number

*By default, authors of non-distinctive titles are not indexed. May be indexed on request.
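To make the field layout concrete, here is a minimal sketch that splits a 970 line, as it is written on these slides, into its two indicators and its subfields. The subfield meanings come from the list above. The function name, the dictionary keys, and the assumption that both indicators are single digits are illustrative only, not part of any vendor or III specification.

```python
import re

# Subfield meanings as listed on the slide; the key names are hypothetical.
SUBFIELD_NAMES = {
    "l": "label",
    "t": "title",
    "c": "personal_author",
    "f": "personal_author_inverted",
    "d": "non_personal_author",
    "e": "editor",
    "p": "start_page",
}


def parse_970(line):
    """Split a displayed 970 line into indicators and (name, value) pairs."""
    m = re.match(r"970\s+(\d)(\d)\s+(.*)", line.strip())
    if m is None:
        raise ValueError("not a 970 field: " + line)
    ind1, ind2, body = m.groups()
    subfields = []
    for chunk in body.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # text before the first delimiter, or stray whitespace
        code, value = chunk[0], chunk[1:].strip()
        subfields.append((SUBFIELD_NAMES.get(code, code), value))
    return {"ind1": ind1, "ind2": ind2, "subfields": subfields}


field = parse_970("970 12 |tExcerpts from Antigone |d Sophocles |p11")
print(field)
```

Here `ind1` carries the title-indexing value and `ind2` the indentation level, so a display or indexing routine can act on them separately.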

Disadvantages of TOC Format:

• Table-like format takes up screen space, adds significantly to printing

• Limitations on use of vendor data (TOC blocked in exported results)

• Implicit burden on library staff

• Commingling of library and vendor data


Display bug: Corporate author doesn’t display

Effect on Keyword Search

• Adds significantly to retrieval in keyword searches – authors and titles

• Display element in keyword results is always the book title – also true of sorted/limited results.

• Search terms are highlighted in full record display

Effect on Title Search

• Library determines which titles to exclude (970 first indicator)

• Chapter titles will appear in unsorted results browse

• Chapter titles not identified as such

• English initial articles automatically excluded

• Search terms highlighted in full record display
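The two indexing rules above (exclusion by first indicator, and removal of English initial articles) can be sketched as follows. This is an approximation under stated assumptions: the article list, the lowercasing, and the function name are illustrative and are not taken from the system's actual normalization rules.

```python
# Assumed list of English initial articles; the real system list may differ.
ENGLISH_ARTICLES = ("a ", "an ", "the ")


def title_index_key(ind1, title):
    """Return a browse key for a chapter title, or None if it is not indexed."""
    if ind1 == "0":
        # First indicator 0: non-distinctive title, kept out of the title index.
        return None
    key = title.strip().lower()  # lowercasing is an assumption for the example
    for article in ENGLISH_ARTICLES:
        if key.startswith(article):
            key = key[len(article):]
            break
    return key


print(title_index_key("0", "Introduction"))
print(title_index_key("2", "The Reith Lecture 2000"))
```

Because the resulting key carries no marker saying it came from a 970 rather than a 245, a chapter title files in the browse list exactly like a book title.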

Effect on Author Search

• Individual authors are linked in record (as transcribed) and appear in browse list (indexed form)

• Authority work often needed to match with existing names

• Corporate authors from 970 |d and editors from 970 |e not indexed

• Display of titles in extended browse follows same rules as title search

Keyword-only Indexing Option

• Includes authors and titles
  • Must specify inclusion in author and title segments

• Avoids collocation issue/authority work

• Avoids “noise” retrieval, confusion between chapters and books

• Limits access to documents and reports (distinct works)

• Limits effectiveness of known-author and known-title searches

Formatting Controls

• BIB_TOC_HEADER WWWoption
  • Places a caption at head of TOC display
  • Default: no caption
  • Accepts HTML for formatting or link to a help file

• TABLEPARAM_BIB_TOC

• Stylesheet class: bibTOC

• No link in brief citation

Search result display options

• DISPLAY_245= does not apply to chapter titles

• EXTENDED_T=U will not force a book title to display

• Beware confusion from forcing extended display (INDEX_EXT=ta)

BROWSE WWWoption

• Controls first line of index browse

• BROWSE_T= controls first line of record browse (in absence of briefcit.html)

• If no 970 subfields are specified, all subfields will display

• If you specify default subfields for non-245 titles, you must include subfield t

Example 1: BROWSE_T=245/abnp/c or BROWSE_T=245/abnp/c |970/t/cd |/a/c

(ALL TOC subfields display)

Example 2: BROWSE_T=245/abnp/c |970/t/cd and BROWSE_T=245/abnp/c |970/t/cd |/at/c

Briefcit Format

<span class="briefcitTitle"><!--{linkfieldspec:VbT}--></span>

• All record browse screens show book titles (includes limited and keyword results)

• All index browse screens show chapter titles (includes sorted results)

• Use BROWSE_T= (define 970 |t to avoid “no title” display)

Briefcit Format

<span class="briefcitTitle"><!--{linkfieldspec:Vbt245abnp}--></span>

• All record browse screens show book titles (includes sorted, limited and keyword)

• Only system-sorted index browse shows chapter titles

Loading and Workflow Issues

• False adds: monographic series w/ ISSNs

• False drops: CIP and other title discrepancies

• Coding consistency

• Authority control
  • Volume of work
  • Lack of tools
  • No mechanism to identify/protect library-added data
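One way a library might screen a vendor load for false adds and false drops is to pair records on a standard number and then compare titles before accepting the match. The sketch below is purely hypothetical: it does not describe how the III or Blackwell loaders work, and the record layout, function names, and placeholder ISBN values are invented for the example.

```python
import re


def normalize_title(title):
    """Lowercase, drop punctuation, and collapse spaces before comparing."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(cleaned.split())


def match_toc_to_bibs(toc_records, bib_records):
    """Pair vendor TOC records with bib records on ISBN, then check titles.

    Returns three lists of ISBNs: (matched, suspect, unmatched).
    """
    bibs_by_isbn = {}
    for bib in bib_records:
        for isbn in bib["isbns"]:  # one bib may carry several ISBNs
            bibs_by_isbn[isbn] = bib
    matched, suspect, unmatched = [], [], []
    for toc in toc_records:
        bib = bibs_by_isbn.get(toc["isbn"])
        if bib is None:
            unmatched.append(toc["isbn"])  # no bib shares this number
        elif normalize_title(bib["title"]) == normalize_title(toc["title"]):
            matched.append(toc["isbn"])
        else:
            suspect.append(toc["isbn"])    # titles disagree: review by hand
    return matched, suspect, unmatched


# Placeholder identifiers, not real ISBNs.
toc = [
    {"isbn": "X1", "title": "Deep indexing : a guide"},
    {"isbn": "X2", "title": "Other Title"},
    {"isbn": "X3", "title": "Unheld"},
]
bibs = [
    {"isbns": ["X1"], "title": "Deep Indexing: A Guide"},
    {"isbns": ["X2", "X9"], "title": "Series Title"},
]
result = match_toc_to_bibs(toc, bibs)
print(result)
```

Routing title mismatches to a review list, rather than loading or rejecting them automatically, is a design choice: a mismatch may be a wrong match on a shared number or simply a CIP title that later changed.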

Coding issues: vendor-supplied data

• Titles and names transcribed from TOC, not from fullest form available

• No space for formal titles of included works -- we add 7xx

• Inconsistent coding of index-worthiness

• Is “Appendix” a title or a number?

“Non-personal” Authors: |d

• Used for corporate author:

970 11 |l9 |tRedefining Discrimination: 'Disparate Impact' and the Institutionalization of Affirmative Action |d United States Department of Justice Office of Legal Policy |p121

• Also used for personal authors in direct order (but sometimes not):

970 12 |tExcerpts from Antigone |d Sophocles |p11

970 11 |tReith Lecture 2000 |d The Prince of Wales |p11

“Non-personal” Authors: |d

• Also used for other transcribed phrases and “et al.”:

970 21 |tWorkshop Discussion: Civil Litigation Against Terrorism |d Workshop Participants |p185

970 21 |tPublic Support for Access to Government Records: A National Survey |cPaul D. Driscoll |fDriscoll, Paul D. |cSigman L. Splichal |fSplichal, Sigman L. |cMichael B. Salwen |fSalwen, Michael B. |d [et al.] |p23

• Library can add index link in |f (not vendor-provided)
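When the vendor supplies only a direct-order name, a library adding its own |f needs an inverted form. A minimal sketch of a first-pass helper follows. The last-token rule is a deliberately naive assumption, and the function name is hypothetical; its output would still need checking against the authority file.

```python
def invert_personal_name(direct):
    """Turn 'Paul D. Driscoll' into 'Driscoll, Paul D.' (naive last-token rule)."""
    tokens = direct.split()
    if len(tokens) < 2:
        return direct  # single names such as 'Sophocles' are left unchanged
    return f"{tokens[-1]}, {' '.join(tokens[:-1])}"


print(invert_personal_name("Paul D. Driscoll"))
print(invert_personal_name("Sophocles"))
```

The rule fails on titles of nobility, compound surnames, and phrases such as “Workshop Participants”, which is one reason the authority work noted in these slides cannot be fully automated.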

Recap: User Issues

• Cost in screen space, added printing

• Multiple forms of author entry (split files)

• Can’t distinguish between chapter and book-length treatment (increased noise)

• License limitations on data use

Wish List

• Fix corporate author display bug

• Identify chapter titles in search results

• Option to force display of both chapter title and book title in extended browse

• Link to full TOC display from brief citation format (briefcit.html)

• Allow limited data export for legitimate scholarly use

Recap: Workflow Issues

• Vendor-dependent format

• Staff burden – need coding regularization

• Co-mingling of vendor and library data

• False positives (multiple ISBN)

Wish List

• Additional Subfields/Codes:
  • Indexed/authorized form for corporate author
  • Data source and ownership
  • Authority history

• Subfield code(s) to identify library-added TOC data, overlay-protect library-added authority work

• Better coding conventions, transparency

References

• CSDirect TOC Data FAQ (password required): http://csdirect.iii.com/faq/tocfaq.shtml

• Blackwell TOC Enrichment brochure: http://www.blackwell.com/pdf/TOCEnrichment.pdf

• Vendors:
  http://www.blackwell.com/level2/TOC.asp
  http://www.syndetics.com/index.htm
  http://www.marcive.com/HOMEPAGE/MARCres.htm (Marcive uses Syndetics data)

Contact

Mary M. Strouse
Head of Technical Services
Judge Kathryn J. DuFour Law Library
Catholic University of America
strouse at law.cua.edu