NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO...

22
NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services [email protected]

Transcript of NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO...

Page 1: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS

Oliver Pesch

EBSCO Information Services

[email protected]

Page 2: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Overview• Premise for IOTA completeness score and element weights• Proving the theory through real-life tests• Using statistical approach to determine weights• Test results• Conclusions• Next steps for IOTA

Page 3: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

The premise behind IOTA• Completeness Score is the measure of the

“completeness” of a single OpenURL• Completeness Index is attributed to the content provider

as an overall measure of the completeness of their OpenURLs

Page 4: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

The premise behind IOTA• The Completeness Score is calculated by “weighing” the

elements provided in the OpenURL based on their importance in target links

• Some elements are more important than others and will have a higher weight

• Completeness Score equals the sum of weights of elements found divided by the maximum score possible

Page 5: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

The premise behind IOTA• Simple example assuming equal element weights

Element Description Weight This OpenURL

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 1

ISSN ISSN 1

Issue Issue number 1

SPage Start page 1

Title Journal Title 1

Volume Volume number 1

TOTAL 8

Page 6: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

The premise behind IOTA• Simple example assuming equal element weights

Element Description Weight This OpenURL

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 1

ISSN ISSN 1

Issue Issue number 1

SPage Start page 1

Title Journal Title 1

Volume Volume number 1

TOTAL 8

SAMPLE OPEN URL DATA ?date=2/4/2008 &issn=1083-3013 &volume=13 &issue=20 &atitle=the+casualties+of+war

1

1

1

1

1

5

Completeness Score...

(Total for This OpenURL)Total Weights

5 / 8

= .625

Page 7: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Determining the weights• Initial approach

• Frequency of element occurrence in target link templates• Combined with reasoning

Page 8: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Initial WeightsOpenURL data element Description Weight

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 5

eISSN Online ISSN 3

ISSN Print ISSN 3

Issue Issue number 3

Jtitle Journal Title 1

Pmid PubMed ID 8

SPage Start page 3

Title Journal Title 1

Volume Volume number 3

DOI Digital Object Identifier 8

Page 9: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Initial WeightsOpenURL data element Description Weight

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 5

eISSN Online ISSN 3

ISSN Print ISSN 3

Issue Issue number 3

Jtitle Journal Title 1

Pmid PubMed ID 8

SPage Start page 3

Title Journal Title 1

Volume Volume number 3

DOI Digital Object Identifier 8

Initial weights were somewhat subjective.

Page 10: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Initial WeightsOpenURL data element Description Weight

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 5

eISSN Online ISSN 3

ISSN Print ISSN 3

Issue Issue number 3

Jtitle Journal Title 1

Pmid PubMed ID 8

SPage Start page 3

Title Journal Title 1

Volume Volume number 3

DOI Digital Object Identifier 8

Most link resolver knowledge bases can

handle look-ups by either Print ISSN or Online ISSN

(both are not needed)

Page 11: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Initial WeightsOpenURL data element Description Weight

ATitle Article title 1

AuLast Author’s last name 1

Date Date of publication 5

eISSN Online ISSN 3

ISSN Print ISSN 3

Issue Issue number 3

Jtitle Journal Title 1

Pmid PubMed ID 8

SPage Start page 3

Title Journal Title 1

Volume Volume number 3

DOI Digital Object Identifier 8

Most link resolvers will enhance identifiers like PubMed ID and DOI; therefore, having an

identifier is like having all metadata elements.

Page 12: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Validating the Completeness Score• Use real OpenURLs and a commercial link resolver.

(tested with LinkSource and 360-Link)• Remove institutional holdings as a limit to resolution

• Process each OpenURL through the link resolver to determine “Success”• Score one point for finding at least one full text target

• Calculate the completeness score for each OpenURL• Look for a statistical correlation between the

completeness score and the success score

Page 13: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Results: Original Weights

Correlation Coefficient .43Tests conducted on sample of 15,000 OpenURLs randomly pulled from IOTA database

0.0000

0.2000

0.4000

0.6000

0.8000

1.0000

1.2000

Average of Completeness Score

Average of Success Score

Page 14: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

A Statistical Approach to Determining Element Weights• Select a set of “perfect” OpenURLs

• include all key data elements and resolve to full text

• Perform step-wise regression• Test failure rates for each element by removing that element

• Use failure rates as basis for weights• Use new weights to test for correlation between weights

and success for larger sample

Page 15: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Failure Rates from 1500 OpenURL test sample

Element removed from the OpenURL

Description Failure Percentage

ATitle Article title .74%

AuLast Author’s last name .07%

Date Date of publication .4%

ISSN ISSN (either online or print ISSN)

22.02%

Issue Issue number 20.27%

SPage Start page 33.27%

Title Journal Title (either Title or Jtitle)

.61%

Volume Volume number 74.14%

Volume is most critical

Author’s last name is least important

Date is surprisingly low

Page 16: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Calculated Element WeightsElement Description Weight*

ATitle Article title 1.87

AuLast Author’s last name 0.83

Date Date of publication 1.61

ISSN ISSN (either online or print ISSN)

3.34

Issue Issue number 3.31

SPage Start page 3.52

Title Journal Title (either Title or Jtitle)

1.78

Volume Volume number 3.87

*Element weight calculation: log10 (failure-rate-per-10,000 OpenURLs)

Page 17: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Results: New Weights

Correlation Coefficient .80Tests conducted on sample of 15,000 OpenURLs randomly pulled from IOTA database

0.0000

0.2000

0.4000

0.6000

0.8000

1.0000

1.2000

Average of Complete-ness Score

Average of Success Score

Page 18: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Notes

Testing the same OpenURLs on 360-Link results in different numbers but consistent trends. Differences may be attributed to:

• Variations in metadata enhancement techniques• Strictness in target link rules (e.g. required elements before link

shows – tied to level of forgiveness of target)• Link syntax used for target

Page 19: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Notes

96.3 of OpenURLs in the test were able to populate a full text target of credible ILL form…

• Perception of high failure rate of OpenURL may be attributed to library holdings and user expectations

• Suggestion: set link text to control expectations• Link to full text (for items in the online collection)• Check library collection (for things in print collection)• Request from library (for everything else)

Page 20: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Conclusions• Step-wise regression approach to element weights works• Completeness Index scores can be correlated to actual

OpenURL “success”• KB and resolver technology influence results and prevent

a universal set of element weights

The Completeness Index is a mechanism individual link resolver vendors can use to provide

metrics to help improve their service quality

Page 21: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

Other takeawaysSeveral factors involved in perceived “link failure”:1. Bad or missing metadata in the OpenURL link2. Inaccurate holdings data within the resolver’s knowledge base3. Flexibility of syntax to the target

- e.g., target supports at least two: OpenURL syntax, DOI link, proprietary link structure

4. Flexibility of resolution logic at the target- i.e., target finds way to create link using available data when some data missing or wrong

5. User expectations- e. g., link resolver provided link to OPAC or ILL form, but user was expecting full text

- IOTA focused on (1)- KBART working on (2)- Education of content providers could address (4)- Displaying OpenURL button only if full text available could address (5)

Page 22: NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com.

What’s next for IOTA• Continue offering public access to reports on element

frequency• Publish technical report on work to date• Publish recommended practice for calculation and use of

completeness scores for link quality assessment by link resolver vendors

• Continue work as a NISO standing committee for at least one more year

THANK YOU!