Walk Before You Run: Prerequisites to Linked Data
![Page 1: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/1.jpg)
Walk Before You Run
Prerequisites to Linked Data
Kenning Arlitsch
Dean of the Library
@kenning_msu
![Page 2: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/2.jpg)
Linked Data applications will not matter if search engines can’t find library websites and repositories, crawl them, and understand the metadata provided.
First, Take Care of Basics
![Page 3: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/3.jpg)
Agenda
• Traditional SEO (Search Engine Optimization)
  – Hardware, software, websites, metadata
• Semantic Web Optimization
  – Semantic Identity
  – Schema.org Project at MSU
    • Using a vocabulary understood by search engines
    • Improve machine comprehension
![Page 4: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/4.jpg)
Funded Research
• 2011-2014
  – “Getting Found: Search Engine Optimization for Digital Repositories”
• 2014-2017
  – “Measuring Up: Assessing Accuracy of Reported Use and Impact of Digital Repositories”
  – Partners
    • OCLC Research
    • Association of Research Libraries
    • University of New Mexico
![Page 5: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/5.jpg)
SEARCH ENGINE OPTIMIZATION
Part 1 of 3
![Page 6: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/6.jpg)
SEO Building Blocks
• Priority 1 – Increase Reach
  – Get objects indexed by search engines
• Priority 2 – Increase Visibility in SERP
  – Provide robust descriptive content
• Priority 3 – Get Relevant
  – Increase click-through rates (CTR)
![Page 7: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/7.jpg)
Why it Matters
![Page 8: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/8.jpg)
DeRosa, Cathy, et al. “Perceptions of Libraries, 2010: Context and Community: A Report to the OCLC Membership,” OCLC, 2010.
Where College Students Begin Research - 2010
![Page 9: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/9.jpg)
Need more reasons?
Americans submit 18 billion search queries to search engines each month*
• 12 billion to Google sites (67%)
• 3.5 billion to Microsoft sites (19%)
• 1.8 billion to Yahoo! sites (10%)
How much of that traffic is directed to our libraries?
* http://www.comscore.com/Insights/Market-Rankings/comScore-Releases-November-2014-U.S.-Desktop-Search-Engine-Rankings
![Page 10: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/10.jpg)
Our Research Inspiration
• Decade building digital libraries - Univ of Utah
  – Mountain West Digital Library
  – Utah Digital Newspapers
  – Western Waters Digital Library
  – Western Soundscape Archive
• Were they being used…?
![Page 11: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/11.jpg)
Uh, not really…
• 2010 situation at Utah
  – 12% of digital collections indexed by Google
  – 0.5% of Utah’s IR scholarly papers accessible via Google Scholar
![Page 12: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/12.jpg)
Basic SEO began producing significant increases in the average number of page views per day…
Avg. Page Views / Day content.lib.utah.edu
![Page 13: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/13.jpg)
Basic SEO improved Utah’s collection accessibility in Google…
Google Index Ratio - All Collections* (average)
• 07/05/10: 12%
• 04/04/11: 51%
• 11/30/11: 79%
• 12/05/13: 92%
* Google Index Ratio = URLs indexed by Google / URLs submitted
** ~150 collections containing ~170,00 URLs (07/2010) and ~170 collections containing ~282,000 URLs (12/2013)
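The index ratio can be sketched as a one-line calculation. This is a minimal illustration, not the project's actual tooling: the function name is hypothetical, and the ratio is read here as indexed URLs divided by submitted URLs, which is the reading consistent with the 97.82% (10,306/10,536) weighted average reported for the USpace IR on a later slide.

```python
def google_index_ratio(urls_indexed: int, urls_submitted: int) -> float:
    """Return the share of submitted URLs that the search engine has indexed."""
    if urls_submitted <= 0:
        raise ValueError("urls_submitted must be positive")
    return urls_indexed / urls_submitted


# Figures taken from the USpace IR slide in this deck (October 16, 2011).
ratio = google_index_ratio(urls_indexed=10_306, urls_submitted=10_536)
print(f"{ratio:.2%}")
```

In practice the two inputs would come from a submitted sitemap and the search engine's own reporting of how many of those URLs it has indexed.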
![Page 14: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/14.jpg)
…resulting in more referrals and visitors
12 week comparison 2010 vs. 2012
![Page 15: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/15.jpg)
Technical Barriers to SE Crawlers
• Website Design
  – Graphics
  – Confusing site hierarchies and paths
• Slow servers
• CMS often lack canonical links
• Metadata
  – Schema not understood by SE
  – Not unique
  – Inconsistent/inaccurate
![Page 16: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/16.jpg)
Nearly 100% USpace IR content indexed in Google
Google Index Ratio by collection (chart; values in the order extracted for Board of Regents, UScholar Works, ETD 2, ETD 1)
• 07/05/10: 4%, 23%, 0%, 12%
• 11/19/10: 47%, 51%, 68%, 69%
• 10/16/11: 97%, 98%, 98%, 97%
Google Scholar Index Ratio: ~0%
* October 16, 2011 Weighted Average Google Index Ratio = 97.82% (10,306/10,536).
![Page 17: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/17.jpg)
The challenge is presenting structured data that search engines can identify, parse and digest
Wolfinger, N. H., & McKeever, M. (2006, July). Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982. In 101st American Sociological Association (ASA) Annual Meeting; 2006 Aug 11-14; Montreal, Canada (No. 2006-07-04, pp. 1-42). Institute of Public & International Affairs (IPIA), University of Utah.
Human Readable
Google Scholar Understandable
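To make the contrast concrete, here is a rough sketch of how the human-readable citation above might be turned into the bibliographic `<meta>` tags described in Google Scholar's inclusion guidelines (`citation_title`, `citation_author`, `citation_publication_date`, `citation_conference_title`). The helper name and the choice to emit only the year are assumptions made for illustration; the slide's own markup was an image and is not reproduced here.

```python
from html import escape


def scholar_meta_tags(title, authors, publication_date, conference_title=None):
    """Build Google Scholar style <meta> tags for one paper (hypothetical helper)."""
    pairs = [("citation_title", title)]
    # One citation_author tag per author, in the order they appear on the paper.
    pairs += [("citation_author", author) for author in authors]
    pairs.append(("citation_publication_date", publication_date))
    if conference_title:
        pairs.append(("citation_conference_title", conference_title))
    return [
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    ]


tags = scholar_meta_tags(
    title="Thanks for nothing: changes in income and labor force participation "
          "for never-married mothers since 1982",
    authors=["Wolfinger, N. H.", "McKeever, M."],
    publication_date="2006",
    conference_title="101st American Sociological Association (ASA) Annual Meeting",
)
print("\n".join(tags))
```

The point of the sketch is that each field is labeled separately, so a crawler does not have to guess where the title ends and the author list begins.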
![Page 18: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/18.jpg)
Google Scholar can read and understand!
Google Scholar
![Page 19: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/19.jpg)
SEO Organizational/Cultural Themes
• Traditional SEO is an afterthought
• Librarians think too small re potential traffic
• Organizational communication is poor
• Analytics are usually poorly implemented
• Vendors are slow to catch on to SEO problems
  – Because we don’t demand it
![Page 20: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/20.jpg)
Recommended SEO Process
1. Institutionalize SEO
  ● Strategic Planning
  ● Accurate Measurement Tools
2. Traditional SEO
  ● Get Indexed = Index Ratio
  ● Get Visible = Search Engine Results Page (SERP)
![Page 21: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/21.jpg)
Advanced SEO Programs
3. Semantic SEO
  ● Get Relevant = Click Through Ratios (CTR)
  ● Semantic Identity
  ● Schema.org for Libraries
  ● Linked Open Data (LOD)
4. Social Media Optimization
  ● Faculty Outreach
![Page 22: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/22.jpg)
SEMANTIC IDENTITY
For Accurate Representation on the Web
12/09/2014
![Page 23: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/23.jpg)
Current Situation
Academic organizations are poorly represented on the Semantic Web…
…because search engines don’t understand them…
…because we don’t maintain the data sources search engines trust.
![Page 24: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/24.jpg)
Affects reputation of the entire academic institution
Colleges
Departments Centers Institutes
![Page 25: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/25.jpg)
Institutional reputation
Researcher collaboration/employment
Research funding
University rankings
Student enrollment
Manage Risk
![Page 26: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/26.jpg)
Google’s Knowledge Graph
The Web is moving from “strings” to “things”
“A knowledge base … to enhance search results with semantic-search information gathered from a wide variety of sources”
Source: Wikipedia
![Page 27: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/27.jpg)
Knowledge Graph Products
• Answer Box
  – Facts about concepts
• Carousel
  – Group of instances that comprise a concept
• Knowledge Card
  – Displays information about organizations and people
![Page 28: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/28.jpg)
![Page 29: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/29.jpg)
![Page 30: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/30.jpg)
Lack of a Knowledge Card in search results is indicative of a larger problem…
…it means Google doesn’t understand that the organization exists or what its business is…
…and as a result Google is unlikely to connect users with the organization’s website
![Page 31: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/31.jpg)
Survey of ARL Libraries
• n=125
• Searched by name listed in ARL directory
• Knowledge Card? Yes/No
• Robustness scale of 1-5
![Page 32: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/32.jpg)
![Page 33: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/33.jpg)
![Page 34: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/34.jpg)
![Page 35: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/35.jpg)
Survey of ARL Libraries
• No Knowledge Card at all: 43
• Have Knowledge Card: 82
  – 10 incorrect
  – 29 (robustness of 1)
  – Total = 43 (82 - 10 - 29)
![Page 36: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/36.jpg)
Google’s Perception of MSU Lib - 2012
![Page 37: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/37.jpg)
MSU Library - 2014
![Page 38: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/38.jpg)
Where does Google get its information?
![Page 39: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/39.jpg)
Trusted Sources for Search Engines
• No Wikipedia presence?
  – Organization doesn’t exist as an “entity” or “thing”
  – It exists as a string of (confusing) text
• Other influences on Google’s Knowledge Graph
  – Freebase (phasing out in favor of Wikidata)
  – Google Places/Google My Business
  – Google+
![Page 40: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/40.jpg)
Wikipedia - 2012
![Page 41: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/41.jpg)
![Page 42: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/42.jpg)
DBpedia entry - 2012
![Page 43: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/43.jpg)
2014 DBpedia entry
![Page 44: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/44.jpg)
MSU COLLEGES
![Page 45: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/45.jpg)
![Page 46: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/46.jpg)
![Page 47: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/47.jpg)
![Page 48: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/48.jpg)
![Page 49: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/49.jpg)
MSU CENTERS AND INSTITUTES
![Page 50: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/50.jpg)
![Page 51: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/51.jpg)
![Page 52: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/52.jpg)
![Page 53: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/53.jpg)
![Page 54: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/54.jpg)
Summary
• Define library organization in Wikipedia
  – Beware of *pedia culture and process
• Engage with other trusted data sources
  – Wikidata
  – Google Places/Google My Business
  – Google+
• Mark-up metadata with Schema.org
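As a rough sketch of what the last step could look like, the snippet below builds a Schema.org description of a library as JSON-LD and wraps it in a script tag for a home page. `Library`, `CollegeOrUniversity`, `parentOrganization` and `sameAs` are real Schema.org terms; the organization names and every URL are placeholders invented for this example, and the deck does not say which syntax (JSON-LD, microdata or RDFa) the MSU team used.

```python
import json


def library_json_ld(name, url, parent_name, same_as):
    """Describe a library as a Schema.org entity (illustrative helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Library",
        "name": name,
        "url": url,
        "parentOrganization": {"@type": "CollegeOrUniversity", "name": parent_name},
        # sameAs points at the trusted sources that describe the same entity.
        "sameAs": list(same_as),
    }


def as_script_tag(data):
    """Wrap a JSON-LD dict in the script element that crawlers look for."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


# Placeholder values; substitute the organization's real name and URLs.
record = library_json_ld(
    name="Example University Library",
    url="https://library.example.edu/",
    parent_name="Example University",
    same_as=[
        "https://en.wikipedia.org/wiki/Example_University_Library",
        "https://www.wikidata.org/wiki/Q0",
    ],
)
print(as_script_tag(record))
```

The `sameAs` links are what tie the on-page markup back to the Wikipedia and Wikidata records listed above, so the three steps in the summary reinforce one another.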
![Page 55: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/55.jpg)
New Knowledge Work for Libraries
• Build set of replicable services
  – Populate and maintain structured data records
  – Add rich semantic markup to websites
• Communicate
  – Understand ourselves from stakeholder perspective
  – Machine-understandable information
![Page 56: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/56.jpg)
SCHEMA.ORG PROJECT
Part 3 of 3
![Page 57: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/57.jpg)
Schema.org
• Common vocabulary for describing things on web
• Supported by Bing, Google, Yahoo and Yandex
• “On-page markup helps search engines understand the information on webpages and provide richer results.”
  – https://support.google.com/webmasters/answer/1211158?hl=en
![Page 58: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/58.jpg)
Hypothesis
• Implementing Schema.org in library websites
  – Improves machine understanding of content
  – Improves rich snippets shown in SERP
  – Increases click-through rates from SERP
• Result
  – More traffic
  – More users finding what they’re looking for
![Page 59: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/59.jpg)
Project: A Controlled Experiment by Jason Clark (with Michelle Gollehon)
• Two digital collections
• Similar size/content/date range
  – Photos and historical documents
• 1 optimized with Schema.org (Schultz)
• 1 control (Brook)
![Page 60: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/60.jpg)
A Revised Digital Library Architecture
• Collection Page (home page)
  – arc.lib.montana.edu/schultz-0010/
• About Pages (about page, topics page)
  – arc.lib.montana.edu/schultz-0010/about.php
• Item Pages (individual record page)
  – arc.lib.montana.edu/schultz-0010/item/31
• Sitemap and rel=canonical work
  – arc.lib.montana.edu/schultz-0010/
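The sitemap and rel=canonical work mentioned above might look roughly like the following. This is a sketch under stated assumptions, not the project's code: the function names are hypothetical, and the `http://` scheme is assumed because the slide lists the addresses without one. The sitemap namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from html import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap listing each URL once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")


def canonical_link(url):
    """Return the <link> element that names the preferred URL for a page."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


# Addresses from the slide; the scheme is an assumption.
base = "http://arc.lib.montana.edu/schultz-0010/"
pages = [base, base + "about.php", base + "item/31"]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
print(canonical_link(base + "item/31"))
```

The sitemap tells crawlers which pages exist, while the canonical link tells them which address to treat as authoritative when the same record is reachable by more than one path.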
![Page 61: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/61.jpg)
![Page 62: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/62.jpg)
![Page 63: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/63.jpg)
Results
![Page 64: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/64.jpg)
Semantic Web Team
• Kenning Arlitsch, Dean @kenning_msu
• Patrick OBrien, Semantic Web Director @sempob
• Jeff Mixter, Research Associate, OCLC Research
• Jason Clark, Head of Lib Informatics and Computing @jaclark
• Scott Young, Digital Initiatives Librarian @hei_scott
• Doralyn Rossmann, Head of Coll Development @doralyn
• Jean Godby, Senior Research Associate, OCLC Research
![Page 65: Walk Before You Run: Prerequisites to Linked Data](https://reader030.fdocuments.in/reader030/viewer/2022032514/55d6e0e4bb61ebb10e8b4599/html5/thumbnails/65.jpg)
Relevant Publications
• Arlitsch, Kenning, and Patrick S. OBrien. (2013) Improving the visibility and use of digital repositories through SEO. Chicago: ALA TechSource. ISBN-13: 978-1-55570-906-8
• Mixter, Jeff, Patrick OBrien and Kenning Arlitsch. “Describing Theses and Dissertations using Schema.org,” Proceedings of the International Conference on Dublin Core and Metadata Applications 2014, Dublin Core Metadata Initiative: 138-146.
• Arlitsch, Kenning. “Being Irrelevant: How Library Data Interchange Standards have kept us off the Internet,” Journal of Library Administration, 54, no. 7 (2014): 609-619.
• Arlitsch, Kenning, Patrick OBrien, Jason A. Clark, Scott W.H. Young and Doralyn Rossmann. “Demonstrating Library Value at Network Scale: Leveraging the Semantic Web with New Knowledge Work,” Journal of Library Administration, 54, no. 5 (2014): 413-425.
• Arlitsch, Kenning, Patrick OBrien, and Brian Rossmann. "Managing Search Engine Optimization: An Introduction for Library Administrators." Journal of Library Administration 53, no. 2-3 (2013): 177-188.
• Arlitsch, Kenning, and Patrick S. O'Brien. "Invisible institutional repositories: Addressing the low indexing ratios of IRs in Google Scholar." Library Hi Tech 30, no. 1 (2012): 60-81.