Forward in Reverse
Uploaded by eric-larson
![Page 1: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/1.jpg)
Forward in Reverse
A Gentle Overview Of Forward System Architecture
Eric, Mike & Steve – WiLSWorld 2010
![Page 2: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/2.jpg)
Outline
• Intro to Forward with Demo
• Batch Processing (Backend)
• Web Application (Frontend)
• Challenges
• Q&As throughout
![Page 3: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/3.jpg)
Intro & Demo
![Page 4: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/4.jpg)
http://forward.library.wisconsin.edu
![Page 5: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/5.jpg)
Batch Processing
![Page 6: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/6.jpg)
We have gobs & gobs of data.
![Page 7: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/7.jpg)
1) Extract it
![Page 8: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/8.jpg)
1a) ILS Data
![Page 9: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/9.jpg)
![Page 10: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/10.jpg)
![Page 11: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/11.jpg)
![Page 12: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/12.jpg)
![Page 13: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/13.jpg)
Sort, Deduplicate, Merge
![Page 14: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/14.jpg)
Antique Style Key, by Stars*Go*Blue
http://www.flickr.com/photos/artbydebora/1406682449/
![Page 15: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/15.jpg)
Common Identifier = OCLC Number
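As a rough illustration of using the OCLC number as a match key, the sketch below normalizes a MARC 035 $a value such as the `(OCoLC)ocm52165958` shown later in this deck. The function name `oclc_key` and the list of prefixes it strips are assumptions for illustration; the production extract is described only as local C code.

```python
import re

# Assumed prefix variants on OCLC control numbers in 035 $a values.
_OCLC_PATTERN = re.compile(r"^\(OCoLC\)\s*(?:ocm|ocn|on)?0*(\d+)$", re.IGNORECASE)


def oclc_key(field_035a):
    """Return the bare OCLC number as a string, or None for non-OCLC values."""
    match = _OCLC_PATTERN.match(field_035a.strip())
    return match.group(1) if match else None


print(oclc_key("(OCoLC)ocm52165958"))
print(oclc_key("(DLC)2003045349"))
```

Stripping the prefix and leading zeros means that the same title cataloged on different campuses should collapse to one key, even if the local systems store the number slightly differently.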
![Page 16: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/16.jpg)
Catalog Extract Processing Details
• 14 Voyager Instances
• 13M MARC bibliographic records extracted
• Approximately 14 hours
• Local C code
Sorted, deduplicated and merged output:
• 8M records
• 10GB Raw MARC data
![Page 17: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/17.jpg)
Why Merge?
• URLs
• Formats
• Holdings
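A minimal sketch of the sort, deduplicate and merge step, assuming each extracted record has already been reduced to a dictionary. The keys `oclc`, `title`, `urls`, `formats` and `holdings` are hypothetical, and this is not the local C implementation.

```python
from collections import defaultdict


def merge_by_oclc(records):
    """Group records by OCLC number and union their URLs, formats and holdings.

    Records without an OCLC number cannot be matched and pass through unmerged.
    """
    groups = defaultdict(list)
    unmatched = []
    for record in records:
        if record.get("oclc"):
            groups[record["oclc"]].append(record)
        else:
            unmatched.append(record)

    merged = []
    for oclc in sorted(groups):  # sort by the shared identifier
        group = groups[oclc]
        combined = {"oclc": oclc, "title": group[0]["title"]}
        for key in ("urls", "formats", "holdings"):
            # Union the values from every campus copy, dropping exact duplicates.
            combined[key] = sorted({v for rec in group for v in rec.get(key, [])})
        merged.append(combined)
    return merged + unmatched


campus_records = [
    {"oclc": "52165958", "title": "Wittgenstein, aesthetics, and philosophy",
     "formats": ["Book"], "holdings": ["Madison"]},
    {"oclc": "52165958", "title": "Wittgenstein, aesthetics, and philosophy",
     "formats": ["Book"], "holdings": ["Milwaukee"]},
]
print(merge_by_oclc(campus_records))
```

The merged record keeps one bibliographic description while carrying every campus's holdings, which is the point of the three bullets above.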
![Page 18: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/18.jpg)
1b) Digital Collection Data
![Page 19: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/19.jpg)
![Page 20: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/20.jpg)
Fedora Extract Processing Details
• 1 Fedora Repository
• 13K “First Class” XML Objects extracted
• Approximately 4 hours
• Repository query language
XML output:
• METS XML package
• Structural XML
• MODS Bibliographic XML
• 41MB XML data
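To show what reading MODS out of a METS package can look like, here is a small sketch using Python's standard library. The sample document, the `OBJID` value and the function name `extract_mods` are invented for illustration; only the METS and MODS namespace URIs are standard. The deck itself names Java's SAXParser for this job.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Hypothetical, heavily trimmed METS package wrapping one MODS record.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:mods="http://www.loc.gov/mods/v3" OBJID="example:1">
  <mets:dmdSec ID="dmd1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:titleInfo><mods:title>Example photograph</mods:title></mods:titleInfo>
          <mods:typeOfResource>still image</mods:typeOfResource>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def extract_mods(mets_xml):
    """Pull a few bibliographic values out of the MODS section of a METS package."""
    root = ET.fromstring(mets_xml)
    mods = root.find(".//mods:mods", NS)
    return {
        "id": root.get("OBJID"),
        "title": mods.findtext("mods:titleInfo/mods:title", namespaces=NS),
        "type": mods.findtext("mods:typeOfResource", namespaces=NS),
    }


print(extract_mods(SAMPLE))
```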
![Page 21: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/21.jpg)
2) Index it
![Page 22: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/22.jpg)
We take raw library data and process it with MARC/XML parsing tools and local parsing rules in order to build a Solr search index.
![Page 23: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/23.jpg)
1. Raw data (MARC & METS XML)
2. Parsing libraries (Java code: marc4j, SAXParser)
3. Local code that defines parsing rules
4. Solr index
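Step 3 of the list above can be pictured as a table of rules applied to each parsed record. The sketch below assumes a record already parsed into tags and subfields; the index field names (`title_display`, `publisher_display`, `isbn_t`) and the rule table are hypothetical, not the project's actual rules.

```python
# Hypothetical parsing rules: index field name -> (MARC tag, subfield codes).
RULES = {
    "title_display": ("245", "a"),
    "publisher_display": ("260", "b"),
    "isbn_t": ("020", "a"),
}


def apply_rules(record, rules=RULES):
    """Turn a parsed record into a flat dict ready for indexing.

    `record` maps a MARC tag to a list of fields; each field is a list of
    (subfield code, value) pairs.
    """
    doc = {}
    for index_field, (tag, codes) in rules.items():
        values = [
            value.strip(" /:;,")  # trim ISBD punctuation left over from cataloging
            for field in record.get(tag, [])
            for code, value in field
            if code in codes
        ]
        if values:
            doc[index_field] = values
    return doc


record = {
    "245": [[("a", "Wittgenstein, aesthetics, and philosophy /"),
             ("c", "edited by Peter B. Lewis.")]],
    "260": [[("a", "Aldershot, Hants, England ;"), ("a", "Burlington, VT :"),
             ("b", "Ashgate,"), ("c", "c2004.")]],
    "020": [[("a", "0754605175 (alk. paper)")]],
}
print(apply_rules(record))
```

Keeping the rules in data rather than in code is one way to let the mapping change without touching the parser; whether Forward does this is not stated in the slides.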
![Page 24: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/24.jpg)
1. Raw data
![Page 25: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/25.jpg)
LEADER 02000cam a22003734a 45 0
001 6939454
005 20051208125417.0
008 051104s2004 enka $b 001 0 eng
010 $a 2003045349
035 $a (OCoLC)ocm52165958
040 $a DLC $c DLC $d XMA $d BAKER $d UKM
015 $a GBA430162 $2 bnb
016 7 $a 012906573 $2 Uk
020 $a 0754605175 (alk. paper)
024 $a 99811375970
042 $a pcc
049 $a GZMA
050 00 $a B3376.W564 $b W55355 2004
082 00 $a 111/.85/092 $2 21
245 00 $a Wittgenstein, aesthetics, and philosophy / $c edited by Peter B. Lewis.
260 $a Aldershot, Hants, England ; $a Burlington, VT : $b Ashgate, $c c2004.
300 $a xii, 255 p. : $b ill. ; $c 24 cm.
440 0 $a Ashgate Wittgensteinian studies
505 0 $a Wittgenstein and the aesthetic domain / Kjell S. Johannessen -- 2. Wittgenstein, anti-essentialism and the definition of art / Terry Diffey -- 3. Rules, creativity and pictures : Wittgenstein's Lectures on aesthetics / David Novitz -- 4. Criticism without theory / Mark W. Rove -- 5. On aesthetic reactions and changing one's mind / Lars Hertzberg -- 6. Wittgenstein and the arts : understanding and performing / Graham McFee -- 7. Wittgenstein's music / R.A. Sharpe -- 8. Wittgenstein on music and language / Oswald Hanfling -- 9. Ethics and aesthetics are one / Carolyn Wilde -- 10. Fiction and reality in the arts / Ilham Dilman -- 11. Literature, human understanding and morality / Ben Tilghman -- 12. 'The self, thinking' : Wittgenstein, Augustine and the autobiographical situation / Garry L. Hagberg
504 $a Includes bibliographical references (p. 235-247) and index.
![Page 26: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/26.jpg)
02000cam a22003734a 45 001000800000005001700008008004100025010001700066035002300083040003000106015001900136016001800155020002800173024001600201042000800217049000900225050002800234082002000262245007400282260006800356300003400424440003600458505081100494504006401305600005001369700002501419938007101444945001901515946003001534946001301564947002101577948001601598994001201614693945420051208125417.0051104s2004 enka b 001 0 eng a 2003045349 a(OCoLC)ocm52165958 aDLCcDLCdXMAdBAKERdUKM aGBA4301622bnb7 a0129065732Uk a0754605175 (alk. paper) a99811375970 apcc aGZMA00aB3376.W564bW55355 200400a111/.85/09222100aWittgenstein, aesthetics, and philosophy /cedited by Peter B. Lewis. aAldershot, Hants, England ;aBurlington, VT :bAshgate,cc2004. axii, 255 p. :bill. ;c24 cm. 0aAshgate Wittgensteinian studies0 aWittgenstein and the aesthetic domain / Kjell S. Johannessen -- 2. Wittgenstein, anti-essentialism and the definition of art / Terry Diffey -- 3. Rules, creativity and pictures : Wittgenstein's Lectures on aesthetics / David Novitz -- 4. Criticism without theory / Mark W. Rove -- 5. On aesthetic reactions and changing one's mind / Lars Hertzberg -- 6. Wittgenstein and the arts : understanding and performing / Graham McFee -- 7. Wittgenstein's music / R.A. Sharpe -- 8. Wittgenstein on music and language / Oswald Hanfling -- 9. Ethics and aesthetics are one / Carolyn Wilde -- 10. Fiction and reality in the arts / Ilham Dilman -- 11. Literature, human understanding and morality / Ben Tilghman -- 12. 'The self, thinking' : Wittgenstein, Augustine and the autobiographical situation / Garry L. Hagberg aIncludes bibliographical references (p. 235-247) and index.10aWittgenstein, Ludwig,d1889-1951xAesthetics.1 aLewis, Peter,d1947- aBaker & TaylorbBKTYc99.95d99.95i0754605175n0004227086sactive c1d89087961587 a714694b2005-11-23c81.86 c99.95d1 aHEUR 4801bm,stk aSCNd348032 a92bGZM
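The long run of digits after the leader in the record above is the MARC directory: twelve characters per field, giving tag, length and start offset. For example, `245007400282` reads as tag 245, length 74, offset 282. The sketch below walks that structure for a tiny record built in the same layout. It illustrates the ISO 2709 format only and is not a substitute for a library such as marc4j; `build_marc` is a hypothetical helper included so the example is self-contained.

```python
FIELD_END = b"\x1e"   # field terminator
SUBFIELD = b"\x1f"    # subfield delimiter
RECORD_END = b"\x1d"  # record terminator


def parse_marc(raw):
    """Split one binary MARC record into its leader and (tag, data) pairs."""
    leader = raw[:24]
    base = int(leader[12:17])              # base address of the field data
    directory = raw[24:base - 1]           # exclude the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii")
        length = int(entry[3:7])           # includes the field terminator
        start = int(entry[7:12])           # relative to the base address
        data = raw[base + start:base + start + length - 1]
        fields.append((tag, data))
    return leader.decode("ascii"), fields


def build_marc(fields):
    """Assemble a minimal record from (tag, data) pairs, for demonstration only."""
    directory, body = b"", b""
    for tag, data in fields:
        chunk = data + FIELD_END
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(chunk), len(body))
        body += chunk
    base = 24 + len(directory) + 1
    total = base + len(body) + 1
    leader = b"%05dcam a22%05d4a 4500" % (total, base)
    return leader + directory + FIELD_END + body + RECORD_END


raw = build_marc([("001", b"6939454"), ("245", b"00" + SUBFIELD + b"aTitle")])
leader, fields = parse_marc(raw)
for tag, data in fields:
    print(tag, data.split(SUBFIELD))
```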
![Page 27: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/27.jpg)
![Page 28: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/28.jpg)
2. MARC/XML parsing libraries
![Page 29: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/29.jpg)
![Page 30: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/30.jpg)
![Page 31: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/31.jpg)
02000cam a22003734a 45 001000800000005001700008008004100025010001700066035002300083040003000106015001900136016001800155020002800173024001600201042000800217049000900225050002800234082002000262245007400282260006800356300003400424440003600458505081100494504006401305600005001369700002501419938007101444945001901515946003001534946001301564947002101577948001601598994001201614693945420051208125417.0051104s2004 enka b 001 0 eng a 2003045349 a(OCoLC)ocm52165958 aDLCcDLCdXMAdBAKERdUKM aGBA4301622bnb7 a0129065732Uk a0754605175 (alk. paper) a99811375970 apcc aGZMA00aB3376.W564bW55355 200400a111/.85/09222100aWittgenstein, aesthetics, and philosophy /cedited by Peter B. Lewis. aAldershot, Hants, England ;aBurlington, VT :bAshgate,cc2004. axii, 255 p. :bill. ;c24 cm. 0aAshgate Wittgensteinian studies0 aWittgenstein and the aesthetic domain / Kjell S. Johannessen -- 2. Wittgenstein, anti-essentialism and the definition of art / Terry Diffey -- 3. Rules, creativity and pictures : Wittgenstein's Lectures on aesthetics / David Novitz -- 4. Criticism without theory / Mark W. Rove -- 5. On aesthetic reactions and changing one's mind / Lars Hertzberg -- 6. Wittgenstein and the arts : understanding and performing / Graham McFee -- 7. Wittgenstein's music / R.A. Sharpe -- 8. Wittgenstein on music and language / Oswald Hanfling -- 9. Ethics and aesthetics are one / Carolyn Wilde -- 10. Fiction and reality in the arts / Ilham Dilman -- 11. Literature, human understanding and morality / Ben Tilghman -- 12. 'The self, thinking' : Wittgenstein, Augustine and the autobiographical situation / Garry L. Hagberg aIncludes bibliographical references (p. 235-247) and index.10aWittgenstein, Ludwig,d1889-1951xAesthetics.1 aLewis, Peter,d1947- aBaker & TaylorbBKTYc99.95d99.95i0754605175n0004227086sactive c1d89087961587 a714694b2005-11-23c81.86 c99.95d1 aHEUR 4801bm,stk aSCNd348032 a92bGZM
![Page 32: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/32.jpg)
![Page 33: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/33.jpg)
![Page 34: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/34.jpg)
![Page 35: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/35.jpg)
3. Local code
![Page 36: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/36.jpg)
![Page 37: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/37.jpg)
![Page 38: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/38.jpg)
4. Solr index
http://lucene.apache.org/solr/
![Page 39: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/39.jpg)
What is Solr?
An XML API over a Lucene search index.
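As a concrete picture of the XML side of that API, the sketch below serializes one document into Solr's `<add><doc><field>` update message. The field names `id` and `title_display` are assumptions; a real index defines its own fields in its schema, and the message would then be posted to the Solr update handler.

```python
import xml.etree.ElementTree as ET


def solr_add_xml(doc):
    """Serialize one document (field name -> value or list of values) as a Solr <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, values in doc.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Multi-valued fields simply repeat the <field> element.
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


message = solr_add_xml({
    "id": "6939454",
    "title_display": "Wittgenstein, aesthetics, and philosophy",
})
print(message)
```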
![Page 40: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/40.jpg)
![Page 41: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/41.jpg)
![Page 42: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/42.jpg)
Access to Raw Formats
Raw MARC stored for merged record
Live calls made to Fedora web services
![Page 43: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/43.jpg)
Data Refresh
Bibliographic: weekly
Circulation status: nightly
![Page 44: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/44.jpg)
![Page 45: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/45.jpg)
For more information, see http://sdg.library.wisc.edu/blog/2010/03/03/solr-marc-indexing-based-on-diffs/
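The linked post's title points to indexing based on diffs. Its details are not reproduced in these slides, so the following is only a generic sketch of the idea: keep a checksum per record from the previous extract and reindex only what changed. The function names and the choice of SHA-1 are assumptions.

```python
import hashlib


def checksum(raw_record):
    """Fingerprint a raw record so changes can be detected without storing it."""
    return hashlib.sha1(raw_record).hexdigest()


def diff_extract(previous_sums, current_records):
    """Compare this run's records against checksums saved from the last run.

    previous_sums: dict of record id -> checksum from the previous extract.
    current_records: dict of record id -> raw record bytes from this extract.
    Returns (ids to add or reindex, ids to delete, checksums to save for next time).
    """
    current_sums = {rec_id: checksum(raw) for rec_id, raw in current_records.items()}
    changed = sorted(i for i, s in current_sums.items() if previous_sums.get(i) != s)
    deleted = sorted(i for i in previous_sums if i not in current_sums)
    return changed, deleted, current_sums


last_run = {"1": checksum(b"old title"), "2": checksum(b"same"), "3": checksum(b"gone")}
this_run = {"1": b"new title", "2": b"same", "4": b"brand new"}
changed, deleted, saved = diff_extract(last_run, this_run)
print(changed, deleted)
```

With millions of records and a weekly refresh, touching only the changed subset keeps the indexing window much shorter than a full rebuild would.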
![Page 46: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/46.jpg)
Web Application
![Page 47: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/47.jpg)
Frontend?
• (X)HTML
• JavaScript
• Cascading Style Sheets
• Design
  – Information Architecture
  – User experience
  – Chrome (images, icons, pretty)
![Page 48: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/48.jpg)
Forward Colophon
• ActiveRecordBaseWithoutTable (Rails plugin)
• Apache
• Blacklight (Rails plugin)
• Blueprint CSS
• Bookreader (jQuery)
• Capistrano
• Crontab
• Engines (Rails plugin)
• Fedora
• Freebase API
• GeoIP (Ruby gem)
• Google Books API
• Haml (Rails plugin)
• Happymapper (Ruby gem)
• HathiTrust API
• jQuery
• Ken (Ruby gem)
• LowPro (Prototype JS)
• MARC4J
• Passenger (modrails)
• Prototype JS
• PostgreSQL
• Raphael
• Ruby on Rails
• Shibboleth
• Subversion
• Solr / Lucene
• Summon (Ruby gem)
• UW-Madison Libraries Staff Directory API
• UWDC (Rails plugin)
• Voyager API
• Tender love and attention
![Page 49: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/49.jpg)
Campus Affiliation
Users localize to a school, which allows us to scope many features to their campus.
GeoIP (Ruby gem): matches IP addresses with physical locations.
Raphaël (JavaScript library): “Small JavaScript library that should simplify your work with vector graphics on the web”.
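To illustrate the localization idea, the sketch below maps a visitor's IP address to a campus. Forward uses the GeoIP Ruby gem for this; the example swaps in a static table of networks instead, and both the campus names and the address ranges (reserved documentation ranges) are placeholders.

```python
import ipaddress

# Hypothetical campus networks, using address ranges reserved for documentation.
CAMPUS_NETWORKS = {
    "Madison": ipaddress.ip_network("192.0.2.0/24"),
    "Milwaukee": ipaddress.ip_network("198.51.100.0/24"),
}


def campus_for(ip, default=None):
    """Return the campus whose network contains the address, or the default."""
    address = ipaddress.ip_address(ip)
    for campus, network in CAMPUS_NETWORKS.items():
        if address in network:
            return campus
    return default


print(campus_for("192.0.2.17"))
print(campus_for("203.0.113.5", default="All campuses"))
```

A guess made from the IP address is best treated as a default that the user can override, since off-campus visitors will not match any network.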
![Page 50: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/50.jpg)
Raphaël
SVG elements, like the circles and squares in the Forward splash page, can be treated as XHTML elements, which allows us to manipulate them with JavaScript and CSS.
http://raphaeljs.com/
![Page 51: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/51.jpg)
Campus Homepage
Forward application stack:
• Apache+Passenger (modrails)
• Ruby on Rails
• PostgreSQL
• Apache Solr
![Page 52: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/52.jpg)
Apache+Passenger
Phusion Passenger is an Apache module, which makes deploying Ruby and Ruby on Rails applications on Apache a breeze.
http://www.modrails.com/
![Page 53: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/53.jpg)
Ruby on Rails
“Ruby on Rails is an open-source web framework that’s optimized for programmer happiness and sustainable productivity.”
http://rubyonrails.org/
![Page 54: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/54.jpg)
PostgreSQL
“PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness.”
http://www.postgresql.org/
![Page 55: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/55.jpg)
Apache Solr
“Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling.”
http://lucene.apache.org/solr/
![Page 56: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/56.jpg)
Results
![Page 57: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/57.jpg)
Results – Three columns
Facets | Results | Context
![Page 58: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/58.jpg)
Results – Data sources
Facets: Solr
Results: Solr + PostgreSQL + APIs
Context: APIs
![Page 59: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/59.jpg)
Results – Facets – Solr
Facets
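Since the facet column is driven by Solr, a facet request might be built as in the sketch below. The parameters (`q`, `rows`, `wt`, `facet`, `facet.field`, `facet.mincount`) are standard Solr query parameters; the base URL and the facet field names are assumptions.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, query, facet_fields, rows=10):
    """Build a Solr select URL that returns results plus facet counts."""
    params = [
        ("q", query),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),  # hide facet values with no matching records
    ]
    # facet.field may repeat, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = facet_query_url(
    "http://localhost:8983/solr",  # hypothetical Solr location
    "wittgenstein",
    ["format_facet", "language_facet"],
)
print(url)
```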
![Page 60: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/60.jpg)
Results – Solr + PostgreSQL + APIs
Results
![Page 61: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/61.jpg)
Results – Context – APIs
Context
Bing API, Freebase API, Google API, Libraries Staff Directory API
![Page 62: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/62.jpg)
Results – Three main columns
Facets | Results | Context
![Page 63: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/63.jpg)
Results – CSS grid
Facets | Results | Context
![Page 64: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/64.jpg)
Blueprint
“Blueprint is a CSS framework, which aims to cut down on your development time. It gives you a solid foundation to build your project on top of, with an easy-to-use grid, sensible typography, useful plugins, and even a stylesheet for printing.”
http://blueprintcss.org/
![Page 65: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/65.jpg)
![Page 66: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/66.jpg)
Show – Book
![Page 67: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/67.jpg)
Show – Image
![Page 68: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/68.jpg)
Show – Full Text Book
![Page 69: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/69.jpg)
Show – View Full Text Book
![Page 70: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/70.jpg)
BookReader
“The Internet Archive BookReader is used to view books from the Internet Archive online and can also be used to view other books.”
http://github.com/openlibrary/bookreader
![Page 71: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/71.jpg)
Challenges
![Page 72: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/72.jpg)
Challenges
• Merging MARC, METS extracts
• Batch processing time (Time/CPU constraints)
• Page level indexing (Bookviewer - memory/disk constraints)
• Voyager API
• Organization challenges
  – big project, small shop
  – dealing with vendor silos
  – multiple cataloging standards
  – quality of services challenges
![Page 73: Forward in Reverse](https://reader030.fdocuments.in/reader030/viewer/2022012922/54b350314a7959ae248b45b0/html5/thumbnails/73.jpg)
Thanks!