IGeLU 2014
Analytics @ Lancaster University Library
IGeLU 2014
John Krug, Systems and Analytics Manager, Lancaster University Library
http://www.slideshare.net/jhkrug/igelu-analytics-2014
• We are in Lancaster in the UK North West.
• ~ 12,000 FTE students, ~ 2300 FTE Staff
• Library has 55 FTE staff, building refurbishment in progress
• University aims to be 10, 100 – Research, Teaching, Engagement
• Global outlook with partnerships in Malaysia, India, Pakistan and a new Ghana campus
• Alma implemented January 2013 as an early adopter.
• I am Systems and Analytics Manager; joined LUL in 2002 to implement Aleph – systems background, not library
• How can library analytics help?
Lancaster University, the Library and Alma
• Following implementation of Alma, analytics dashboards rapidly developed for common reporting tasks
• Ongoing work in this area, refining existing and developing new reports
Alma Analytics reporting and dashboards
Projects & Challenges
• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, Ezproxy, etc.
• Smaller projects:
• e.g. Re-shelving performance – requires Alma Analytics returns data along with the number of trolleys re-shelved daily.
• Challenges – Infrastructure, Skills, time
• Lots of new skills/knowledge needed for Analytics. For us: Alma analytics (OBIEE), python, Django, postgres, Tableau, nginx, openresty, lua, json, xml, xsl, statistics, data preparation, ETL, etc.
Alma analytics data extraction
• Requires using a SOAP API (thankfully a RESTful API is now available for Analytics)
• SOAP support for python not very good, much better with REST. Currently using the suds python library with a few bug fixes for compression, ‘&’ encoding, etc.
• A script get_analytics invokes the required report, manages collection of multiple ‘gets’ if the data is large and produces a single XML file result.
• Needs porting from SOAP to REST.
• Data extraction from Alma Analytics is straightforward, especially with REST
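The multi-'get' collection described above might look roughly like the following once ported to REST. This is a minimal sketch, not the actual get_analytics script: the function names, the `fetch` callable and the `base_url` parameter are hypothetical, and it assumes the REST response carries `Row`, `ResumptionToken` and `IsFinished` elements. It also returns rows as Python dicts rather than writing a single XML file.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_text(root: ET.Element, name: str) -> Optional[str]:
    """Return the text of the first element whose local name matches."""
    for elem in root.iter():
        if local_name(elem.tag) == name:
            return elem.text
    return None


def collect_report(fetch: Callable[[Dict[str, str]], str],
                   path: str, limit: int = 1000) -> List[Dict[str, str]]:
    """Request pages until the service says it is finished; merge all rows."""
    rows: List[Dict[str, str]] = []
    params = {"path": path, "limit": str(limit)}
    token = None
    while True:
        root = ET.fromstring(fetch(params))
        for elem in root.iter():
            if local_name(elem.tag) == "Row":
                rows.append({local_name(c.tag): (c.text or "") for c in elem})
        # Assumption: the token appears in the first page and is reused after.
        token = token or find_text(root, "ResumptionToken")
        finished = (find_text(root, "IsFinished") or "true").strip().lower()
        if finished == "true":
            return rows
        if not token:
            raise RuntimeError("report not finished but no resumption token seen")
        params = {"token": token, "limit": str(limit)}


def make_http_fetch(base_url: str, api_key: str) -> Callable[[Dict[str, str]], str]:
    """Build a fetch callable for a real endpoint (base_url is supplied by the caller)."""
    def fetch(params: Dict[str, str]) -> str:
        query = urlencode({**params, "apikey": api_key})
        with urlopen(f"{base_url}?{query}") as resp:
            return resp.read().decode("utf-8")
    return fetch
```

Passing the fetch step in as a callable keeps the paging logic testable without network access; `make_http_fetch` is only one possible way to supply it.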
• Ezproxy logs
• Enquiry/exit desk query statistics
• Re-shelving performance data
• Shibboleth logs, hopefully soon. We are dependent on central IT services
• Library building usage counts
• Library PC usage statistics
• JUSP & USTAT aggregate usage data
• University faculty and department data
• Social networking
• New Alma Analytics subject areas, especially uResolver data
Data from other places
• Currently we have aggregate data from JUSP, USTAT
• Partial off campus picture from ezproxy, but web orientated rather than resource
• Really want the data from Shibboleth and uResolver
• Why the demand for such low level data about individuals?
Gaps in the electronic resource picture
The library and learner analytics
• Learner analytics a growth field
• Driven by a mass of data from VLEs and MOOCs … and libraries
• Student satisfaction & retention
• Intervention(?)
• if low(library borrowing) &
     low(eresource access) &
     high(rate of near late or late submissions) &
     low_to_middling(grades)
  then
     do_something()
• The library can’t do all that, but the university could/can
• Library can provide data
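The intervention rule in the pseudocode above could be sketched in Python as follows. The field names and every threshold are hypothetical placeholders chosen only for illustration; the slides do not define what counts as "low" or "high".

```python
from dataclasses import dataclass


@dataclass
class StudentActivity:
    loans: int                   # physical loans in the period
    eresource_sessions: int      # e-resource accesses in the period
    late_submission_rate: float  # share of submissions that were late or nearly late
    average_grade: float         # percentage


def needs_follow_up(s: StudentActivity, *,
                    max_loans: int = 2,
                    max_sessions: int = 5,
                    min_late_rate: float = 0.3,
                    grade_ceiling: float = 55.0) -> bool:
    """True only when all four signals from the rule hold together.

    All thresholds are illustrative assumptions, not values from the talk.
    """
    return (s.loans <= max_loans
            and s.eresource_sessions <= max_sessions
            and s.late_submission_rate >= min_late_rate
            and s.average_grade <= grade_ceiling)
```

The library could supply the first two fields; the last two would have to come from elsewhere in the university, which is the point the slide makes.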
The library as data provider
• LAMP – Library Analytics & Metrics Project from JISC
• http://jisclamp.mimas.ac.uk
• We will be exporting loan and anonymised student data for use by LAMP.
• They are experimenting with dashboards and applications
• Prototype application later this year.
• Overlap with our own project LDIV
• The Library API
• For use by analytics projects within the university
• Planning office, Student Services and others
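The slides do not say how student data is anonymised before export. One common approach, shown here purely as an assumption, is to replace each patron identifier with a keyed hash so that loans can still be grouped per person without exposing the real ID. Strictly this is pseudonymisation rather than full anonymisation; the function name, secret and output length are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patron_id: str, secret: bytes, length: int = 24) -> str:
    """Map a patron ID to a stable opaque hex token using HMAC-SHA256.

    The same ID and secret always give the same token, so records for one
    patron stay linked; without the secret the mapping cannot be recomputed.
    """
    digest = hmac.new(secret, patron_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]
```

Keeping the secret inside the library would mean that downstream users cannot recompute tokens from known IDs.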
The Library API
• Built using openresty, nginx, lua
• REST-like API interface
• e.g. Retrieve physical loans for a patron
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json)
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <call_no>AZKF.S75 (H)</call_no>
    <loan_date>2014-07-10 15:44:00</loan_date>
    <num_renewals>0</num_renewals>
    <bor_status>03</bor_status>
    <rowid>3212</rowid>
    <returned_date>2014-08-15 10:16:00</returned_date>
    <collection>MAIN</collection>
    <rownum>1</rownum>
    <material>BOOK</material>
    <patron>b3ea5253dd4877c94fa9fac9</patron>
    <item_status>01</item_status>
    <call_no_2>B Floor Red Zone</call_no_2>
    <bor_type>34</bor_type>
    <key>000473908000010-200208151016173</key>
    <due_date>2015-06-19 19:00:00</due_date>
  </record>
</response>
[{
  "rownum": 1,
  "key": "000473908000010-200208151016173",
  "patron": "b3ea5253dd4877c94fa9fac9",
  "loan_date": "2014-07-10 15:44:00",
  "due_date": "2015-06-19 19:00:00",
  "returned_date": "2014-08-15 10:16:00",
  "item_status": "01",
  "num_renewals": 0,
  "material": "BOOK",
  "bor_status": "03",
  "bor_type": "34",
  "call_no": "AZKF.S75 (H)",
  "call_no_2": "B Floor Red Zone",
  "collection": "MAIN",
  "rowid": 3212
}]
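A consumer of this API, for example an analytics project elsewhere in the university, might call it along these lines. This is only a sketch: the helper names are hypothetical, the query parameters are the ones shown in the GET request above, and the date format is inferred from the sample response.

```python
import json
from datetime import datetime
from typing import Any, Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # inferred from the sample record


def ploans_url(base_url: str, patron: str, start: int = 1,
               number: int = 10, fmt: str = "json") -> str:
    """Build the request URL for a patron's physical loans."""
    query = urlencode({"start": start, "number": number, "format": fmt})
    return f"{base_url}/ploans/{patron}?{query}"


def parse_loans(body: str) -> List[Dict[str, Any]]:
    """Decode the JSON form of the response into a list of loan records."""
    return json.loads(body)


def loan_days(record: Dict[str, Any]) -> int:
    """Whole days between loan and return for one record."""
    loaned = datetime.strptime(record["loan_date"], DATE_FORMAT)
    returned = datetime.strptime(record["returned_date"], DATE_FORMAT)
    return (returned - loaned).days


def fetch_loans(base_url: str, patron: str, **kwargs: Any) -> List[Dict[str, Any]]:
    """Fetch and decode loans over HTTP (requires access to the service)."""
    with urlopen(ploans_url(base_url, patron, **kwargs)) as resp:
        return parse_loans(resp.read().decode("utf-8"))
```

With the base URL from the slides, `ploans_url("http://lib-ldiv.lancs.ac.uk:8080", "0010215", start=45, number=1)` reproduces the JSON variant of the request shown above.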
How does it work?
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml
• Nginx configuration maps REST url to database query
location ~ /ploans/(?<patron>\w+) {
    ## collect and/or set default parameters
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
}
location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
    postgres_pass database;
    rds_json on;
    postgres_query HEAD GET "select * from ploans where patron = $patron
        and row >= $start and row < $start + $nrows";
}
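The same URL-to-query mapping can be expressed in Python to make the logic easier to follow. This is an illustrative sketch rather than the nginx implementation: the default values for `start` and `number` are assumptions, the paging column is called `rownum` here (the name used in the sample response), and the query uses bound parameters instead of string interpolation.

```python
import sqlite3
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


def ploans_query(url: str) -> Tuple[str, Tuple[str, int, int]]:
    """Translate a /ploans/<patron>?start=&number= URL into SQL plus parameters."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2 or segments[0] != "ploans":
        raise ValueError(f"not a ploans URL: {url}")
    patron = segments[1]
    query = parse_qs(parts.query)
    start = int(query.get("start", ["1"])[0])    # assumed default
    nrows = int(query.get("number", ["10"])[0])  # assumed default
    sql = ("SELECT * FROM ploans WHERE patron = ? "
           "AND rownum >= ? AND rownum < ? ORDER BY rownum")
    return sql, (patron, start, start + nrows)


def run_query(conn: sqlite3.Connection, url: str) -> list:
    """Execute the mapped query against any DB-API connection using '?' parameters."""
    sql, params = ploans_query(url)
    return conn.execute(sql, params).fetchall()
```

Binding the patron value as a parameter avoids building SQL from request text, which is worth considering whichever layer ends up doing the mapping.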
Proxy for making Alma Analytics API requests
• e.g. Analytics report which produces
• nginx configuration
• So users of our API can get data directly from Alma Analytics, and we manage the interface they use and shield them from any API changes at Ex Libris.
location /aa/patron_count {
    set $b "api-na.hosted.exlibri … lytics/reports";
    set $p "path=%2Fshared%2FLancas … tron_count";
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
    proxy_pass https://$b?$p&$k;
}
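The idea behind the proxy is that callers only ever see a stable local path, while the upstream address, report path and API key live in one place. A rough Python equivalent of that mapping is sketched below; the registry contents, base URL and key used in the example are hypothetical stand-ins, since the real values are elided in the slide.

```python
from typing import Dict
from urllib.parse import urlencode

# Hypothetical registry: local endpoint -> upstream report path.
REPORTS: Dict[str, str] = {
    "/aa/patron_count": "/shared/Example/Reports/patron_count",
}


def upstream_url(local_path: str, *, base_url: str, api_key: str,
                 reports: Dict[str, str] = REPORTS) -> str:
    """Resolve a local endpoint to the full upstream request URL.

    If the upstream API changes, only base_url or the registry needs
    updating; callers keep using the same local_path.
    """
    try:
        report_path = reports[local_path]
    except KeyError:
        raise LookupError(f"no report registered for {local_path}") from None
    query = urlencode({"path": report_path, "apikey": api_key})
    return f"{base_url}?{query}"
```

For example, `upstream_url("/aa/patron_count", base_url="https://upstream.example/reports", api_key="SECRET")` yields a URL whose `path` value is percent-encoded in the same way as in the nginx configuration.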
Re-thinking approaches
• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability
• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from ezproxy logs, GeoIP databases, student data, primo logs, a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard to these data sets
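As an illustration of the extract-and-transform step involved in importing EZproxy logs, the sketch below parses one log line into a record ready for loading. It assumes the log uses the common NCSA-style layout (host, ident, user, timestamp, request, status, bytes); a site with a customised log format would need a different pattern. The function and field names are hypothetical.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional
from urllib.parse import urlsplit

# Assumed layout: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Turn one log line into a dict, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    method, _, rest = m["request"].partition(" ")
    # Drop the trailing protocol token ("HTTP/1.1") when present.
    url = rest.rsplit(" ", 1)[0] if " " in rest else rest
    return {
        "host": m["host"],
        "user": m["user"],
        "time": datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "method": method,
        "url": url,
        "resource_host": urlsplit(url).hostname or "",
        "status": int(m["status"]),
        "bytes": 0 if m["size"] == "-" else int(m["size"]),
    }
```

The `resource_host` field is one way to move from the web-orientated view the logs give towards a per-resource view, and `host` is the value a GeoIP lookup would take.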