IGeLU 2014

Analytics @ Lancaster University Library IGeLU 2014 John Krug, Systems and Analytics Manager, Lancaster University Library http://www.slideshare.net/jhkrug/igelu-analytics-2014


• We are in Lancaster, in the North West of the UK.

• ~12,000 FTE students, ~2,300 FTE staff

• The Library has 55 FTE staff; a building refurbishment is in progress

• University aims to be 10, 100 – Research, Teaching, Engagement

• Global outlook with partnerships in Malaysia, India, Pakistan and a new Ghana campus

• Alma implemented January 2013 as an early adopter.

• I am the Systems and Analytics Manager, at LUL since 2002 (originally to implement Aleph) – a systems background, not a library one

• How can library analytics help?

Lancaster University, the Library and Alma

• Following implementation of Alma, analytics dashboards rapidly developed for common reporting tasks

• Ongoing work in this area, refining existing and developing new reports

Alma Analytics reporting and dashboards

Results

Fun with BLISS

347 lines of this!

B Floor 9AZ (B)

Projects & Challenges

• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, EZproxy, etc.

• Smaller projects:
• e.g. re-shelving performance – this required using Alma Analytics returns data along with the number of trolleys re-shelved daily.

• Challenges – infrastructure, skills, time
• Lots of new skills/knowledge needed for analytics. For us: Alma Analytics (OBIEE), Python, Django, Postgres, Tableau, nginx, OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.

Alma analytics data extraction

• Requires using a SOAP API (thankfully a RESTful API is now available for Analytics)

• SOAP support for Python is not very good; things are much better with REST. Currently using the suds Python library with a few bug fixes for compression, ‘&’ encoding, etc.

• A script, get_analytics, invokes the required report, manages the collection of multiple ‘gets’ if the data is large, and produces a single XML file as the result.

• Needs porting from SOAP to REST.
• Data extraction from Alma Analytics is straightforward, especially with REST
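The collection of multiple ‘gets’ mentioned above boils down to a paging loop. A minimal sketch, assuming the report XML carries IsFinished and ResumptionToken elements as the Analytics API does; the function names are ours, not from the talk, and fetch_page would wrap the actual HTTP GET:

```python
import xml.etree.ElementTree as ET

def _text(root, name):
    # Match on the local tag name so XML namespaces do not matter.
    for el in root.iter():
        if el.tag.rsplit("}", 1)[-1] == name:
            return el.text
    return None

def collect_rows(fetch_page):
    """Page through an Analytics report until IsFinished is true.

    fetch_page(token) must return one page of report XML as text; in
    real use it would issue an HTTP GET against the Analytics API,
    passing the resumption token back on each subsequent request."""
    rows, token = [], None
    while True:
        root = ET.fromstring(fetch_page(token))
        rows += [el for el in root.iter()
                 if el.tag.rsplit("}", 1)[-1] == "Row"]
        if (_text(root, "IsFinished") or "").strip().lower() == "true":
            return rows
        token = _text(root, "ResumptionToken") or token
```

Collecting all pages into one result before writing it out is what lets the rest of the pipeline treat a large report as a single XML file.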

• Ezproxy logs

• Enquiry/exit desk query statistics

• Re-shelving performance data

• Shibboleth logs, hopefully soon. We are dependent on central IT services

• Library building usage counts

• Library PC usage statistics

• JUSP & USTAT aggregate usage data

• University faculty and department data

• Social networking

• New Alma Analytics subject areas, especially uResolver data

Data from other places

• Currently we have aggregate data from JUSP, USTAT

• Partial off-campus picture from EZproxy, but web-oriented rather than resource-oriented

• Really want the data from Shibboleth and uResolver

• Why the demand for such low level data about individuals?

Gaps in the electronic resource picture

The library and learner analytics

• Learner analytics is a growth field
• Driven by a mass of data from VLEs and MOOCs … and libraries
• Student satisfaction & retention
• Intervention(?)

• if low(library borrowing) & low(eresource access) &
     high(rate of near late or late submissions) &
     low_to_middling(grades)
  then do_something()

• The library can’t do all that, but the university could/can
• The library can provide data
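The intervention rule on the slide could be wired up roughly as follows. This is a toy sketch: the measure names, the percentile scaling, and the thresholds are all illustrative assumptions, not anything the library actually runs:

```python
def flag_for_intervention(stats, low=0.25, high=0.75):
    """Toy version of the slide's rule. `stats` maps each measure to a
    0..1 percentile within the cohort; the thresholds are made up."""
    return (stats["library_borrowing"] <= low
            and stats["eresource_access"] <= low
            and stats["late_submissions"] >= high
            and stats["grades"] <= 0.5)          # low-to-middling

def do_something(student_id):
    # Placeholder: in practice this might prompt a tutor contact.
    print(f"flagging {student_id} for follow-up")

cohort = {
    "s1": {"library_borrowing": 0.1, "eresource_access": 0.2,
           "late_submissions": 0.9, "grades": 0.4},
    "s2": {"library_borrowing": 0.8, "eresource_access": 0.7,
           "late_submissions": 0.1, "grades": 0.9},
}
for sid, stats in cohort.items():
    if flag_for_intervention(stats):
        do_something(sid)
```

The point of the sketch is the division of labour: the library supplies the borrowing and e-resource measures, while the university holds the submission and grade data needed to evaluate the whole rule.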

The library as data provider

• LAMP – Library Analytics & Metrics Project from JISC
• http://jisclamp.mimas.ac.uk
• We will be exporting loan and anonymised student data for use by LAMP.
• They are experimenting with dashboards and applications
• Prototype application later this year.
• Overlap with our own project, LDIV

• The Library API
• For use by analytics projects within the university
• Planning office, Student Services and others

The Library API

• Built using OpenResty, nginx and Lua
• REST-like API interface
• e.g. Retrieve physical loans for a patron

• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json)

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <call_no>AZKF.S75 (H)</call_no>
    <loan_date>2014-07-10 15:44:00</loan_date>
    <num_renewals>0</num_renewals>
    <bor_status>03</bor_status>
    <rowid>3212</rowid>
    <returned_date>2014-08-15 10:16:00</returned_date>
    <collection>MAIN</collection>
    <rownum>1</rownum>
    <material>BOOK</material>
    <patron>b3ea5253dd4877c94fa9fac9</patron>
    <item_status>01</item_status>
    <call_no_2>B Floor Red Zone</call_no_2>
    <bor_type>34</bor_type>
    <key>000473908000010-200208151016173</key>
    <due_date>2015-06-19 19:00:00</due_date>
  </record>
</response>

[{
  "rownum": 1,
  "key": "000473908000010-200208151016173",
  "patron": "b3ea5253dd4877c94fa9fac9",
  "loan_date": "2014-07-10 15:44:00",
  "due_date": "2015-06-19 19:00:00",
  "returned_date": "2014-08-15 10:16:00",
  "item_status": "01",
  "num_renewals": 0,
  "material": "BOOK",
  "bor_status": "03",
  "bor_type": "34",
  "call_no": "AZKF.S75 (H)",
  "call_no_2": "B Floor Red Zone",
  "collection": "MAIN",
  "rowid": 3212
}]
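A caller of this API only needs to build the URL and decode the body. A minimal standard-library sketch; the loans_url and get_loans names are ours, not part of the talk:

```python
import json
import urllib.request

BASE = "http://lib-ldiv.lancs.ac.uk:8080"

def loans_url(patron, start=0, number=10, fmt="json", base=BASE):
    # Mirrors the GET request shown above.
    return f"{base}/ploans/{patron}?start={start}&number={number}&format={fmt}"

def get_loans(patron, **kw):
    # Fetch and decode one page of loans as a list of dicts.
    with urllib.request.urlopen(loans_url(patron, **kw)) as resp:
        return json.loads(resp.read())
```

For example, get_loans("0010215", start=45, number=1) would return the JSON record shown above as a one-element Python list.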

How does it work?

• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml

• nginx configuration maps the REST URL to a database query

location ~ /ploans/(?<patron>\w+) {
    ## collect and/or set default parameters
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
}

location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
    postgres_pass  database;
    rds_json       on;
    postgres_query HEAD GET "select * from ploans where patron = $patron
                             and row >= $start and row < $start + $nrows";
}

Proxy for making Alma Analytics API requests

• e.g. an Analytics report (here, a patron count)
• nginx configuration:

• So users of our API can get data directly from Alma Analytics, and we manage the interface they use and shield them from any API changes at Ex Libris.

location /aa/patron_count {
    set $b "api-na.hosted.exlibri … lytics/reports";
    set $p "path=%2Fshared%2FLancas … tron_count";
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
    proxy_pass https://$b?$p&$k;
}

Re-thinking approaches

• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability

• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from EZproxy logs, GeoIP databases, student data, Primo logs, and a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard to these data sets

Preliminary results

More at http://public.tableausoftware.com/profile/john.krug#!/

• First UK Analytics SIG meeting Oct 14 following EPUG-UKI AGM

• Questions?