METRIDOC: A Framework for Managing and Exposing Library Event Data
With the support of
University of Pennsylvania Libraries
METRIDOC University of Pennsylvania Libraries
Metrics start with a basic abstraction:
The Event
xxx.xx.xxx.xxx|-|zucca|[26/Jul/2007:15:41:01 -0500]| GET https://proxy.library.upenn.edu:443/login?proxySessionID=10335905&url=http://www.csa.com/htbin/dbrng.cgi?username=upenn3&access=upenn34&cat=psycinfo&adv=1 HTTP/1.1| 302|0|http://www.library.upenn.edu/cgibin/res/sr.cgi?community=59| Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418.9.1 (KHTML, like Gecko) Safari/419.3| NGpmb6dT6JXswQH|__utmc=94565761;ezproxy=NGpmb6dT6JXswQH; hp=/; proxySessionID=10335514; __utmc=247612227; __utmz=247612227.1184251774.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);UPennLibrary=AAAAAUaWP5oAACa4AwOOAg==; sfx_session_id=s6A37A3E0-3B8E-11DC-80E985076F88F67F
Viewing an Ejournal article. The Event as raw data
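The raw line above can be turned into named fields with a few lines of code. The following is a minimal sketch in Python, not the parser MetriDoc uses: the field names are hypothetical labels chosen for illustration, and the sample is an abbreviated copy of the line above with the long URL, user agent, and cookie values trimmed.

```python
from datetime import datetime

# Hypothetical field names for the pipe-delimited proxy log line shown above.
FIELDS = [
    "ip", "ident", "user", "timestamp", "request", "status",
    "bytes", "referer", "user_agent", "session", "cookies",
]


def parse_event(line):
    """Split one raw log line into a dict of named, typed fields."""
    # The cookie field itself contains "|", so split only on the first ten.
    parts = [p.strip() for p in line.split("|", len(FIELDS) - 1)]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    event = dict(zip(FIELDS, parts))
    event["timestamp"] = datetime.strptime(
        event["timestamp"].strip("[]"), "%d/%b/%Y:%H:%M:%S %z"
    )
    event["status"] = int(event["status"])
    event["bytes"] = int(event["bytes"])
    # The request field is "<method> <url> <protocol>".
    method, rest = event["request"].split(" ", 1)
    url, protocol = rest.rsplit(" ", 1)
    event.update(method=method, url=url, protocol=protocol)
    return event


# Abbreviated copy of the sample line (long values trimmed for readability).
SAMPLE = (
    "xxx.xx.xxx.xxx|-|zucca|[26/Jul/2007:15:41:01 -0500]| "
    "GET https://proxy.library.upenn.edu:443/login?proxySessionID=10335905 HTTP/1.1| "
    "302|0|http://www.library.upenn.edu/cgibin/res/sr.cgi?community=59| "
    "Mozilla/5.0| NGpmb6dT6JXswQH|"
    "__utmc=94565761; __utmz=247612227.1184251774.1.1."
    "utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)"
)

event = parse_event(SAMPLE)
print(event["user"], event["status"], event["timestamp"].isoformat())
```

Limiting the split to the first ten separators keeps the cookie string intact, since it contains pipe characters of its own.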
An Event Abstracted
User & Program Parameters: College | Dept, Rank, Course, Host College, Host Dept, Instructor, Grant Sponsor
Library Parameters: Service Genre, Cognizant Staff, Organizational Unit, Budget Center
Bibliographic Parameters: Title, URI, Format, Cost | Supplier
Environmental Parameters: Date | Time, Location, IP Domain, URL
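One way to picture the abstracted Event is as a record with one slot per parameter family. This is only an illustrative sketch: the class name, key names, and sample values are hypothetical, and the assignment of individual keys to the bibliographic and environmental families is an assumption.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Event:
    """One library event, grouped by the four parameter families."""
    user_program: dict = field(default_factory=dict)
    library: dict = field(default_factory=dict)
    bibliographic: dict = field(default_factory=dict)
    environmental: dict = field(default_factory=dict)

    def flatten(self):
        """Return a single-level dict with keys like 'library.service_genre'."""
        return {
            f"{group}.{key}": value
            for group, params in asdict(self).items()
            for key, value in params.items()
        }


# Sample values are made up for illustration.
event = Event(
    user_program={"rank": "Faculty", "dept": "Psychology"},
    library={"service_genre": "E-resource"},
    bibliographic={"title": "PsycINFO", "format": "Database"},
    environmental={"date": "2007-07-26", "ip_domain": "upenn.edu"},
)
flat = event.flatten()
print(sorted(flat))
```

Flattening to dotted keys is one convenient shape for loading such a record into a table or spreadsheet.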
The “Event” is represented in machine-readable data, stored in a plethora of business systems.
Event types:
• E-resource use by service, demographic, package
• Expenditures & inventory planning / reader interest data
• Supply chain data
• Discovery systems & content use
• Research & instructional data / learning management
• Other events…
Source targets:
• Link resolver
• Proxy server
• COUNTER
• ILS (Voyager, I3, Kuali-OLE)
• Resource sharing system
• Web server
• Social networking services
• Spreadsheets, databases
• Other targets…
MetriDoc is a framework for:
• Extracting event data from systems
• Transforming those data into readable, normalized formats
• Loading the transformed, normalized payload into a repository
• Supporting analysis through local and collaborative dissemination channels
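The extract, transform, and load steps listed above can be sketched as three small functions chained together. This is a generic illustration in Python using an in-memory SQLite database, not MetriDoc's own API; the record layout, table name, and sample records are made up.

```python
import sqlite3


def extract(lines):
    """Extract: yield raw records from a source, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield line.rstrip("\n")


def transform(records):
    """Transform: normalize each raw record into a (username, resource, status) row."""
    for record in records:
        username, resource, status = (part.strip() for part in record.split("|"))
        yield username.lower(), resource.lower(), int(status)


def load(rows, conn):
    """Load: write normalized rows into the repository table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(username TEXT, resource TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()


# Made-up sample records standing in for a real source system.
raw = ["Zucca | PsycINFO | 302\n", "\n", "smith | JSTOR | 200\n"]

conn = sqlite3.connect(":memory:")
load(transform(extract(raw)), conn)

# Once loaded, the repository supports simple analysis queries.
counts = conn.execute(
    "SELECT resource, COUNT(*) FROM events GROUP BY resource ORDER BY resource"
).fetchall()
print(counts)
```

Because each stage is a generator, records stream through the pipeline one at a time rather than being held in memory all at once.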
Improved Data Resolution Through Integration
• Increased scope of sources
• Synthesis of vectors, e.g. expenditure per use, resource use by communities
• Contextualized data with greater statistical dimension and descriptive power
• Collaborative assessment
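As a small worked example of synthesizing two vectors, expenditure per use combines cost data from one system with use events from another. The sketch below uses made-up package names and figures, and the function name is hypothetical.

```python
from collections import Counter


def cost_per_use(costs, use_events):
    """Divide each package's cost by its number of recorded uses.

    Packages with no recorded use map to None rather than dividing by zero.
    """
    uses = Counter(use_events)
    return {
        package: (cost / uses[package] if uses[package] else None)
        for package, cost in costs.items()
    }


# Illustrative figures only: annual cost per package, and one entry per use event.
costs = {"package_a": 1200.0, "package_b": 450.0, "package_c": 300.0}
use_events = ["package_a"] * 400 + ["package_b"] * 90

ratios = cost_per_use(costs, use_events)
print(ratios)
```

Here the cost figures would come from an acquisitions or budget system and the use events from proxy or link-resolver logs, which is the kind of integration the slide describes.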
Our legacy system: Datafarm
[Diagram: three groups of Perl scripts, each with a cron entry; systems labeled Voyager, Farmer, Quaker, App Logs]
Datafarm Shortcomings
Maintainability issues
• Scripts that depend on each other are located in different places
• Perl is very productive as long as you are maintaining your own code
• Doing the same thing over again, with no code reuse
• Lack of notification for success and failure
Not shareable
• No safe way to expose data for collaboration
• Generating data for a report can be a job in itself
• Schemas are not stored in a sharable format
Not reusable
• Doing the same thing over and over again without building libraries for common tasks
• No central code repository to share libraries within and outside of UPenn
What we need | Who takes care of it
A central scheduler | Jenkins
Notifications of job success or failure | Jenkins
Batch job / ETL scripting framework | Metridoc
Exposing data | Metridoc, Google data format
Reporting / graphs | Google Charts / R / Tableau / other stat packages
Central code repository | Maven Central via Sonatype hosting
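To make the "exposing data" row concrete, the sketch below shapes query results as a Google Charts DataTable-style JSON object with `cols` and `rows`. It assumes that "Google data format" refers to that structure, which the slides do not confirm; the function name and sample figures are hypothetical, and this is not MetriDoc's own code.

```python
import json


def to_datatable(columns, rows):
    """Build a DataTable-style dict from (name, type) columns and row tuples."""
    cols = [
        {"id": name, "label": name.title(), "type": kind}
        for name, kind in columns
    ]
    body = [{"c": [{"v": value} for value in row]} for row in rows]
    return {"cols": cols, "rows": body}


# Made-up usage counts per resource.
table = to_datatable(
    [("resource", "string"), ("uses", "number")],
    [("psycinfo", 12), ("jstor", 7)],
)
payload = json.dumps(table)
print(payload)
```

A charting client or a statistics package could then consume the same JSON payload, which keeps the reporting tools independent of the repository schema.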
Current System: Metridoc
[Diagram: three groups of Perl scripts, each with a cron entry; systems labeled Voyager, Farmer, Quaker, App Logs]
Metridoc Philosophy
Scripting Framework
Scripting Example
Exposing data
Metrics on the cheap (google charts)
Thoughts on complex statistics
The future
Abstracts 4 key functions, exposes interfaces for interoperability:
1. Extract: target source (e.g. Relais, ILLiad, ILS); ingest log; parse; format; refined output
2. Transform: refined output; resolution sources (e.g. IdM, WorldCat); resolve codes & IDs; normalize
3. Load: data repository; query service
4. Query: user interface; local data stores; query document; results document
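The "resolve codes & IDs" and "normalize" parts of the Transform function, followed by a Query, might look roughly like the sketch below. The lookup table stands in for a resolution source such as an identity management system; all names, codes, and records are hypothetical.

```python
# Hypothetical resolution source: maps local department codes to names.
DEPT_CODES = {"PSYC": "Psychology", "HIST": "History"}


def resolve(record, lookup):
    """Resolve codes & IDs: replace a department code with its name."""
    resolved = dict(record)
    resolved["dept"] = lookup.get(record["dept_code"], "Unknown")
    del resolved["dept_code"]
    return resolved


def normalize(record):
    """Normalize: trim and lowercase the free-text title."""
    return {**record, "title": record["title"].strip().lower()}


def query(repo, **criteria):
    """Query: return records whose fields match every criterion."""
    return [
        r for r in repo
        if all(r.get(key) == value for key, value in criteria.items())
    ]


# Made-up refined output from the Extract step.
refined = [
    {"dept_code": "PSYC", "title": " PsycINFO "},
    {"dept_code": "XXXX", "title": "JSTOR"},
]

# An in-memory list stands in for the data repository.
repo = [normalize(resolve(r, DEPT_CODES)) for r in refined]
results = query(repo, dept="Psychology")
print(results)
```

Keeping each function behind a plain call signature is one way to let a partner swap in a different resolution source or repository without touching the other stages.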
Partners are welcome
Sponsor
More at http://code.google.com/p/metridoc/