EzPAARSE and ezMESURE - OCLC · 2020-06-02 · EzPAARSE and ezMESURE : Assembling national...

EzPAARSE and ezMESURE : Assembling national dashboards from locally generated and fine-grained access events to electronic resources DOMINIQUE LECHAUDEL - INIST-CNRS THOMAS JOUNEAU - UNIVERSITE DE LORRAINE


Page 1:

EzPAARSE and ezMESURE: Assembling national dashboards from locally generated and fine-grained access events to electronic resources

DOMINIQUE LECHAUDEL - INIST-CNRS

THOMAS JOUNEAU - UNIVERSITE DE LORRAINE

Page 2:

Dominique Lechaudel

Product owner of the ezMESURE project at INIST-CNRS. The CNRS is the French National Centre for Scientific Research.

Page 3:

Thomas Jouneau

E-librarian @ Université de Lorraine

ezPAARSE and ezMESURE user, co-leader of the Couperin.org « Indicators » WG, member of the Project COUNTER Executive Committee

Page 4:

EzPAARSE and ezMESURE: Assembling national dashboards from locally generated and fine-grained access events to electronic resources

Page 5:

ezPAARSE and ezMESURE

ezPAARSE

The free and open source software that produces uniform electronic resources usage data

ezMESURE

Our national repository and dashboard tool to visualize data collected by ezPAARSE

http://ezpaarse.org

Page 6:

What is ezPAARSE?

• Free open source software

• Specialized niche software

• Locally installed by institutions

• Produces uniform electronic resources usage data

BiblioMap shows ezPAARSE running live: http://bibliomap.inist.fr

Page 7:

What is ezPAARSE?

[Diagram: log files go into ezPAARSE, which produces access events files]

- Filter

- Identify

- Enrich

- Encrypt

- …

Page 8:

How does ezPAARSE work?

Log lines:

107.206.236.51 - - - [31/Dec/2017:23:06:56] "GET http://insb.bib.cnrs.fr:80/login?url=http://www.sciencedirect.com/science/journal/13698486 HTTP/1.1" 302 0 -

107.206.236.51 OvIjRz - [email protected]_O_CNRS_I_DS53_OU_UMR8197 [31/Dec/2017:23:07:28] " HTTP/1.1 GET https://www.ncbi.nlm.nih.gov:443/pmc/articles/PMC5511345/pdf/ncomms16088.pdf HTTP/1.1 " 200 108723 insb

92.91.207.211 jjGjY9Q - 16SBIUMR7255_O_CNRS_I_DS53_OU_UMR7255 [01/Jan/2018:02:44:07] " GET https://cdn.els-cdn.com:443/sd/css/css_gen_v01_1712R2.css" 200 3410106 insb

83.221.104.173 3aVq1a - [email protected]_O_CNRS_I_DS53_OU_UMR7275 [01/Jan/2018:05:02:38] "GET http://www.physiology.org:80/doi/pdf/10.1152/ajplung.00348.2002 HTTP/1.1" 200 612360 insb

91.140.193.126 NZBUs1 - [email protected]_O_OTHER_I_DS53_OU_UMR7213 [01/Jan/2018:06:53:18] "GET http://www.sciencedirect.com:80/science/article/pii/S0009308416300159/pdfft?pid=1-s2.0-S0009308416300159-main.pdf HTTP/1.1" 200 7890 insb

Access event fields: date, host, login, mime, print_identifier, platform, publication_title, doi, geoip, unitid, session

174 parsers turn log lines into access events. Parsers are dedicated to the various recognized platforms, for example:

ACM

AIP

Nature

Springer

Ovid

ScienceDirect
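The step from a log line to a set of named fields can be illustrated with a minimal Python sketch. This is not ezPAARSE's own parser code: the regular expression, the `parse_line` function, and the sample login `user42` are assumptions made for illustration, and the log layout is only inferred from the sample lines on this slide. Real parsers are platform-specific and extract far more than this.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit

# Assumed layout, inferred from the sample lines:
# host session - login [date] "METHOD url HTTP/x.y" status size ...
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<session>\S+) \S+ (?P<login>\S+) '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+?)(?: HTTP/[\d.]+)?" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)
# Loose DOI pattern, good enough for a sketch
DOI_PATTERN = re.compile(r'10\.\d{4,9}/\S+')


def parse_line(line):
    """Turn one proxy log line into a dict of fields, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    url = match["url"]
    parts = urlsplit(url)
    doi = DOI_PATTERN.search(parts.path)
    return {
        "date": datetime.strptime(match["date"], "%d/%b/%Y:%H:%M:%S"),
        "host": match["host"],
        "login": match["login"],
        "session": match["session"],
        "domain": parts.hostname,
        "url": url,
        "doi": doi.group(0) if doi else None,
        "status": int(match["status"]),
        "size": int(match["size"]),
    }


# Sample line modelled on the slide, with a made-up login
line = (
    '83.221.104.173 3aVq1a - user42 [01/Jan/2018:05:02:38] '
    '"GET http://www.physiology.org:80/doi/pdf/10.1152/ajplung.00348.2002 HTTP/1.1" '
    '200 612360 insb'
)
event = parse_line(line)
print(event["domain"], event["doi"], event["status"])
```

The domain is what would let a dispatcher pick the parser dedicated to the matching platform.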

Page 9:

How does ezPAARSE work?

URL in the log line:

http://www.sciencedirect.com:80/science/article/pii/S0009308416300159/pdfft?pid=1-s2.0-S0009308416300159-main.pdf HTTP/1.1

The URL is handled by the parser of the matching platform (Springer, Ovid, ScienceDirect, …), then by 16 middlewares, to produce access events.

Access event fields: date, host, login, mime, print_identifier, platform, publication_title, doi, geoip, unitid, session

Middlewares perform particular tasks, like Crossref enrichment:

filter

deduplicator

enhancer

crossref
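The idea of a middleware chain can be sketched in Python as a list of small functions, each of which either returns a (possibly enriched) event or drops it. This is an illustration only, not ezPAARSE's implementation: the function names, the 30-second deduplication window, the ignored file extensions, and the local `metadata_by_doi` dictionary (standing in for a Crossref lookup) are all assumptions.

```python
from datetime import datetime, timedelta


def make_filter(ignored_extensions=(".css", ".js", ".png", ".gif")):
    """Drop unsuccessful requests and static assets (assumed rule set)."""
    def filter_event(event):
        if event["status"] != 200:
            return None
        if event["url"].lower().endswith(ignored_extensions):
            return None
        return event
    return filter_event


def make_deduplicator(window_seconds=30):
    """Drop repeated requests for the same URL by the same login within a time window."""
    last_seen = {}

    def deduplicate(event):
        key = (event["login"], event["url"])
        previous = last_seen.get(key)
        last_seen[key] = event["date"]
        if previous is not None and event["date"] - previous <= timedelta(seconds=window_seconds):
            return None
        return event
    return deduplicate


def make_enricher(metadata_by_doi):
    """Add bibliographic fields from a local dict (stand-in for a Crossref query)."""
    def enrich(event):
        extra = metadata_by_doi.get(event.get("doi"), {})
        return {**event, **extra}
    return enrich


def run_pipeline(events, middlewares):
    """Pass every event through each middleware in order; None means 'dropped'."""
    kept = []
    for event in events:
        for middleware in middlewares:
            event = middleware(event)
            if event is None:
                break
        else:
            kept.append(event)
    return kept


# Hypothetical events and metadata, for illustration only
t0 = datetime(2018, 1, 1, 6, 53, 18)
events = [
    {"login": "u1", "url": "http://example.org/a.pdf", "status": 200, "date": t0, "doi": "10.1000/xyz"},
    {"login": "u1", "url": "http://example.org/a.pdf", "status": 200,
     "date": t0 + timedelta(seconds=5), "doi": "10.1000/xyz"},
    {"login": "u1", "url": "http://example.org/style.css", "status": 200, "date": t0, "doi": None},
    {"login": "u2", "url": "http://example.org/login", "status": 302, "date": t0, "doi": None},
]
metadata = {"10.1000/xyz": {"publication_title": "Example Journal"}}
kept = run_pipeline(events, [make_filter(), make_deduplicator(), make_enricher(metadata)])
print(len(kept), kept[0]["publication_title"])
```

Keeping each task in its own function means a task can be added, removed, or reordered without touching the others.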

Page 10:

Example of an ezPAARSE output

Text file (CSV or JSON format)

[Screenshot of an output file, with callouts: KBART fields, Geoip fields, deduplicated access events (COUNTER recommendation)]
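Because the output is a plain text file, it can be read with standard tools. The following Python sketch counts access events per platform from a CSV excerpt. The excerpt is hypothetical: the column selection, the platform codes, the DOI value, and the `;` delimiter are assumptions made for illustration and should be checked against a real output file.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of an output file; the ";" delimiter is an assumption
SAMPLE_OUTPUT = """date;login;platform;mime;doi
2018-01-01;a1;sd;PDF;10.1016/example.1
2018-01-01;b2;sd;HTML;
2018-01-01;a1;jstor;PDF;
"""


def read_events(text, delimiter=";"):
    """Parse CSV text with a header row into a list of dicts, one per access event."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def count_by(events, field):
    """Count access events for each distinct value of the given field."""
    return Counter(event[field] for event in events)


events = read_events(SAMPLE_OUTPUT)
per_platform = count_by(events, "platform")
print(per_platform.most_common())
```

The same `count_by` helper works for any column, for example `mime`, which is how simple dashboard figures can be derived from the file.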

Page 11:

Librarians and computer scientists collaborate to produce parsers

174 parsers (Springer, ScienceDirect, …)

http://analyses.ezpaarse.org

Page 12:

How does ezPAARSE work?

Maintaining ezPAARSE's recognition capacity and extending it to a new platform is collaborative work.

Librarians

http://analyses.ezpaarse.org

Trello account

Page 13:

How does ezPAARSE work?

Maintaining ezPAARSE's recognition capacity and extending it to a new platform is collaborative work.

Computer scientist

ARC

http://analyses.ezpaarse.org

Page 14:

ezPAARSE installation - usage

Very easy to use:

- 5 minutes for installation from GitHub

- 5 minutes for its configuration

- Fully automatable processing

- Fully automatable updates

- Web interface

- Command line

- Docker container available

Page 15:

ezPAARSE worldwide

• We target ~130 French institutions (mostly universities) that report using a reverse proxy

• 80 of them have explicitly expressed interest

• 40 have a proper logformat parameter defined and have tested at least one log sample

• 50 have installed ezPAARSE and use it on a regular basis

• ~120 instances installed outside France

• 60 in the USA

Page 16:

ezPAARSE / ezMESURE Ecosystem

Page 17:

What is ezMESURE?

[Diagram: access events files go into ezMESURE, which produces dashboards]

- Store

- Aggregate

- Compare

- Visualize

- Highlight

- …

Page 18:

Université de Lorraine - Pioneer times: Designing, installing, testing, strengthening a local ezPAARSE installation

Page 19:

Using ezPAARSE

• The key question: how to merge ezPAARSE output with valid local data such as the course of study, the research lab, and the user's status?

• I'll present how we did it at the Université de Lorraine, in 2 different ways (one currently active, the other we wish to implement more completely)

Page 20:

The context, the goals

Context: how we became a pilot institution

● University born from a merger in 2012, but we have used EZproxy since 2009

● All accesses (remote AND local) go through the reverse proxy

● Geographical proximity to Inist, making collaboration easier

Goals

● Supplement the publisher statistics with data on non-COUNTER platforms

● Deepen publisher statistics with user profile data

● Use the data and indicators produced as a steering tool (collection policy, service delivery)

Resources

● 1 librarian, 1 technician (both part-time)

Page 21:

Brief chronology (UL – Inist – Couperin)

2012

2013

2014

Product vision

Version 0.1

First experiments at U. de Lorraine

First dashboards!

Consolidation of the UL installation

Results files accessed through a web interface

2015-2017: AGIMUS project

New way to characterize the events.

Page 22:

The main installation framework since 2014

[Workflow diagram, summarized:]

- The user logs into EZproxy through the university authentication system (LDAP + CAS) and a CGI script (local user database)

- Raw EZproxy logs are enriched by the CGI script, anonymised, and secured for later use

- ezPAARSE processes these logs into ezpaarse-results

- Output (anonymised, enriched): standard ezPAARSE output + local fields

- Post-processing and dashboards

Page 23:

Select local relevant fields

Available data...

- User affiliation

- User status (« Business Category »)

- « ETAPE » code (national course of study reference code)

... are translated with the help of static lists during the post-processing phase (Visokio):

- Possible affiliations list

- BC list

- ETAPE repository

Early contacts with IT teams have given us regular access to the BC and affiliation lists (and their updates).
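Translating raw codes with static lists amounts to a dictionary lookup per field. The Python sketch below shows the principle only. The slides say this step is done in Visokio, so the code is a swapped-in illustration, and every code, label, and name in it (`BC_LIST`, `AFFILIATION_LIST`, `translate`, the field names) is hypothetical.

```python
# Hypothetical static lists; the real codes and labels come from the IT teams
BC_LIST = {"E1": "Undergraduate student", "R1": "Researcher"}
AFFILIATION_LIST = {"LAB042": "Example research lab"}


def translate(event, field, static_list, default="Unknown"):
    """Return a copy of the event with a readable '<field>_label' added."""
    label = static_list.get(event.get(field), default)
    return {**event, field + "_label": label}


# One access event carrying raw local codes (made-up values)
event = {"platform": "sd", "business_category": "R1", "affiliation": "LAB999"}
event = translate(event, "business_category", BC_LIST)
event = translate(event, "affiliation", AFFILIATION_LIST)
print(event["business_category_label"], "/", event["affiliation_label"])
```

Falling back to a default label rather than failing keeps events whose code is missing from an outdated list, which is why regular list updates matter.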

Page 24:

Standard vs. local fields

[Screenshot of an output file, with callouts: standard ezPAARSE fields, user-related fields]

Page 25:

Access to the output files

A server, accessible internally through a web-based interface, lets us retrieve daily ezPAARSE outputs and monthly concatenations.

1 line is 1 consultation event (EC).

With TOC (tables of contents) and ABS (abstracts) excluded:

2014: more than 1 700 000 ECs
2015: more than 2 000 000 ECs
2016: more than 3 000 000 ECs
2017: more than 4 500 000 ECs
...

Size of the output of an active month once compressed: 12 Mb in 2014, 50 Mb in 2017.

Page 26:

Post-processing (1): Research and academic units

Research: 62 possible research units in 10 « Poles »

A2F : Agronomie, agroalimentaire, forêt

BMS : Biologie, médecine, santé

CPM : Chimie et physique moléculaires

M4 : Matière, matériaux, métallurgie, mécanique

TELL : Temps, espaces, lettres, langues

AM2I : Automatique, mathématiques, informatique et leurs interactions

CLCS : Connaissance, langage, communication, sociétés

EMPP : Énergie, mécanique, procédés, produits

OTELo : Observatoire Terre et environnement de Lorraine

SJPEG : Sciences Juridiques, Politiques, Économiques et de Gestion

Academic departments and « Collegiums »

Arts, lettres et langues (ALL)

Droit, économie, gestion (DEG)

Lorraine – INP (écoles d'ingénieurs)

Lorraine Management Innovation (LMI)

Interface

Santé

Sciences et technologies

Sciences humaines et sociales (SHS)

Technologie

There are more than 1000 possible affiliations. Each patron can have up to 4 affiliations, and very often has 2.

For every CE (consultation event) we take the first affiliation of each kind: Teaching, Research, Administration.

We add extra information regarding each of those affiliations (the « real » name, the group in the organisation chart...).

The « teaching » affiliations are attached to their respective collegiums and the « research » ones to their poles.
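The rule "first affiliation of each kind" can be written down as a short Python sketch. It is an illustration under assumptions, not the actual post-processing: the affiliation codes, the unit names, and the shape of the `directory` lookup are made up, while the group values reuse pole and collegium abbreviations from this slide.

```python
KINDS = ("teaching", "research", "administration")


def first_affiliation_per_kind(codes, directory):
    """Keep, for each kind, the first affiliation found in the patron's ordered list."""
    selected = {}
    for code in codes:
        entry = directory.get(code)
        if entry is None:
            continue  # unknown code: skip it rather than guess
        kind = entry["kind"]
        if kind in KINDS and kind not in selected:
            # Attach the extra information (readable name, group) to the code
            selected[kind] = {"code": code, **entry}
    return selected


# Hypothetical directory: code -> kind, readable name, group (collegium or pole)
directory = {
    "DEP-LAW": {"kind": "teaching", "name": "Example law department", "group": "DEG"},
    "DEP-ECO": {"kind": "teaching", "name": "Example economics department", "group": "DEG"},
    "LAB-MAT": {"kind": "research", "name": "Example materials lab", "group": "M4"},
    "ADM-LIB": {"kind": "administration", "name": "Example library service", "group": "Administration"},
}

# A patron with 4 affiliations, in the order they appear in the log
codes = ["DEP-LAW", "LAB-MAT", "DEP-ECO", "ADM-LIB"]
selected = first_affiliation_per_kind(codes, directory)
print({kind: value["code"] for kind, value in selected.items()})
```

With this rule each consultation event carries at most three affiliations, one per kind, regardless of how many the patron has.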

Page 27:

Post-processing (2): Business categories

Page 28:

Local data (3): « ETAPE » codes

The « ETAPE » codes are a national repository covering every course of study and year in France.

Once obtained, this information tells us the level (year) of the students.

Page 29:

Ex. 1 : Academic departments in Law and Economy

Page 30:

Ex.2 : Research lab in materials science

Page 31:

Participation in AnalogIST

Page 32:

Université de Lorraine - Interactive dashboards: ezMESURE, AGIMUS (2016 and beyond)

[Image credit: xkcd.com]

Page 33:

ezPAARSE and AGIMUS

Agimus is a national project used by IT teams in some universities. It works very similarly to ezPAARSE, but for every other electronic service (intranet, Wi-Fi access, Moodle…)

Page 34:

Past (2014-2017): The main installation framework (quick reminder)

[Workflow diagram, summarized:]

- EZproxy and CGI script (local user database, LDAP)

- Raw EZproxy logs are enriched by the CGI script, anonymised, and secured for later use

- ezPAARSE produces ezpaarse-results

- Output (anonymised, enriched), with standard ezPAARSE fields and user-related fields

- Post-processing and dashboards

ezPAARSE output:

- every ezPAARSE field

- Business category (user category)

- Affiliation

- ETAPE (national student course of study reference number)

- encrypted logins

EZproxy logs are enriched directly on the EZproxy server using a custom script, which is also used for managing the different authentication levels (e.g. local access vs. local + remote).

ezPAARSE output files are analysed and post-processed by T. Jouneau with Visokio Omniscope, and dashboards are produced.
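The slides do not say how logins are encrypted or anonymised. One common option, shown here purely as an assumption, is to replace each login with a keyed hash, so that the same user always maps to the same pseudonym while the login itself never appears in the output. The function name `pseudonymise_login` and the sample key are hypothetical.

```python
import hashlib
import hmac


def pseudonymise_login(login, secret_key):
    """Return a stable pseudonym for a login using HMAC-SHA256 (assumed scheme)."""
    digest = hmac.new(secret_key, login.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The key must stay private and unchanged, or pseudonyms stop being comparable
secret_key = b"replace-with-a-private-random-key"
alias_first = pseudonymise_login("user42", secret_key)
alias_again = pseudonymise_login("user42", secret_key)
alias_other = pseudonymise_login("user43", secret_key)
print(alias_first == alias_again, alias_first == alias_other)
```

A keyed hash still allows counting distinct users and following a user across events, which plain deletion of the login would not. It is pseudonymisation rather than full anonymisation, since anyone holding the key can recompute the mapping.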

Page 35:

Present (2017): Workflow EZproxy → ezPAARSE → Agimus → ezMESURE

[Workflow diagram, summarized:]

- EZproxy and CGI script (local user database, LDAP)

- Raw EZproxy logs are enriched by the CGI script, anonymised, and secured for later use

- ezPAARSE produces ezpaarse-results, with 2 distinct outputs

- Post-processing and dashboards

Traditional ezPAARSE output:

- every ezPAARSE field

- Business category (user category)

- Affiliation

- ETAPE (national student course of study reference number)

- encrypted logins

Unencrypted ezPAARSE output:

- every ezPAARSE field

- no user data except logins (not encrypted)

EZproxy logs are enriched directly on the EZproxy server using a custom script, which is also used for managing the different authentication levels (e.g. local access vs. local + remote).

ezPAARSE output files are analysed and post-processed by T. Jouneau with Visokio Omniscope, and dashboards are produced.

Simultaneously, output files containing only the unencrypted login are sent to Agimus, which enriches them with LDAP data based on the login before sending them to ezMESURE.

Page 36:

Future (2018?): Redesigning the workflow

[Workflow diagram, summarized: EZproxy → EZproxy logs enriched by Agimus, anonymised, and secured for later use → ezPAARSE (frequent reparsings) → ezpaarse-results]

No more CGI script. AGIMUS brings the local fields directly into the EZproxy logs, which are then secured for later use and reparsings.

Agimus no longer sends data to ezMESURE. We have a « natural » ezPAARSE > ezMESURE workflow.

While Visokio may still be used for some time, it will realistically be phased out eventually.

Page 37:

Ex. 3 (ezMESURE) : platform profiles

Page 38:

Ex.4 (ezMESURE) : JSTOR profile

Page 39:

Ex.5 (ezMESURE) : Academic departments

Page 40:

Ex.6 (ezMESURE) : Academic departments in Sciences

Page 41:

Ex. 7 (ezMESURE): Research labs and poles

Page 42:

More information

http://ezpaarse.org

http://analyses.ezpaarse.org

https://ezmesure.couperin.org

To contact us:

https://twitter.com/[email protected]

To collaborate:

https://github.com/ezpaarse-project/*

Live demo

Page 43:

Thank you

Dominique Lechaudel
INIST-CNRS
[email protected]

Thomas Jouneau
UNIVERSITÉ DE LORRAINE
[email protected]