UKSG Conference 2016 Breakout Session - Who’s reading your valuable content and did they really...
UKSG PRESENTATION PUBLISHER SOLUTIONS INTERNATIONAL, KEITH ABBOTT AND CHARLIE WHITE
APRIL 2016, BOURNEMOUTH
Publisher Solutions International
• Established in 2005
• Initial focus on the identification, case development, and remediation efforts relating to subscription abuse.
• Specifically created to serve as an independent third party enabling the STM publishing industry to benefit from the aggregation and analysis of confidential data without competitive or anti-trust concerns.
Transition to IP Address Verification Work
• PSI customers asked us to expand fraud identification work to include IP Address/Site License business.
• As part of this effort, PSI and Wiley created JV to conduct global clean-up of IP address data.
• Proprietary database of >50k institutions & >1 billion IPs
• Data from 150+ publishers
• US data (last significant territory) to be completed by March 2016
Key Takeaway
• State of IP Address data and management of same within the STM industry is poor.
• ca. 58% of IP Address data requires further investigation, e.g.:

Territory    Lines of Data    Red    Amber    Green
France              63,071     4%      58%      38%
Germany             58,145     5%      43%      52%
China              149,435     1%      64%      35%
Avg/Total          270,651     3%      58%      39%
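The Avg/Total row can be checked with a few lines of arithmetic. The sketch below assumes the averages are weighted by each territory's lines of data (the slide does not say how they were computed), and the helper name `weighted_share` is illustrative only.

```python
# Rows as transcribed from the slide:
# (territory, lines of data, red %, amber %, green %)
rows = [
    ("France", 63_071, 4, 58, 38),
    ("Germany", 58_145, 5, 43, 52),
    ("China", 149_435, 1, 64, 35),
]

total_lines = sum(row[1] for row in rows)


def weighted_share(column: int) -> float:
    """Average one percentage column, weighting each territory by its line count."""
    return sum(row[1] * row[column] for row in rows) / total_lines


red, amber, green = (weighted_share(c) for c in (2, 3, 4))
print(f"{total_lines:,} lines: red {red:.1f}%, amber {amber:.1f}%, green {green:.1f}%")
```

Under that weighting assumption, the rounded results agree with the 3% / 58% / 39% shown in the Avg/Total row.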
Takeaways from completion of IP Address Clean Up
• Publisher and even library data is universally poor.
• Poor IP Address data extends far beyond the initial expectation that problems would be primarily attributed to fraud.
• Neither publishers nor libraries are equipped to address the problem and maintain a long-term solution.
• Resource requirements for publishers and libraries alike are overwhelming for current systems, processes, and budgets, even at existing levels of inaccuracy.
• Keeping IP Address data clean provides no competitive advantage, but not doing so presents significant risk on many levels.
Associated Risks/Problems
• Easy to insert false IP addresses into systems with no inherent checks
• Wrong IP addresses on accounts result in false usage reporting
• Incorrect usage reporting carries significant implications for pricing and widely used marketing metrics across industry
• Fraud can go undetected for years
• IP data errors create “openings” for illegal proxy/downloading
• Open Access publishers have little or no idea where usage is coming from
• Data gets dirty as fast as it is cleaned
CURRENT STATE: UNVETTED IP ADDRESS CHANGES/ADDITIONS
(Largely Manual Data Entry)
[Diagram: Institutions A, B and C linked to Publishers 1, 2, 3 and 4]
Institutions    Changes    Publishers    Unvetted Changes
70K                   1          5.5K    385M Annual
70K                   5          5.5K    1.93B Annual
70K                  10          5.5K    3.85B Annual
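The figures in this table follow from multiplying the three input columns. A minimal sketch, assuming the worst case the slide implies, in which every institution sends each of its changes to every publisher (the function name `unvetted_changes` is illustrative):

```python
def unvetted_changes(institutions: int, changes_per_year: int, publishers: int) -> int:
    """Each change at each institution is re-keyed separately by each publisher."""
    return institutions * changes_per_year * publishers


INSTITUTIONS = 70_000
PUBLISHERS = 5_500

for changes in (1, 5, 10):
    total = unvetted_changes(INSTITUTIONS, changes, PUBLISHERS)
    print(f"{changes:>2} change(s) per institution per year -> {total:,} unvetted changes")
```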
IP Address Segmentation at Wiley
Brief Introduction
• Keith Abbott, 25 years in industry from a journals fulfilment background
• Current emphasis is on content licensing and underlying data supporting access to content
• Team of two people checking licenses and IP address data
• My focus is on IP address data issues confronting industry
• Working with PSI for eleven years to audit IP addresses
Print was difficult to handle
Is online access any better?
• University of Kiel
• GEOMAR
• IPN
• ZBW (Kiel)
• ZBW (Hamburg)
• Christian Albrechts Universität zu Kiel
• UKSH
• Helmholtz-Zentrum für Ozeanforschung Kiel
• German National Library of Economics
• University Hospital Schleswig Holstein (Kiel)
• Institut für die Pädagogik der Naturwissenschaften und Mathematik an der Universität Kiel
• HWWA
• And they all have the same IP address range
• 134.245.*.*
Getting Better – we have got it down to six!
• University of Kiel
• University Hospital Schleswig Holstein (Kiel)
• GEOMAR
• IPN
• German National Library of Economics (Hamburg)
• German National Library of Economics (Kiel)
• But they are all still sharing the same IP address
• 134.245.*.*
IP addresses must be split out per location
University Hospital Schleswig Holstein (Kiel): 134.245.121-255.*
German National Library of Economics (Kiel): 134.245.101-110.*
GEOMAR: 134.245.1-50.*
IPN: 134.245.51-60.*
German National Library of Economics (Hamburg): 134.245.110-120.*
University of Kiel: 134.245.61-100.*
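The split above can be expressed as a small lookup table that also checks itself for overlapping segments. This is only a sketch: the ranges are copied as transcribed from the slide, and the names `SEGMENTS`, `owners` and `overlaps` are illustrative, not part of any PSI or Wiley system.

```python
import ipaddress
from itertools import combinations

# Third-octet ranges under 134.245.*.* as transcribed from the slide.
SEGMENTS = {
    "GEOMAR": (1, 50),
    "IPN": (51, 60),
    "University of Kiel": (61, 100),
    "German National Library of Economics (Kiel)": (101, 110),
    "German National Library of Economics (Hamburg)": (110, 120),
    "University Hospital Schleswig Holstein (Kiel)": (121, 255),
}


def owners(ip: str) -> list[str]:
    """Return every institution whose segment contains the given address."""
    first, second, third, _ = ipaddress.IPv4Address(ip).packed  # four octets as ints
    if (first, second) != (134, 245):
        return []
    return [name for name, (lo, hi) in SEGMENTS.items() if lo <= third <= hi]


def overlaps() -> list[tuple[str, str]]:
    """Return pairs of institutions whose third-octet ranges intersect."""
    return [
        (a, b)
        for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in combinations(SEGMENTS.items(), 2)
        if a_lo <= b_hi and b_lo <= a_hi
    ]


print(owners("134.245.75.10"))
print(overlaps())
```

As transcribed, third octet 110 appears in both German National Library of Economics ranges, so `overlaps()` reports that pair. Whether this reflects the real allocation or a slip on the slide cannot be told from the transcript, but it is the kind of ambiguity an automated check surfaces.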
What can we learn from this example?
• Data is complex and confusing, with multiple names, acronyms and English/native language variants
• IP addresses in addition to database accounts must be accurately segmented
• Failure to maintain correct IP address information could lead to access being inappropriately shared or customers losing access
• A publisher must check their underlying data matches their license agreements
• Bad IP address data will lead to incorrect usage statistics
Introduction
• Charlie White, Senior Customer Service Advisor.
• Working on a day to day basis with Institutions, Individuals and Agents
• SAGE has been working with PSI on both Print and IP Fraud investigations for the past 7 years.
• I will be focusing on IP Fraud
What is IP Fraud?
• Fraud definition - wrongful or criminal deception intended to result in financial or personal gain.
• How is it achieved in Publishing? It starts with data.
• Publisher contacted by agent with a list of IP ranges for a mutual customer.
• Publisher trusts the IP ranges are correct and uploads them onto their system.
• Hidden in the customer’s genuine IPs is a range owned by the agent.
• Publisher has unknowingly opened all the customer’s content to the agent.
• Back to our definition. What does the agent gain?
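One way to catch a hidden range is to compare every submitted range against address blocks already verified for the customer before anything is uploaded. The sketch below is an assumption about how such a check might look, not a description of any publisher's system. The function name `unverified_ranges` is hypothetical and the addresses come from blocks reserved for documentation.

```python
import ipaddress


def unverified_ranges(submitted: list[str], verified_blocks: list[str]) -> list[str]:
    """Return submitted ranges that are not contained in any verified block."""
    verified = [ipaddress.ip_network(block) for block in verified_blocks]
    flagged = []
    for cidr in submitted:
        net = ipaddress.ip_network(cidr)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if not any(net.version == v.version and net.subnet_of(v) for v in verified):
            flagged.append(cidr)
    return flagged


# Hypothetical data: one verified block for the customer, and an agent's list
# in which the last range lies outside that block.
verified_for_customer = ["198.51.100.0/24"]
agent_submission = ["198.51.100.0/25", "198.51.100.128/26", "203.0.113.0/28"]

print(unverified_ranges(agent_submission, verified_for_customer))
```

Here only `203.0.113.0/28` is flagged for manual review, while the two ranges inside the verified block pass.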
Case Study
• “Agent X” was a large subscription agent based in Thailand.
• Agent investigated by PSI initially for Print Fraud, leading to publishers stopping all business with the agent.
• Agent X attempts to get around the ban.
• Despite many negotiations, Agent X fails to settle and their accounts are put on hold for good.
• PSI approached by a ‘whistleblower’ with information concerning the agent’s business practices.
• The agent is also involved in IP Fraud.
What can we do to prevent this?
• IP Audits.
• Stop ‘rogue’ Subscription Agents from placing orders with us.
• A greater understanding within the Industry as a whole of IP abuse and the importance of keeping accurate and up-to-date information.
Moving Forward
• Industry needs a practical, economically viable, and effective solution for managing IP Address data and enabling publishers to gain a better understanding of who their customers are:
  – Institutions accessing data
  – Potential customers visiting publishers
  – Authors contributing to publishers
On-line IP Register
[Diagram: user types and the actions available to them]
User types: Unrestricted internet users; Institution; Registered Institution; Publisher or Agent; Registered Publisher or Agent
Actions: Basic lookup; Request to be added to DB; Register themselves; Detailed lookup of their data; Request change to their data; Detailed lookup of any data; Request to register themselves; Request to add an institution; Request change to any data; All requests
LONGTERM SOLUTION: CENTRALIZED IP ADDRESS REGISTRY
• Create a global IP address database for all Publishers to use and establish long-term industry standard
• Clean up all publisher authentication databases
• Verify all new IP additions and changes
• Check Publisher Log Files against IP database for abuse detection and usage anomalies
• Enlist support from library community to keep the IP address database current and accurate
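Checking log files against a registry amounts to attributing each logged address to a registered institution and counting whatever is left over. A minimal sketch, assuming a registry keyed by institution name with CIDR ranges. The registry contents, the `UNREGISTERED` label and the function name `attribute_usage` are all hypothetical.

```python
import ipaddress
from collections import Counter

# Hypothetical registry using documentation-only address blocks.
REGISTRY = {
    "Institution A": ["198.51.100.0/25"],
    "Institution B": ["198.51.100.128/25"],
}


def attribute_usage(log_ips: list[str], registry: dict[str, list[str]]) -> Counter:
    """Count requests per registered institution; unmatched IPs are counted separately."""
    networks = [
        (name, ipaddress.ip_network(cidr))
        for name, cidrs in registry.items()
        for cidr in cidrs
    ]
    usage = Counter()
    for ip in log_ips:
        addr = ipaddress.ip_address(ip)
        owner = next((name for name, net in networks if addr in net), "UNREGISTERED")
        usage[owner] += 1
    return usage


log = ["198.51.100.10", "198.51.100.200", "203.0.113.7", "198.51.100.11"]
print(attribute_usage(log, REGISTRY))
```

A high `UNREGISTERED` count, or usage attributed to an institution that holds no license, would be the kind of anomaly worth investigating.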
PSI-PROACTIVE/CENTRALIZED VETTED IP ADDRESS VALIDATION
(Largely Automated Process)
[Diagram: an Institution (e.g. University of Oxford) with “New IP address” and “Delete IP address” requests; “PSI Verify IP”; PSI “Cube” IP Registry; Publishers 1 to 4, each labelled “API/unique IP”]
Institutions    Changes    Publishers    Transactions
70K                   1          5.5K    70K
70K                   5          5.5K    350K
70K                  10          5.5K    700K
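With a central registry, each change is submitted once instead of once per publisher, so the transaction count no longer depends on the number of publishers. The sketch below reproduces the table's arithmetic under that assumption (the function names are illustrative):

```python
def centralized_transactions(institutions: int, changes_per_year: int) -> int:
    """Each change is submitted once to the central registry."""
    return institutions * changes_per_year


def direct_updates(institutions: int, changes_per_year: int, publishers: int) -> int:
    """Without a registry, each change goes separately to every publisher."""
    return institutions * changes_per_year * publishers


INSTITUTIONS = 70_000
PUBLISHERS = 5_500

for changes in (1, 5, 10):
    central = centralized_transactions(INSTITUTIONS, changes)
    direct = direct_updates(INSTITUTIONS, changes, PUBLISHERS)
    print(f"{changes:>2} change(s): {central:,} transactions, {direct // central:,}x fewer than direct updates")
```

The reduction factor equals the number of publishers (5,500 in the slide's figures), whatever the number of changes per institution.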
Any Questions?
Andrew Pitts: [email protected]
Charlie White: [email protected]
Keith Abbott: [email protected]