Funding Discovery in PURE - A Proof of Concept - …...-Pick and Match work inefficient / labour...

James Toon | Pure Customer Consultant | Elsevier Email: [email protected]

Dr. Stephen Peuchen | Head of the UMCG Research Office, University Medical Center Groningen, University of Groningen Email: [email protected]

Funding Discovery in PURE - A Proof of Concept - PoC RIGHT ON

October 11, 2017



Overview

• Introduction + purpose

• Context for the trial (i.e. Pure components)

• PoC Design, results and conclusions

• Future recommendations

• Acknowledgements


Purpose of trial

“The goal of the proof of concept is to establish a sufficiently accurate match between the profile of a principal investigator and upcoming Funding Opportunities.”


Background context

Section 1


1 - The Pure Funding Discovery Module

Goal - To create a module that supports researchers and administrators in their quest to find funding opportunities.


1 - The Pure Funding Discovery Module

• Released 4.19

• Companion product to award management module

• Allows setup of multiple funding profiles for each researcher

• Researchers can browse results, bookmark, share or reject.

• Researchers able to start application via ‘one click process’

• Administrators can assist researchers with setting up and tuning profiles as required

• Administrators can view profile activity – track module usage


2 - Fingerprint Engine - ‘Structured representation of unstructured data’

“mines the text of scientific documents – publication abstracts, funding announcements and awards, project summaries, patents, proposals/applications, and other sources – to create an index of weighted terms which defines the text, known as a Fingerprint™ visualization.”
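As a rough illustration of what "an index of weighted terms" can look like, the sketch below counts thesaurus terms in a text and normalises the counts into weights. It is not the Elsevier Fingerprint Engine; the function name `toy_fingerprint` and the tiny `THESAURUS` set are assumptions made for this example only.

```python
import re
from collections import Counter

# Hypothetical mini-thesaurus for illustration; not a real FPE thesaurus.
THESAURUS = {"ion channel", "liposome", "membrane", "protein"}


def toy_fingerprint(text: str) -> dict[str, float]:
    """Return the thesaurus terms found in `text`, weighted by relative frequency."""
    lowered = text.lower()
    counts: Counter[str] = Counter()
    for term in THESAURUS:
        # Count whole-word occurrences, allowing a simple plural "s".
        pattern = r"\b" + re.escape(term) + r"s?\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[term] = hits
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


abstract = (
    "Ion channels in the membrane were reconstituted in liposomes. "
    "Ion channel gating was measured."
)
fingerprint = toy_fingerprint(abstract)
```

Here the weights sum to 1, so the dominant concept of the text is simply the term with the largest weight.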


How does the Elsevier Fingerprint Engine work?


Fingerprinting in Pure

• Fingerprints produced on research publications, awards and equipment

- Publications only in the context of the PoC.

- English language; title and abstract required (title only not supported)

- WoS UT reference *cannot* be primary source ID

• Fingerprint profile produced via cron job(s) and interaction with FPE

- Content fingerprinting

- Person and Organisation aggregation

- Project aggregation

• Fingerprint settings to allow simple config for

- Period (i.e. years coverage)

- Thresholds (i.e. ranking minimum, concept max.)
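To make the period and threshold settings concrete, here is a minimal sketch of how such settings could prune a concept list. The class `FingerprintSettings`, its field names and its default values are hypothetical; they are not Pure's actual configuration keys or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FingerprintSettings:
    """Hypothetical settings object; field names are not Pure's configuration keys."""
    years_covered: int = 5      # period: how many years of output to include
    rank_minimum: float = 0.1   # threshold: drop concepts ranked below this value
    concept_max: int = 3        # threshold: keep at most this many concepts


def in_period(publication_year: int, current_year: int, settings: FingerprintSettings) -> bool:
    """True if the publication falls inside the configured period."""
    return current_year - publication_year < settings.years_covered


def apply_thresholds(concepts: dict[str, float], settings: FingerprintSettings) -> list[tuple[str, float]]:
    """Keep concepts at or above the ranking minimum, capped at the concept maximum."""
    kept = [(name, rank) for name, rank in concepts.items() if rank >= settings.rank_minimum]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[: settings.concept_max]


settings = FingerprintSettings()
profile = apply_thresholds(
    {"Ion Channels": 0.9, "Liposomes": 0.6, "Plants": 0.05, "Membranes": 0.4, "Proteins": 0.2},
    settings,
)
```

With these illustrative values, "Plants" falls below the ranking minimum and "Proteins" is cut by the concept maximum.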


3 - SciVal Funding (soon to be Funding Institutional)

• Information about grant opportunities and award recipients

• Coverage includes Australia, Canada, European Commission, India, Ireland, New Zealand, Singapore, South Africa, United Kingdom, and United States.

• Service indexes funding content from over 3,500 international sponsors, providing c. 24,000 active opportunities (as of 4th Oct)

• Comprehensive, accurate, and current grant data is captured directly from the sponsor websites.

• Content improvement program underway


Funding Content

Content about funders, funding opportunities they offer and grants they awarded

The data is curated and further enriched with:

• research discipline classification

• disambiguated researcher names for awarded grants

• disambiguated affiliation names for awarded grants

Statistics on funding opportunities, funder profiles and awarded grants (as of September 2017)

                                      Active opportunities   Awarded grants
                                      (currently active)     (2009 – now)

Top funder types
  Government                          6,000                  3,000,000
  Foundations, societies & charities  11,000                 1,000,000
  Academic                            5,000                  83,000

Top funding categories
  Research grants                     7,500                  1,707,000
  Academic and training grants        7,000                  415,000
  Prizes                              5,000                  32,000

Top disciplines
  Social sciences                     9,500                  2,900,000
  Medicine                            7,500                  830,000
  Arts & humanities                   4,000                  450,000
  Engineering                         3,000                  300,000
  Biochemistry                        2,500                  135,000

Top funding countries
  USA                                 16,000                 3,400,000
  UK                                  2,500                  300,000
  Australia                           1,300                  190,000
  Canada                              1,100                  410,000


All UK Funders: 957

• Academic Institutions: 29%

• Charities and Non-Profits: 24%

• Professional Societies: 19%

• Foundations: 13%

• Corporations: 5%

• State Government: 4%

• Local Government: 4%

• International Organizations: 2%

UK Funders with active opportunities: 273

[Bar chart by funder type (same categories as above); axis 0 to 400; legend: All Funders]

Improvement programme - Content ‘white space analysis’ by region and assessment of data sources


Approach to setting up PoC (Pure)

Setup of trial data

• Set up sandbox for Groningen

• Extracted publications data from Scopus for named PI

• Created persons/organisations within Pure

• Created publications XML from Scopus to bulk upload

• Set up fingerprinting on output

• Aggregated fingerprints to person/organisations

Assumptions

• Only used data contained within Scopus for PI

• No change from default settings/thresholds for fingerprints

• Omitted additional signals from award/project data (considered out of scope)
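The step of turning extracted publication records into XML for bulk upload could look roughly like the sketch below. The element names (`publications`, `publication`, `title`, `abstract`, `author`), the record keys and the function `build_publications_xml` are all hypothetical; this is not Pure's actual import schema. Records without an abstract are skipped, since fingerprinting requires both title and abstract.

```python
import xml.etree.ElementTree as ET


def build_publications_xml(records: list[dict[str, str]]) -> str:
    """Serialise publication records to XML, skipping records that cannot be fingerprinted."""
    root = ET.Element("publications")  # hypothetical root element
    for record in records:
        if not record.get("title") or not record.get("abstract"):
            continue  # title and abstract are both needed for fingerprinting
        pub = ET.SubElement(root, "publication", id=record["scopus_id"])
        ET.SubElement(pub, "title").text = record["title"]
        ET.SubElement(pub, "abstract").text = record["abstract"]
        ET.SubElement(pub, "author").text = record["author"]
    return ET.tostring(root, encoding="unicode")


# Made-up records standing in for a Scopus export.
xml_text = build_publications_xml([
    {"scopus_id": "example-1", "title": "Gating of ion channels",
     "abstract": "We study gating.", "author": "PI One"},
    {"scopus_id": "example-2", "title": "Title only",
     "abstract": "", "author": "PI One"},
])
```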


PoC Design and Results

Section 2

Dr. Stephen Peuchen | Head of the UMCG Research Office

Disclosure: no personal or financial relationship with Elsevier B.V.

CURRICULUM VITAE

• B.S. Chemistry, USA

• Ph.D. Clinical Chemistry, MCV-VCU, VA, USA

• Postdoctoral Researcher, UCL, UK

• Head of UMCG Research Office, UMCG-NL

• Lecturer/tutor Master Programme Transfusion Medicine UMCG / UoG / Sanquin, The Netherlands

Profiles

Google: Peuchen RUG

LinkedIn: Stephen Peuchen

“Coincidence is logical.” Johan Cruyff



Institutional Policy Towards Grants

- Pick and Match work inefficient / labour intensive; i.e. comms of relevant Funding Opps.

- Bespoke solution was sought in Q1 2016 in collaboration with Idox plc. Competition has Funding DB in combination with Web Crawlers (since 2011).

GOAL: Increase Funding Footprint (magnitude and circumference)


Grants Life Cycle & PURE

Grants Life Cycle

• S0: Establishing Topics (lobby; out of scope)

• S1: Generate Idea

• S2: Find Funding

• S3: Develop Proposal

• S4: Submit Proposal

• S5: Manage Award

• S6: Share Research

Mapping to PURE

• Funding Disc. Module: Targeted Funding (S2)

• Applications (S3, S4)

• Awards (S5) - MTR

• Projects (S1 & S6)

• Award Management (AMM): S1 thru S5 – approvals – sign-offs


PoC Flowchart

[Flowchart elements: PURE Pubs; AI / Fingerprint Engine; (Semantic) Profile; Fine Tuning; SciVal Funding; Matched output in PURE; List FundingOpps per PI; Superimpose Filters; ACCEPT; REJECT; End]

Sample Selection:

- Purposive

- N=4; 2 Clinicians, 2 Biologists/Biochem.

- Selection Bias?

Domains:

- Neurobiology

- Psychiatry

- Paediatrics

- Cardiology
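The matching step in the flowchart, from a PI's semantic profile to a ranked list of funding opportunities, can be pictured with the sketch below. The slides do not state how Pure scores a match, so cosine similarity is used here purely as a stand-in; the function names, the `minimum_score` parameter and the sample data are all assumptions.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse concept-weight vectors."""
    dot = sum(weight * b[concept] for concept, weight in a.items() if concept in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_opportunities(
    profile: dict[str, float],
    opportunities: dict[str, dict[str, float]],
    minimum_score: float = 0.1,
) -> list[tuple[str, float]]:
    """Score every opportunity against the profile and return matches, best first."""
    scored = [(title, cosine_similarity(profile, fp)) for title, fp in opportunities.items()]
    matches = [item for item in scored if item[1] >= minimum_score]
    return sorted(matches, key=lambda item: item[1], reverse=True)


# Made-up profile and opportunity fingerprints for illustration.
pi_profile = {"Ion Channels": 0.8, "Liposomes": 0.6}
opportunities = {
    "Membrane transport research grant": {"Ion Channels": 0.6, "Membranes": 0.8},
    "Crop protection programme": {"Plants": 1.0},
    "Drug delivery fellowship": {"Liposomes": 1.0},
}
ranked = rank_opportunities(pi_profile, opportunities)
```

An opportunity that shares no concept with the profile scores zero and drops out of the list, which is the behaviour the ACCEPT / REJECT review then refines by hand.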


Semantic Profile Tuning (1/2)


Semantic Profile Tuning (2/2)

PI             PI1            PI2            PI3            PI4
               (No. S/hits)   (No. S/hits)   (No. S/hits)   (No. S/hits)
Def. Profile   87/757         74/474         191/482        66/2385
Adj. Profile   51/235         26/635         126/685        54/1138
Input (P)      24             193            211            371

Profile Issues:

- Paradoxical increase in No. of hits with decrease in No. of concepts.

- Operator adjustment time approx. 30 mins max. per S-profile

- Categorical adjustment only with what has been matched; no Booleans as yet, nor choice from e.g. MeSH.

- Significant No. of False Positive Concepts (unexplained; FPE/PURE mapping?)

Notes:

- N=23,724 hits w/o profile selected.

- Results shown with no further selection (filters etc.)

- Further refinements of semantic profiles made in 2nd iteration prior to testing.
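The "paradoxical increase" can be read straight off the table. The short check below lists the PIs whose hit count rose while their concept count fell. It assumes that each "S/hits" pair means number of semantic concepts / number of hits, which the slide implies but does not spell out.

```python
# (concepts, hits) per PI, copied from the table; the reading of "S/hits" is an assumption.
default_profile = {"PI1": (87, 757), "PI2": (74, 474), "PI3": (191, 482), "PI4": (66, 2385)}
adjusted_profile = {"PI1": (51, 235), "PI2": (26, 635), "PI3": (126, 685), "PI4": (54, 1138)}


def paradoxical_pis(
    default: dict[str, tuple[int, int]],
    adjusted: dict[str, tuple[int, int]],
) -> list[str]:
    """Return the PIs whose hit count rose although the concept count fell."""
    result = []
    for pi, (concepts_before, hits_before) in default.items():
        concepts_after, hits_after = adjusted[pi]
        if concepts_after < concepts_before and hits_after > hits_before:
            result.append(pi)
    return result


paradox = paradoxical_pis(default_profile, adjusted_profile)
```

On these figures the effect shows up for PI2 and PI3, while PI1 and PI4 behave as expected (fewer concepts, fewer hits).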


Unified Test Conditions & Scoring Sheet

• Refined Semantic Profile

• Funding Opportunity Types = Research Grants and Fellowships

• FILTER: Award Ceiling from 50,000 (USD, EUR, GBP etc.)

• FILTER: Deadline = + 6 months

• Filters result in substantial sequential decrease in number of matches.

Test Score: ACCEPT, or REJECT with REASON CODE.

Sundries: e.g. observations about skew or false negatives
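The sequential effect of the filters can be sketched as follows. The `Opportunity` class, the sample records and the reading of "Deadline = + 6 months" as "deadline within roughly the next six months" are assumptions for illustration; the slide does not define the filter semantics precisely.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Opportunity:
    """Hypothetical record; not the SciVal Funding data model."""
    title: str
    kind: str
    award_ceiling: int
    deadline: date


ALLOWED_KINDS = {"Research Grant", "Fellowship"}
MIN_AWARD_CEILING = 50_000
DEADLINE_WINDOW = timedelta(days=183)  # assumption: "+ 6 months" read as within about 6 months


def apply_filters(opps: list[Opportunity], today: date) -> tuple[list[Opportunity], list[tuple[str, int]]]:
    """Apply the filters one after another and record how many matches survive each step."""
    by_kind = [o for o in opps if o.kind in ALLOWED_KINDS]
    by_ceiling = [o for o in by_kind if o.award_ceiling >= MIN_AWARD_CEILING]
    by_deadline = [o for o in by_ceiling if today <= o.deadline <= today + DEADLINE_WINDOW]
    counts = [
        ("start", len(opps)),
        ("type", len(by_kind)),
        ("award ceiling", len(by_ceiling)),
        ("deadline", len(by_deadline)),
    ]
    return by_deadline, counts


# Made-up opportunities, each failing a different filter except the first.
sample = [
    Opportunity("Grant A", "Research Grant", 250_000, date(2018, 1, 15)),
    Opportunity("Prize B", "Prize", 100_000, date(2017, 12, 1)),
    Opportunity("Fellowship C", "Fellowship", 20_000, date(2017, 11, 30)),
    Opportunity("Grant D", "Research Grant", 500_000, date(2018, 9, 1)),
]
remaining, counts = apply_filters(sample, today=date(2017, 10, 11))
```

The recorded counts shrink at every step, mirroring the "substantial sequential decrease" noted on the slide.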


Test Results (1/2)

PI-X: 36 matches of which 18 (50%) accepted and 50% rejected.

Reject categories / reason codes:

86% off-topic = plant or military applications.

14% On target but concerning the direct application side of ion channels and liposomes, i.e. wrong angle

Other issues:

Skew towards US Grants (94%)

False negatives: H2020 / National Grants (NL), both career development grants and general consortium grants.


Test Results (2/2)

• PI-Y: 246 matches, of which 200 rejected (81%), 12 accepted (6%) and 34 matches for follow-up (14%).

• PI-Z: 198 matches, of which 8 accepted (4%). No reason codes given w.r.t. rejection – time constraints.

• PI-T: 103 matches assessed, of which 14 accepted (14%), 76 rejected (74%), 11 ineligible on follow-up (11%), 2 uncertain.
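The percentages on this slide are rounded. As a quick cross-check, the sketch below recomputes the shares from the reported counts for the two PIs with a full breakdown; the helper name `shares` is just for this example. Recomputing PI-Y's accepted share gives about 4.9%, slightly below the 6% quoted.

```python
# Outcome counts as reported on the slide.
results = {
    "PI-Y": {"accepted": 12, "rejected": 200, "follow-up": 34},
    "PI-T": {"accepted": 14, "rejected": 76, "ineligible": 11, "uncertain": 2},
}


def shares(outcomes: dict[str, int]) -> dict[str, float]:
    """Convert outcome counts to percentages of the total, rounded to one decimal."""
    total = sum(outcomes.values())
    return {name: round(100 * n / total, 1) for name, n in outcomes.items()}


pi_y = shares(results["PI-Y"])
pi_t = shares(results["PI-T"])
```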


Limitations

• Small sample size (n=4)

• Selection bias 1: mid-career scientists (issue of cold start or drift not assessed).

• Influence of ‘projects’ on semantic profile not assessed (only Scopus publications used).

• Selection bias 2: limited to the Life Science and Health domain; thus leaning mostly on MeSH thesaurus.

• Measurement bias: eligibility insufficiently examined due to incomplete source data and time constraints, leading to an over-estimate of accepted matches.

• Cross-sectional analysis at discrete time points concurrent with system changes, upgrades and downtime; no steady state for testing.

• Funding content heavily skewed towards US Grants, limiting its current use in the EU and other continents.


Conclusions and Next Steps

• The eco-system is viable and sufficiently specific.

• Number of accepted hits under test conditions is a manageable volume.

• Selection of Grants based on continent (e.g. EU vs North America) is essential.

• Development / incorporation of a Dutch Corpus, i.e. languages other than English, is essential.

• In its current state the FD module could serve as an add-on to the currently surveyed Funding Opps from multiple sources (public and private) but not as a replacement.

• Upscaling to a larger test (n=50 PIs?) after system improvements seems ‘logical’.


Acknowledgements

• Prof. Folkert Kuipers (Dept Paediatrics), UMCG

• Prof. Armagan Kocer (Dept. Neurosci), UMCG

• Prof. Pim van der Harst (Cardiology Dept.), UMCG

• Prof. Robert Schoevers (Psychiatry), UMCG

• Dr. Marijke Schreurs (Dept. Paediatrics), UMCG

• Dr. Irene Mateo (Cardiology Dept.), UMCG

• Dr. Anja Smykowski (Research BV), UMCG

• University of Groningen: Dr. Jules van Rooij, R&V, Liaison RUG, Medical Library and University Library Staff.


Future Developments

Section 3



Future Developments

1. Follow-up or bypass of current coverage limitations with manual upload of funding opps covering National and some EU grants to every default profile?

2. Custom filters and arithmetic on semantic concepts.

3. Easy clustering of PIs according to PURE Research Organisation Tree and career stage, and adjusting semantic profiles within the group.

4. Improvements in Pure UI: reporting/analytics for funding opportunities (number of views/shares, rejection reasons etc.) and improvements to semantic concept ‘tuning’ functionality for PI/administrative staff.

5. Moving beyond content filtering (e.g. similar opps) to user-input driven collaborative filtering. So how do we rate the product (relevance? success rate? etc.)? Who is rating the product? And what do they (the peers) recommend?

6. Trending Grant Topics, statistical views of its geographical use and its users, and matches with PIs and clusters of PIs (perhaps related to Topic Prominence work now in SciVal).
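To give a flavour of what "user-input driven collaborative filtering" (item 5) might mean in practice, the sketch below recommends opportunities to one PI from the accept/reject decisions of peers with similar taste. It is a toy user-based scheme under stated assumptions: ratings are +1 (accept) or -1 (reject), and all names and data are made up.

```python
def similarity(a: dict[str, int], b: dict[str, int]) -> float:
    """Average agreement between two raters over the opportunities both have scored."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[opp] * b[opp] for opp in shared) / len(shared)


def recommend(target: str, ratings: dict[str, dict[str, int]]) -> list[tuple[str, float]]:
    """Score opportunities the target has not rated, weighting peers by similarity."""
    scores: dict[str, float] = {}
    for peer, peer_ratings in ratings.items():
        if peer == target:
            continue
        weight = similarity(ratings[target], peer_ratings)
        for opp, rating in peer_ratings.items():
            if opp not in ratings[target]:
                scores[opp] = scores.get(opp, 0.0) + weight * rating
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# +1 = accepted, -1 = rejected; hypothetical PIs and opportunities.
ratings = {
    "PI-A": {"opp1": 1, "opp2": -1},
    "PI-B": {"opp1": 1, "opp2": -1, "opp3": 1},
    "PI-C": {"opp1": -1, "opp2": 1, "opp4": 1},
}
recommended = recommend("PI-A", ratings)
```

Here PI-B agrees with PI-A on everything they both rated, so PI-B's accepted "opp3" ranks first, while the accept from the dissimilar PI-C counts against "opp4".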


www.elsevier.com/research-intelligence

Thank you! James and Stephen
