VIVO Expert Finder Update

M. Conlon

June 10, 2014

http://vivo.ufl.edu/individual/mconlon

Why an Expert Finder?
• Collaboration is the future of scholarship
• Finding experts can be difficult
• Cost/benefit for development is changing with new tools
• We have the data

How: Architecture

The Data
• Papers -> Concepts, Papers -> People (Experts)
• Concepts on papers – PubMed and other indexing services associate concepts with papers
• People on papers as authors
• Associate people to concepts on which they have published

• At UF:
  • 25,878 people with papers
  • 18,520 concepts
  • 20,408 papers with concepts
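The association of people to concepts via papers can be sketched in Python. This is a minimal illustration, not the code behind the demos: the paper and author identifiers are made up, and the function name `people_by_concept` is hypothetical.

```python
from collections import defaultdict

# Toy data for illustration only; identifiers are invented.
paper_concepts = {
    "paper1": {"Tumor Suppressor Protein p53", "Apoptosis"},
    "paper2": {"Tumor Suppressor Protein p53"},
    "paper3": {"Apoptosis"},
}
paper_authors = {
    "paper1": {"author_a", "author_b"},
    "paper2": {"author_a"},
    "paper3": {"author_b"},
}


def people_by_concept(paper_concepts, paper_authors):
    """Return {concept: {person: number of papers by that person on the concept}}."""
    result = defaultdict(lambda: defaultdict(int))
    for paper, concepts in paper_concepts.items():
        # A paper without known authors contributes nothing.
        for author in paper_authors.get(paper, ()):
            for concept in concepts:
                result[concept][author] += 1
    # Convert nested defaultdicts to plain dicts for easy comparison and JSON output.
    return {concept: dict(people) for concept, people in result.items()}


experts = people_by_concept(paper_concepts, paper_authors)
print(experts["Tumor Suppressor Protein p53"])
```

Counting papers per person and concept, rather than only recording that a link exists, keeps the information needed later to rank or filter candidate experts.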

A Detour Through Histograms
• We’d like to see the data. We start with simple histograms
• Good for learning SPARQL, JSON, Ajax, D3
• Good for learning about our data

• Demo Ajax histogram for org types
• Demo JSON histogram for org types
• Concept counts and its SPARQL query
• Concepts per paper and its SPARQL query

http://mconlon17.github.io/angular/ajax_histogram_types.html

http://mconlon17.github.io/angular/json_histogram.html?data=org_types%2Ejson

http://mconlon17.github.io/angular/json_histogram.html?data=concept_counts%2Ejson

http://mconlon17.github.io/sparql/demo_concept_counts.txt

http://mconlon17.github.io/angular/json_histogram.html?data=concepts_per_paper%2Ejson

http://mconlon17.github.io/sparql/demo_concepts_per_paper.txt
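As a rough sketch of how a concepts-per-paper histogram could be pre-counted into JSON, consider the following Python. The JSON layout (a list of objects with `label` and `count` keys) is an assumption made for illustration; the format the linked histogram pages actually expect may differ. The data is invented.

```python
import json
from collections import Counter


def concepts_per_paper_histogram(paper_concepts):
    """Count how many papers carry 1, 2, 3, ... concepts."""
    counts = Counter(len(concepts) for concepts in paper_concepts.values())
    # One bar per distinct number of concepts, in ascending order.
    return [{"label": str(n), "count": counts[n]} for n in sorted(counts)]


# Toy data for illustration only.
paper_concepts = {
    "paper1": ["Tumor Suppressor Protein p53", "Apoptosis"],
    "paper2": ["Tumor Suppressor Protein p53"],
    "paper3": ["Apoptosis"],
    "paper4": ["Apoptosis", "Neoplasms", "Mutation"],
}

histogram = concepts_per_paper_histogram(paper_concepts)
histogram_json = json.dumps(histogram, indent=2)
print(histogram_json)
```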

Connect Concepts to People via Papers

Tumor Suppressor Protein p53

William Stratford May, Jr

“Strat” May has written two papers on Tumor Suppressor Protein, p53

http://vivo.ufl.edu/display/n9272944689






So we will need a lot of data
• All concepts, all people, all papers, connected = VIVO!
• But we need to have some pre-counted JSON data ready for web requests = a concordance
• So we use Python to process a SPARQL query concordance result set and create a JSON concordance

Oops. Too big. We need another approach
So we use Python, for each concept, to find co-occurring concepts and authors

Oops. Too big. We need another approach
So we use Python to subset the concordance and select “dense” connections

Now we’re ready

http://mconlon17.github.io/sparql/demo_concordance.txt

http://mconlon17.github.io/angular/concepts.py

http://mconlon17.github.io/angular/subset.py
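The two Python steps (build a concordance from query results, then subset it to dense connections) might look roughly like the sketch below. It is not the code in concepts.py or subset.py. It assumes the result set has already been parsed into `(paper, concept, author)` tuples, and the entry layout (`papers`, `concepts`, `authors`) and the `min_count` parameter are hypothetical choices made for this example.

```python
import json
from collections import defaultdict


def build_concordance(rows):
    """Build {concept: {papers, co-occurring concept counts, author counts}}."""
    papers_by_concept = defaultdict(set)
    concepts_by_paper = defaultdict(set)
    authors_by_paper = defaultdict(set)
    for paper, concept, author in rows:
        papers_by_concept[concept].add(paper)
        concepts_by_paper[paper].add(concept)
        authors_by_paper[paper].add(author)

    concordance = {}
    for concept, papers in papers_by_concept.items():
        co_concepts = defaultdict(int)
        authors = defaultdict(int)
        for paper in papers:
            for other in concepts_by_paper[paper]:
                if other != concept:
                    co_concepts[other] += 1  # papers shared by both concepts
            for author in authors_by_paper[paper]:
                authors[author] += 1  # papers by this author on the concept
        concordance[concept] = {
            "papers": len(papers),
            "concepts": dict(co_concepts),
            "authors": dict(authors),
        }
    return concordance


def subset_concordance(concordance, min_count=2):
    """Keep only 'dense' entries: every retained count is at least min_count."""
    kept = {c for c, entry in concordance.items() if entry["papers"] >= min_count}
    dense = {}
    for concept in kept:
        entry = concordance[concept]
        dense[concept] = {
            "papers": entry["papers"],
            "concepts": {o: n for o, n in entry["concepts"].items()
                         if n >= min_count and o in kept},
            "authors": {a: n for a, n in entry["authors"].items()
                        if n >= min_count},
        }
    return dense


# Toy rows for illustration only: (paper, concept, author).
rows = [
    ("p1", "A", "x"), ("p1", "B", "x"),
    ("p2", "A", "x"), ("p2", "B", "x"), ("p2", "A", "y"), ("p2", "B", "y"),
    ("p3", "C", "y"),
]
concordance = build_concordance(rows)
dense = subset_concordance(concordance, min_count=2)
print(json.dumps(dense, indent=2, sort_keys=True))
```

Subsetting after counting means the thresholds can be changed without rerunning the query, at the cost of holding the full concordance in memory once.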

What: bi-modal force-directed graph of concepts and people

The real thing
• The concordance subset process has trimmed the concordance globally: concepts must appear two or more times, co-occurring concepts must co-occur at least twice, and authors must have published at least two papers on the concept.
• For a specific concept, find all the co-occurring concepts, and for each concept, find all the people who have published on that concept. Produce a JSON file ready for display. See make_graph_json.py
• The resulting JSON is then shown using a force-directed bi-modal graph. See Demo

http://mconlon17.github.io/angular/make_graph_json.py

http://mconlon17.github.io/angular/bimodal.html
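A hedged sketch of turning a trimmed concordance into graph JSON follows. It is not make_graph_json.py. The concordance layout, the `make_graph` function, and the output shape (a `nodes` list plus `links` that refer to nodes by index) are assumptions for illustration, as is the choice to link the central concept directly to each co-occurring concept.

```python
import json


def make_graph(concordance, center):
    """Return nodes and links for a concept, its co-occurring concepts, and their authors."""
    nodes, links, index = [], [], {}

    def node_id(name, kind):
        # Reuse the node if it already exists, so shared authors appear once.
        key = (kind, name)
        if key not in index:
            index[key] = len(nodes)
            nodes.append({"name": name, "type": kind})
        return index[key]

    center_id = node_id(center, "concept")
    co_concepts = concordance[center]["concepts"]
    for concept in [center] + sorted(co_concepts):
        cid = node_id(concept, "concept")
        if concept != center:
            links.append({"source": center_id, "target": cid,
                          "value": co_concepts[concept]})
        authors = concordance.get(concept, {}).get("authors", {})
        for author, n_papers in sorted(authors.items()):
            links.append({"source": cid, "target": node_id(author, "person"),
                          "value": n_papers})
    return {"nodes": nodes, "links": links}


# Toy trimmed concordance for illustration only.
dense = {
    "A": {"papers": 3, "concepts": {"B": 2}, "authors": {"x": 2}},
    "B": {"papers": 2, "concepts": {"A": 2}, "authors": {"x": 2, "y": 2}},
}
graph = make_graph(dense, "A")
print(json.dumps(graph, indent=2))
```

Tagging each node with a `type` of `concept` or `person` is what lets a display treat the two kinds of node differently, which is the sense of "bi-modal" here.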

Next Steps
• Size concept nodes on number of papers
• Size author nodes in two rings: inner ring sized on the number of papers on the concept, outer ring on the total number of papers
• Weight connections on the number of papers in the connection
• Provide hover navigation: concepts to the concept in VIVO, people to their profile in VIVO, and links to a display of the papers comprising the connection (with links for each paper to VIVO)
• Provide user controls for trimming and positioning the network
• Provide a means for accessing all the concordance data without having to generate a new JSON file via Python
• Provide recentering: double-click on a concept or author to redraw the network around the selected concept or author
• Provide output for identifying people of interest (experts)

Repos
• Code: http://github.com/mconlon17/mconlon17.github.io
• Some VIVO Things web site: http://mconlon17.github.io
• Demo: http://mconlon17.github.io/angular/bimodal.html



