The Invisible Scientist

The Invisible Scientist Personal Digital Identity on the Web: Problems + Solutions Duncan Hull The University of Manchester Science Foo Camp 2009 The Googleplex Mountain View, California


Page 1: The Invisible Scientist

The Invisible Scientist

Personal Digital Identity on the Web: Problems + Solutions

Duncan Hull

The University of Manchester

Science Foo Camp 2009

The Googleplex, Mountain View, California

Page 2: The Invisible Scientist

The Invisible Scientist: Digital Identity

• I am not an identity or security expert but…

• Introduction: Personal Identity historically and currently

• The Problem:

– The way we identify scientists on the Web is inefficient and badly broken

– Which can make much of their work “invisible”

• Some solutions:

– URIs

– OpenIDs

– Contributor-ID (www.crossref.org)

• Conclusions + What might better digital identities allow?

~15 minutes of slides

~45 minutes for discussion

Page 3: The Invisible Scientist

Tools for sharing data on the Web < 10 yrs old

All these social software tools are reliant on digital identity of some form

http://tinyurl.com/myscience

These tools are good but…

Page 4: The Invisible Scientist

Unfortunately

• Many biomedical scientists don’t use these tools for serious work

– (if at all)

• Why?

• It’s complicated but…

Page 5: The Invisible Scientist

Scientific publishing has worked this way for centuries

• Publishing is the main (perhaps only) way of formally identifying people and their work

• “Publish or Perish”

Page 6: The Invisible Scientist

First published 1687, over 300 years old

Page 7: The Invisible Scientist

But do all these URIs identify the same person?

1. http://www.cs.bris.ac.uk/~gough/

2. http://en.wikipedia.org/wiki/Julian_Gough

3. http://twitter.com/SUPERFAMILY

4. http://www.juliangough.com/

5. http://www.linkedin.com/pub/julian-gough/b/25b/3b3

6. http://www.citeulike.org/tag/julian-gough

7. http://dblp.uni-trier.de/db/indices/a-tree/g/Gough:Julian.html

8. http://pubmed.gov/?Term=Julian+Gough[author]

9. http://pubmed.gov/?Term=Gough+J[author]

10. http://www.facebook.com/julian.gough

Julian Gough

Identity is different on the Web: We use URIs to identify people
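The two PubMed queries in the list above ("Julian Gough" and "Gough J") hint at why names alone are weak identifiers. As a minimal sketch, assuming a simple "surname plus first initial" key (the function names and the name "John Gough" are hypothetical, chosen only for illustration), several different name strings collapse onto the same key:

```python
from collections import defaultdict


def author_key(name: str) -> str:
    """Reduce a name to 'surname initial' (an assumed, simplified convention)."""
    name = name.strip()
    if "," in name:
        # "Gough, Julian" style: surname comes before the comma.
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        tokens = name.split()
        if len(tokens) > 1 and tokens[-1].isupper() and len(tokens[-1]) <= 2:
            # "Gough J" style: trailing initials.
            surname, given = " ".join(tokens[:-1]), tokens[-1]
        else:
            # "Julian Gough" style: surname is the last token.
            surname, given = tokens[-1], " ".join(tokens[:-1])
    return f"{surname.lower()} {given[:1].lower()}".strip()


def group_by_key(names):
    """Group name strings that share the same key."""
    groups = defaultdict(list)
    for name in names:
        groups[author_key(name)].append(name)
    return dict(groups)


# "John Gough" is a hypothetical second person, not taken from the slides.
groups = group_by_key(["Julian Gough", "Gough J", "Gough, Julian", "John Gough"])
print(groups)
```

All four strings land under the single key `gough j`, so a search on that key cannot tell whether it has found one person or two. A URI per person avoids this kind of collision.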

Page 8: The Invisible Scientist

Science is increasingly Digital

• Science is increasingly digital

– Not just digital publications in electronic journals…

– Wiki edits (e.g. Rfam and Pfam in Wikipedia, Robert Hoffmann’s WikiGenes)

– Software development, workflows

– Development of databases and ontologies - “data driven science” + “open data”

– blog posts

• Traditional journal publishing is often inadequate for sharing this kind of data and attributing it to individual people

– See “Defrosting the Digital Library” in PLoS Computational Biology for details

– Hull et al. (2008) http://pubmed.gov/18974831

• No good incentives to make digital contributions (besides traditional publishing)

• “Micro-attribution” - a large number of small contributions go unrewarded

Page 9: The Invisible Scientist

http://tinyurl.com/mistakenid

Page 10: The Invisible Scientist

Misattribution (part 2)

• “Forgotten Password”, “Already Registered”, “Please Login”, “Access Denied” are all recognised as “authors” in Google Scholar

http://tinyurl.com/phantom-user
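One way to picture the phantom-author problem is as a missing sanity check at indexing time. The sketch below is an assumption-laden illustration, not how Google Scholar works: it uses a hypothetical blocklist built only from the four phrases quoted on this slide.

```python
# Phrases taken from the slide; a real system would need a far longer list.
PHANTOM_PHRASES = {
    "forgotten password",
    "already registered",
    "please login",
    "access denied",
}


def is_phantom_author(name: str) -> bool:
    """Return True if an extracted 'author' looks like web-page boilerplate."""
    normalised = " ".join(name.lower().split())  # collapse case and whitespace
    return normalised in PHANTOM_PHRASES


extracted = ["Forgotten  Password", "Julian Gough", "ACCESS DENIED"]
real_authors = [name for name in extracted if not is_phantom_author(name)]
print(real_authors)
```

A blocklist only treats the symptom; the underlying issue is that author names are scraped from pages instead of being tied to an identifier.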

Page 11: The Invisible Scientist

Digital attribution

Neil Smalheiser and Vetle Torvik

Attribution would seem to be a simple process and yet it represents a major, unsolved problem for information science.

Author name disambiguation

Chapter published in Volume 43 (2009) of the Annual Review of Information Science and Technology (ARIST) (edited by B. Cronin), which is available from the publisher, Information Today, Inc.

http://www.hbs.edu/units/tom/seminars/2007/docs/Author%20Name%20Disambiguation.pdf

Page 12: The Invisible Scientist

Digital identity is currently a mess

• As well as identifying people with URIs, we also need:

– Attribution: Julian AuthorOf IncrediblyImportantThing

– Authentication: is Julian who he says he is? Or a fake?

– Authorisation: is Julian authorised to do stuff?

Currently done through a combination of username and password

http://tinyurl.com/too-many-passwords Simon Willison (The Guardian)

“The average user has [at least] 18 user accounts and 3.49 passwords”
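The attribution statement on this slide ("Julian AuthorOf IncrediblyImportantThing") has the shape of a subject-predicate-object triple. A minimal sketch, assuming a hypothetical `authorOf` predicate and a placeholder URI for the work (both under example.org, neither from the slides), with the person identified by one of the URIs listed earlier:

```python
from typing import NamedTuple


class Attribution(NamedTuple):
    subject: str    # URI identifying the person
    predicate: str  # URI identifying the relationship
    obj: str        # URI identifying the piece of work


# Hypothetical vocabulary and work URIs, used only for illustration.
AUTHOR_OF = "http://example.org/vocab/authorOf"
WORK = "http://example.org/IncrediblyImportantThing"

claims = [Attribution("http://www.cs.bris.ac.uk/~gough/", AUTHOR_OF, WORK)]


def works_by(person_uri: str, claims: list) -> list:
    """List every work a given person URI is claimed to have authored."""
    return [
        claim.obj
        for claim in claims
        if claim.subject == person_uri and claim.predicate == AUTHOR_OF
    ]


print(works_by("http://www.cs.bris.ac.uk/~gough/", claims))
```

The triple only records the claim. Whether the claim can be trusted is the separate authentication and authorisation question raised above.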

Page 13: The Invisible Scientist

Digital Identity Really Matters

• Digital Identity is a pre-requisite for

– Attribution …

– Contribution…

– Publication … to be recorded and quantified.

• Important decisions made on digital identity

– Hiring, funding, promotion, collaboration

– Selecting appropriate reviewers for grants and publications

– attributing published data in an increasingly web-based world

• This is the environment which social software / Web 2.0 operates in:

– Reliant on accurate and efficient digital identities

Page 14: The Invisible Scientist

What is myExperiment?

http://www.myexperiment.org

• Facebook for Scientists?

• Collaborative software for sharing and finding experimental protocols on the web

Page 15: The Invisible Scientist

Who is involved in myExperiment?

• Small team of developers (2-3 full time)

• 1500 users have uploaded 560 workflows, 150 files and 40 packs in 130 groups

Carole Goble

David De Roure

Page 16: The Invisible Scientist
Page 17: The Invisible Scientist

http://openid.net/

myExperiment uses OpenID to tackle Digital Identity and attribution

Page 18: The Invisible Scientist

OpenID is quickly becoming widespread

“42,235 sites are now enabled to accept OpenID logins”

Source: http://blog.janrain.com/2009/05/relying-party-stats-as-of-may-1-2009.html

Page 19: The Invisible Scientist

But there are usability “issues”

http://einstein.myopenid.com/

[email protected]

mcsquared

OR

Password handled by third-party OpenID provider

+84%

16%

Unless you hide it (e.g. Gmail, WordPress)
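Part of the usability problem is that an OpenID is a URL, and people type URLs inconsistently. The sketch below shows a simplified form of the clean-up a login form might apply before use; it is an assumption-based illustration that covers only URL-style identifiers (adding a default scheme, lower-casing the host, dropping any fragment) and leaves out the discovery and redirect steps a real OpenID library performs. The function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_openid_url(user_input: str) -> str:
    """Simplified clean-up of a user-typed OpenID URL (not a full implementation)."""
    identifier = user_input.strip()
    if "://" not in identifier:
        # People often omit the scheme, so assume plain http.
        identifier = "http://" + identifier
    parts = urlsplit(identifier)
    path = parts.path or "/"  # an empty path becomes "/"
    # Drop the fragment and lower-case the scheme and host.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


print(normalize_openid_url("einstein.myopenid.com"))
print(normalize_openid_url("http://einstein.myopenid.com/"))
```

Both inputs come out as `http://einstein.myopenid.com/`, so the two spellings are treated as the same identity.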

Page 20: The Invisible Scientist

Crossref solution: DOIs for people

• Crossref has solved a similar problem, identifying publications across different publishers, with “Digital Object Identifiers (DOIs)”

– DOI:10.1371/journal.pcbi.1000204

– http://dx.doi.org/10.1371/journal.pcbi.1000204

– They are working on something similar for people

• DOIs for scientists: “Contributor ID”

– Watch this space…
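The two forms of the DOI shown above differ only by a prefix: the resolver address in front of the bare identifier. A minimal sketch of that mapping, assuming the `http://dx.doi.org/` resolver used on this slide (the function name is hypothetical):

```python
RESOLVER = "http://dx.doi.org/"  # resolver address as shown on the slide


def doi_to_url(doi: str) -> str:
    """Turn 'DOI:10.xxxx/yyyy' or a bare '10.xxxx/yyyy' into a resolvable URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()  # drop the "DOI:" label
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return RESOLVER + doi


print(doi_to_url("DOI:10.1371/journal.pcbi.1000204"))
```

The same pattern, a stable identifier plus a resolver that redirects to wherever the thing currently lives, is what a "Contributor ID" for people would presumably borrow.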

Geoffrey Bilder

Page 21: The Invisible Scientist

Conclusions

• Digital Identity is broken (many biomedical scientists don’t realise)

– Important contributions are not properly attributed

– Misattribution can lead to invisibility

– This can discourage scientists from using the web more

• Fixing digital identities could make science more efficient

– Recognise digital contributions

– Motivate people to make non-publication contributions

• Technical problem mostly solved

• Discussion: The Good, The Bad and The Ugly Things identity might allow…

– Over to you!