The future of scientific information & communication

Post on 10-May-2015

Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand, and the pace of change does not appear to be slowing. While scientists now have access to the enormous capacity and capabilities of the internet, the vast majority of scientific communication continues to be through peer-reviewed scientific journals. A scientist’s contribution is measured primarily by their publication profile and the citations to their published works, which offers an incomplete view of their activities. However, we are at the beginning of a new revolution where the ability to communicate offers the opportunity to embrace new forms of publishing and where scientific participation and influence will be measured in new ways. This presentation will provide an overview of our new generation of “openness” in which open source, open standards, open access and open data are proliferating. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community and by facilitated collaboration, and will ultimately accelerate scientific progress.

Transcript of The future of scientific information & communication

The future of scientific information & communication

Antony Williams

SUNY Potsdam, April 12th 2013

How does the internet influence you?

• How many of you visit the internet/check your email less than a dozen times per day?
• Where do you go for fact-checking?
• How many on Facebook? How many on Twitter?
• You know you have an online profile right?
• Scientists…how many of you are working on building a scientific profile online?
• How many of you online now???

Me….and my vanity!

Searching Antony Williams

Searching ChemConnector…

http://re.vu/AntonyWilliams

Wikipedia: http://en.wikipedia.org/wiki/Antony_John_Williams

LinkedIn: http://www.linkedin.com/in/AntonyWilliams

Academia.edu

And Mendeley: http://www.mendeley.com/profiles/antony-williams/

And My Co-author Graph

And Videos

– YouTube
– SciVee
– Vimeo
– Slideshare

I am Quantified…

ResearchGate

Google Scholar Citations

LinkedIn

AltMetrics

Usage, Citations, Social Media…

Scientists are “Quantified”

• Stats are gathered and analyzed
• Employers can find them, tenure will depend on them, funding is affected by them
• Scientists Impact Factors, H-index and many other variants
• Science is both competitive and collaborative
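The H-index mentioned above has a simple definition: a scientist has index h if h of their papers have each been cited at least h times. A minimal sketch of that calculation follows; the function name and the citation counts are hypothetical and only illustrate the arithmetic, not how any particular profile service computes it.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for five papers.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))
```

A single number like this ignores data sets, blog posts, presentations and curation work, which is the gap the later slides on AltMetrics address.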

If it was not just about me…

• Together we might:
– build an encyclopedia
– …and rate restaurants
– …provide book reviews to each other
– …or movie reviews
– …or reviews of service providers
– …organize sit-ins and social action
– …and more data might just be Open
– …more scientists might collaborate and share

It is so difficult to navigate…

• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known Pathways?
• Working On Now?
• Connections to disease?
• Expressed in right cell type?
• Competitors?
• IP?

Let’s Change the World

• Let’s map together all historical chemistry data and build systems to integrate new data

• Heck, let’s integrate chemistry and biology data and add in disease data too

• Let’s model the data and see if we can extract new relationships – quantitative and qualitative

• Let’s make it all available on the web

That’s a BIG Request

What About Something Smaller?

• We’re going to map the world
• We’re going to take photos of as many places as we can and link them together
• We’ll let people annotate and curate the map
• Then let’s make it available free on the web
• We’ll make it available for decision making
• Put it on Mobile Devices, Give it Away

Where am I from?

Wikipedia

I care…I want to contribute…

The Power of Contribution

How do you spell Afonwen?

Whoa…

• So the world can be mapped…
• We can enter a 3D environment within the map
• We can add annotations
• We can use the data, we can reference it, we can extract it, we can make decisions with it
• And we can do it on our lap, in our hands
• Let’s crowdsource chemistry and biology!!!

Science is being Crowdsourced

• Crowdsourcing science is happening…
  – Contribution of data
    • Our data, About us
    • Our data, generated in labs
    • Open Data, data validation and curation
  – Contribution of software
    • Open Source, Open Standards
  – Contribution of funding

If we can map the planet…

• …then we should map the Galaxy!

GalaxyZoo

Various ways to contribute

Where Am I From?

What can be done with Big Data

Patients Like Me

I am a Chemist

Back to this….

• Let’s map together all historical chemistry data and build systems to integrate new data

• Heck, let’s integrate chemistry and biology data and add in disease data too

• Let’s model the data and see if we can extract new relationships – quantitative and qualitative

• Let’s make it all available on the web

How can I contribute to chemistry?

• Publish data, share data, validate and curate data
• Publish chemicals, syntheses and data
• “Publish” – Papers, Blogs, Reports, Tweets, Presentations, Videos
• Contribute to Wikipedia
• Participate in chemistry communities
• Contribute to the Big Data

About Me…as a Chemist

• I’ve performed a few dozen chemical syntheses
• I’ve run thousands of analytical spectra
• I’ve generated thousands of NMR assignments
• I’ve probably published <5% of all work
• Most of it has been lost
• But things can be different today….

Blog

• Opinions, procedures, observations, experiences

Presentations

Presentations, Videos, Report, Pre-publications

YouTube/Vimeo/SciVee

• Presentations are easy to turn into movies and publish to these services

• Literally “gives you a voice”

Data as a Publication

Data as a Publication?

http://figshare.com/articles/Prevalence_and_use_of_Twitter_among_scholars/104629

Contributing to the “Big Data” Maps

My Data Contributions…

Data & Curations to ChemSpider

• The Royal Society of Chemistry free database
• 28.5 million chemicals and growing daily
• Software interfaces to integrate to
• Amenable to community contribution
  – Deposit structures, property data, spectral data
  – Data annotation, validation and curation

• 3-year Innovative Medicines Initiative project

• Integrating chemistry and biology data using semantic web technologies

• Open source code, open data and open standards

• Academics, Pharma companies, Publishers….

The Publishers!?

(Some) Publishers are Changing?

• Data cannot be copyrighted and we have lots
• Scientists contribute data in document form
• Most publishers are open to Open Access

• Scientific publications are built on data so what can be done to release the data? Much data is not published? Many scientists will not share…

Publications - a summary of work

• Scientific publications are a summary of work
  – Is all work reported?
  – How much science is lost to pruning?
  – What of value sits in notebooks and is lost?

• How much data is lost?
  – How many compounds never reported?
  – How many syntheses fail or succeed?
  – How many characterization measurements?

Community Repository for Data

• Funding agencies encourage sharing of data
• Increasing availability of “Open Data”
• Institutional repositories have no specific domain support
• Why not develop a community repository for chemistry data – private, public, embargoed?
• Provides data to develop models/algorithms?

Chemical Database Service

• National Chemical Database Service for UK Academics

• Integrating Commercial Databases and Services

• Chemicals, analytical data, prediction algorithms

• Development of data repository

Model Building with Community Data

• Community data as a basis of model building
  – Consume data from available databases, community data, new publications and build predictive algorithms for the community
  – How many algorithms are reported and lost? How much repeat work is done in the domain of algorithmic development?

Pulling Data from our Archive

• Our contribution to the world of chemistry data
• DERA – digitally enabling the RSC archive
  – Text mining
    • Find chemicals, reactions, analytical data, properties
  – Algorithmic checking
    • Validate algorithmically what we can - robots
  – “Web 2.0 interfaces” for curating and validating

What if we could capture it all?

Digitally Enhancing the RSC Archive

Human Validation and Curation

Web 2.0 Contribution

• We have been contributing to the web for a long time already – but how much in chemistry?

• A few blogs, an increasing amount of tweeting but what about data sharing in chemistry?

The Old Way of Challenging

Challenging Science…

Collaboration towards completion

Detailed constructive dialog

Oxidation by Sodium Hydride?

The Blogosphere Analyzes…

How much is in the archives?

Open Notebook Science Analysis

Oxidation by Sodium Hydride?

What is Hexacyclinol?

The Blogosphere “Discusses”…

What is real, what is fake?

http://www.youtube.com/watch?v=hMpAoC-h5SA

Chemistry is Dangerous!

http://tinyurl.com/cl2awnj

Chemistry is Dangerous

• Florida DJs May Face Felony for April Fools' Water Joke Worse Than Rubio's

“… told their listeners that "dihydrogen monoxide" was coming out of the taps

throughout the Fort Myers area.”

www.dhmo.org

How do you recognize good vs bad?

Is this real?

Junk vs Real

“We then established a collaboration with professor Sum Ting Wong, a fugitive from the North Korean University Hu Yu Hai Ding”

“..identified as the new protein Wai So Dim”

What is real, what is fake?

Helping to change science

• Participation and contribution
• Immediacy of action
• Platforms for contribution
• Openness…whatever that is

Openness – Carries Licensing

• Openness may be hard..

• Open Access flavors
• Open Source licenses
• Open Data licenses
• Open Notebook Science

Getting Called Out in Public…

Rules for Licensing Data

Challenged in the Twittersphere

Annotating Articles Today…

Attribution to me…

Remember Quantifying Scientists

• Scientists Impact Factors. Science is both competitive and collaborative
• Can we measure ALL contributions to science?

Article-Level metrics are here

The Alt-Metrics Manifesto

• http://altmetrics.org/manifesto/

ImpactStory

Scientists AltMetrics

Detailed Usage Statistics

Usage, Citations, Social Media, Etc

• Persistent unique digital identifier
• Integrates to workflows such as manuscript and grant submission
• Supports automated linkages with your professional activities

Enabled by

Micropublishing

How much data is lost?

• How many reactions never get published?
• How much data could be shared?
• How many properties are measured and lost?
• What stands in the way of sharing?
  – Is it technology?
  – Permissions? “The Boss”, Licensing?

Micropublishing Syntheses

ChemSpider SyntheticPages

What is real, what is fake?

Profile

Interactive Data

Rewards and Recognition

• The badgesonomy culture of recognition is growing.

• Badges are commonplace
  – FourSquare
  – Klout

Rewards and Recognition

• Rewards and Recognition starting with CSSP, then expanding to other platforms

• Including paths to expose such recognition on AltMetrics platforms – in discussion…

Impact by Data Set onData

IC50 Measurements for 62 substituted benzoxazoles

ChemSpider Data Repository: DOI: 10.1356/CSID784.4

What Does the Future Hold?

The Data Deluge Will Not Go Away

The Linked Network Will Grow

We DON’T want this world..

Thanks Martin!

We’re not there yet

You can’t get there from here

Thank you

Email: williamsa@rsc.org
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams