Bosc2012 goble

If we build it will they come? Prof Carole Goble FREng FBCS CITP [email protected] BOSC, Long Beach, July 14 2012 http://www.mygrid.org.uk


Keynote for BOSC (Bioinformatics Open Source Conference) 2012 at Long Beach, CA, USA, 14 July 2012 by Carole Goble

Transcript of Bosc2012 goble

Page 1: Bosc2012 goble

If we build it will they come?

Prof Carole Goble FREng FBCS [email protected]

BOSC, Long Beach, July 14 2012

http://www.mygrid.org.uk

Page 2: Bosc2012 goble

Improving Knowledge Turning, Enabling Reuse and Reproducibility

[Josh Sommer]

Est. 2001

Keep the vision, modify the plan

Page 3: Bosc2012 goble

Computational Methods: Scientific workflows. Distributed web/grid/cloud services. Third party, independent service reuse. Data pipelines and analytics.

Volunteerist Human Computation e-Laboratories - social collaboration and sharing environments for scientific artefacts. Libraries and Catalogues. Asset safe havens, sharing, reuse.

Knowledge Acquisition Tools: Semantic technology, semantic applications, research objects, executable papers. Data/Metadata curation & reuse.

OWL

POPULOUS SKOSEdit

LGPL

BSD

Various

Page 4: Bosc2012 goble

The Taverna Suite of Tools

Client User Interfaces

GUI Workbench

Workflow Repository

Service Catalogue Third Party Tools

Web Portals

Activity and Service Plug-in Manager

Provenance Store

Workflow Server

Open Provenance Model

Secure Service Access

Workflow Engine

Virtual Machine

Programming and APIs

Command Line

Page 5: Bosc2012 goble

5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)

Community Haven. Sharing Resource. Social Collaboration.

http://www.myexperiment.org

http://wiki.myexperiment.org/index.php/Galaxy

Page 6: Bosc2012 goble

Contribute, Find and understand Web Services

Curate, review and comment

Learning resource

Monitor Services Cloud Registry

BioCatalogue: crowd curation of web services

2295 REST and SOAP services, 169 service providers. 674 members, 27 countries

Page 7: Bosc2012 goble

Find, exchange and interlink, preserve, publish data, models, publications, SOPs & analyses.

ISA Compliant

Launch and validate models and analyses: JWS Online

Find experts, colleagues and peers.

Gateway to public tools and resources, e.g. BioModels

SysMO: 16 consortia, 110 institutes, 1600+ assets, 350+ members

livSYSiPS

GerontoSys

Page 8: Bosc2012 goble

Public SEEK: http://www.seek4science.org

Page 9: Bosc2012 goble

Sharing Platform & Trusted Service

Standards & Content

Governance & Policy

Preservation & Publication Platforms

Gateway

Software & Tools: Open source

Knowledge Network Skills & Community Building

Comp Sci Research Platform

Page 10: Bosc2012 goble

Laissez-faire Philosophy
• Bottom Up
– Emergent & scruffy (to a degree…)
• Reliant on third party contributions
– Non-prescriptive, non-interfering and flexible
– We make no content ourselves….
• Part of a wider ecosystem
– Other services, data, tools, platforms, people…
• Inspired by social environments
• Scarred by top-down, dictated, tech-driven and unused monoliths

Page 11: Bosc2012 goble

Never underestimate how scruffy third party stuff can be

How often metadata is missing and messy if left to its own devices…

Liberty through Limitations

People say they want flexibility. They prefer the simplicity of order and will adapt to adopt.

[Image: http://www.flickr.com/photos/hellaoakland/3137360455/]

Page 12: Bosc2012 goble

Who is they?

• Jobbing Bioinformatician?

• Expert Bioinformatician?

• Sys admin?

• Service provider?

• Application developer?

• Tool developer?

• Biologist?

Page 13: Bosc2012 goble

Pharmacogenomics

GWAS

Trypanosomiasis in African Cattle

Systems Biology of Micro-Organisms

Drug Toxicity

(OpenTox Project)

Metagenomics

Physiopathology of the human body Medical Imaging

Genetic differences between breeds of cattle

The Virtual Liver

Who is THEY?

Page 14: Bosc2012 goble

Distributed Groups & Independent Lone rangers

Long tail, Disconnected from data providers and each other, emergent,

Organised, Planned, Strong connections with resource providers and each other.

Independents….

Bovine Trypanosomiasis Consortium

Consortia

Individuals

Research Groups

Page 15: Bosc2012 goble

Specialise or Diversify?

• Flexibility and extensibility -> customised Software and Services, Cookie cutter

• Widen adoption

• Spread risk, extend resourcing streams

• Cross development alignment and coordination

• More communities to build, nurture, support and sustain

• Core Drift and Bashing

Helio-Physics

Document Preservation

BioDiversity Astronomy

Social Science

Engineering: JPL, NASA

FLOSS

Page 16: Bosc2012 goble

BioDiversity Virtual e-Laboratory

Biodiversity Services

[Architecture diagram. Recoverable labels:]
Biodiversity Services: Phylogenetic, Taxonomic, Visualisation, Modelling/GeoProcessing
Tools and services: BLAST, Hmmer, MrBayes, PAML, EMBOSS, …; R; Synonyms; BioSTIF; Google Refine; CSW; openModeller; WPS / WCPS
WebDAV Data Management
Catalogues / Repositories
Authentication / Authorisation
OpenSearch
Provenance
Execution environment: Taverna Workbench, Taverna Workflow Engine and Server
Platforms: Grid, Cloud, etc.

http://www.biovel.eu

Page 17: Bosc2012 goble

Who is We? The ego-system

biologists, bioinformaticians, biodiversity informaticians, astro-informaticians, social scientists, modellers, software engineers, computer scientists, systems administrators, resource providers

Page 18: Bosc2012 goble

Methods & Practice

CS Research

Production

My World

Science

Page 19: Bosc2012 goble

Research Objects
Reproducibility, Integrated Publishing, Carriers of Research Context
• Citation
• Aggregation
• Annotation
• Provenance
• Lifecycle
• Preservation
• Decay
• Sharing
• Stereotypical Profiles
• Services and APIs
• myExperiment 2.0
Encodings: Semantic Web: LOD, VoID, OAI-ORE, AO/OAC, SIOC, OPM/PROV, Memento….

http://www.wf4ever-project.org

Page 20: Bosc2012 goble

Production

Research

Applications

Training

Publishing

Community Community

Page 21: Bosc2012 goble

So if we build it will they come?
Be useful for something: immediately, continuously, responsively
Be usable by somebody: user experience, worth the effort, adoption path
Some of the time: as part of a big picture
Under promise and over deliver
Acquire Critical Mass

Page 22: Bosc2012 goble

Four things that drive adoption of software or service
1. Added value
– Do something that you couldn’t do before or now do faster, gain competitive advantage, improve productivity, scale up
2. New asset
– Get or retain access to something important (data, method, technique, skills, knowledge)
3. Keep up with the field. A Community.
– Future-proof my practice, new skills and capacity, there is a vibe about it and I’ll be left out
4. Because there is no choice
– Business depends on it, it’s mandated, it’s de facto mandated

Page 23: Bosc2012 goble

Seven things that hinder adoption of software or service
1. Not enough added value
• It doesn’t solve a problem, or not as well or as cheaply as something else; no content or not the right content
2. Not fit for take-on. It doesn’t work!
• No: help, guides, documentation, manuals, examples, content, templates, portability, migration / legacy support, easy installation, virtual machines, testing, stability, version control, release cycle, roadmap, sustainability prospect, way of introducing my favourite component/data/environment.
3. No Time or Capacity to take on
• To learn, migrate personal legacy code/data/applications, no pathway/ramp to adoption
• Training and special system needs
It Sucks

Page 24: Bosc2012 goble

Software practices

“As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software”

Zeeya Merali, Computational science: ...Error… why scientific programming does not compute. Nature 467, 775-777 (2010), doi:10.1038/467775a

Page 25: Bosc2012 goble

Software Stewardship

• Software sustainability
• Software practices
• Software deposition
• Long term access to software
• Credit for software
• Licensing advice
• Open licenses

Reproducible Research Standard, Victoria Stodden, Intl J Comm Law & Policy, 13, 2009

“Better Science through Superior Software” – C Titus Brown

Page 26: Bosc2012 goble

Seven things that hinder adoption of software or service
4. Cost
– Of disruption, of long-term ownership
5. Exposure to Risk
– First to take-up, support and sustainability dependencies, fear of scrutiny, misrepresentation or being scooped
6. No Community
– Support and comfort
7. Changes to work practices
– Obligations, unclear or unenforced reciprocity protocols
It’s too costly

Page 27: Bosc2012 goble

• It sucks but it’s the only thing around

• It’s ace but it’s one of many, too late in the game and not enough to switch

• Tipping point is likely not technical

Betamax vs VHS

Page 28: Bosc2012 goble

Bonus Hinder: Never heard of it.

We’ve built it but we haven’t told anyone.

• Make noise…physically and virtually
• Customer and Contributor Relationship Building
• Self-supporting communities, multi-level marketing

• Highly Resource Intensive

Page 29: Bosc2012 goble

Adoption Intentions: Be careful what you wish for
• Incidental
– “I built it for myself, and stuck it out there”
• Familial
– “I built it for people just like me”
• Fundamental
– “I built it for others, many who are not like me”

Page 30: Bosc2012 goble

Open Innovation: Development and Content
You are not alone. You can’t do it all alone.
Motivate & enable others to fill gaps “App Store Style”: software, services, content, examples….
• Really Interoperate. Don’t tweak.
• Be Simple and Standard.
• Be Helpful. Be Set up. Be reusable. Be Smart.
Galaxy+Taverna/myExperiment
• Others will develop on top of you. But don’t assume they will re-contribute or tell you.
• It’s much harder than you think.
• It’s unequal.

Family

Friends

Acquaintances

Strangers

Page 31: Bosc2012 goble

Family, Friends, Acquaintances, Strangers

Moore's technology adoption curve

Ladder Model of OSS Adoption (adapted from Carbone P., Value Derived from Open Source is a Function of Maturity Levels)

[FLOSS@Syracuse]

Page 32: Bosc2012 goble

"it's better, initially, to make a small number of users really love you than a large number kind of like you" Paul Buchheit

paulbuchheit.blogspot.com

Page 33: Bosc2012 goble

PALS: Building Friendships
Intelligence, Guidance, Advocacy, Evangelism, Market Research
• What’s in it for the PAL?
– Long tail: Money, kudos, special support, special resources, skills, reputation building, influence, stuff they can’t do alone, CV building
– Consortia: co-funded
• Who is a PAL?
– Post-docs, Post-grads, Administrators, Developers
– PI: protector/champion
• PAL handlers
– Customer Relationship Manager, Nanny and Mediator, Scientist

Page 34: Bosc2012 goble

Do not under-estimate…

The power of the sprint / *-athon / fest / drinking

The power of a whizzy interface. Even for plumbing.

The importance of supporting and propagating best practice

Page 35: Bosc2012 goble

Participatory, Embedded

Design-Build-Run-Manage is Good

Act Local, Think Global

Reality Check

The Bigger Picture

Eat your own Dog Food

Page 36: Bosc2012 goble

Participatory Design
Work Together on a Real Problem
Project PIs: Data control. Own databases. Just enough exchange. Visibility limitations. Project dependence.
PALs: Spreadsheets. Yellow Pages. SOPs. Understanding standards. Curating. Examples. Safe Haven. Project independence.
Funders: Data sharing. Data standards. A database. Long term preservation.
3 years later, 15/16 consortia abandoned their own systems and went with the SEEK system.

Page 37: Bosc2012 goble

If you build it will they come and contribute?

Page 38: Bosc2012 goble

Controlled

Closed

Access

Participation

Lone scholars

Private Groups

Trusted Collaborators

Public scientists

Citizens

[based on an idea by Liz Lyon]

Open

Cooperation? Coordination? Collaboration? Integration? Evolution and entropy models

Page 39: Bosc2012 goble

[Andrew Su]

Critical mass spiral: 90:9:1

Driven by needs of and benefits to the scientist, rather than top down policies.

Content tipping point

Page 40: Bosc2012 goble

Trust, Fame and Blame: Reciprocity, Competition, Contribution and Use

Victoria Stodden, The Scientific Method in Practice: Reproducibility in the Computational Sciences Feb 9, 2010 MIT Sloan Research Paper No. 4773-10, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1550193

Nature 461, 145 (10 September 2009)

• Scooping, Scrutiny and Misinterpretation
• Curation Cost
• Poor quality
• Reputation / Asset Economics
• Public Peer Pressure

Reciprocity Sucks
• Flirting
• Hugging
• Controlled Sharing
• Voyeurism
• Poor feedback / credit

Page 41: Bosc2012 goble

Carrots
Harness Competitiveness
Pride
• Reputation: Cult, Credit & Attribution for all
Protection
• Just enough Sharing, Licensing & Liability
• Quality, Peer review, Metadata
Preservation
• Safe havens and Sunsets (project churn)
Publishing / Release
• Citability, Supporting Exchange
Productivity
• Availability of assets, help, capability, ramps

Page 42: Bosc2012 goble

http://www.rightfield.org.uk

Adoption Ramps

Instrument familiar, widely-used tools

Spreadsheets and Email

Page 43: Bosc2012 goble

Adoption Stealth
• Data at home promise with automated harvesting
• Sharing creep, Incremental metadata, Low obligations
• URL upload in BioCatalogue
• Web Service “come as you are” take-on in Taverna
• Metadata prompting, Right tools, right time, right place
• Service collections & Packaged services

Page 44: Bosc2012 goble

Be vigilant
• PAL burn-out and over familiarity
• Unadjusted over-user accommodation
• Drifting apart and not keeping it fresh
• Step back, observe and adapt/intervene!
• So relieved to get a community….
• Instrument adoption and observation

Participatory Development is a mutual long term relationship
Not flirty speed dating, One night stand, Crush, Me Me Me

Page 45: Bosc2012 goble

Urgent-Important
• Technical bog down, operational burn-out
• Little things that are important but don’t seem that urgent…
• Dominant projects
• Not-software content
• It all takes way longer than you think
• Simplicity drift

Participatory Development is a mutual long term relationship
Not flirty speed dating, One night stand, Crush, Me Me Me

Page 46: Bosc2012 goble

Beware Version 2 Syndrome!

Page 47: Bosc2012 goble

The Jam-based Adoption Model

aka Added Value

Value Proposition Return On Investment

http://delicious-cooks.com/photos/raspberry-jam/04/

Page 48: Bosc2012 goble

What is the Special Jam? What is your Jam Value Chain and for Who?

What:

SysMO: safe haven, spreadsheet tooling, linking SOPs, models and data, examples

Taverna: power, adaptability and myExperiment

Who:

Focused on contributors and experts

Provider-consumer balance

Functionality-Simplicity Syndrome

Changing Who - Challenging baked-ins

Page 49: Bosc2012 goble

Jam today and more, better Jam tomorrow

Just Enough Jam, Just in Time not Just in Case

* Feature Creep Conundrum * Big Picture Paradox

* Core vs Specifics Syndrome * Content Decay Dilemma

* Working to working Stability Stress

Page 50: Bosc2012 goble

Customised Specific Jam beats Generic

* Flexibility/Functionality – Simplicity Conundrum

* Diversification Dilemma

Page 51: Bosc2012 goble

Where is my Jam? Jam for All

• What are WE (platform providers, Software builders, Community builders and Service providers) getting out of it?

• Need credit and interest too.

• Altmetrics

http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf

Howison and Herbsleb, Scientific Software Production: Incentives and Collaboration, CSCW 2011, March 19–23, 2011, Hangzhou, China

[Image: http://www.gettyimages.co.uk/detail/photo/empty-jam-jar-royalty-free-image/136976198]

Page 52: Bosc2012 goble

Jam forever

They came. Have the evidence. Have a plan. Did you wish for this? Do you want it?

Fragile Flux
• Content, services, bits, communities

Funding Plan
• Novelty over sustainability
• Research-Production Falsehoods
• Wave invention, Political lobbying

Securing the community
• Leadership & Foundations

Business model???

Software is Free like Puppies Are Free

Page 53: Bosc2012 goble

Jam not forever

• Acquire

• Retain

• Widen – More/Different

• Reposition – Different/New Stage

• Changing Community is Challenging… [Daron Green]

Page 54: Bosc2012 goble

The Social and the Technical are Inseparable

Adoption is a Merry-Go-Round

Page 55: Bosc2012 goble

You know they came when…
… you were useful and usable to someone some of the time, but they might not tell you
… people ask you to join their consortia or use it
… they gave up their own home grown stuff for yours
… someone you don’t know uses it and tells you all about your own stuff
… someone publishes papers about it. Without citing you.
… someone else claims credit
… people you don’t know start bitching about it
… it’s just expected to be there and you are kind of expected to be there too
… your Head of School complains you don’t do enough CS research because you are doing too much Software Engineering and Support

Page 56: Bosc2012 goble

James Howison Heather Piwowar

Christine Borgman Nosh Contractor

Victoria Stodden Janet Vertesi

Jay Liebowitz Robert Kraut

Acknowledgements (1)

Page 57: Bosc2012 goble

Acknowledgements (2)

• The myGrid family, friends and contributors

• But especially: Katy Wolstencroft, David Withers, Marco Roos, Alan Williams, Jits Bhagat, Stuart Owen, Stian Soiland-Reyes, Shoab Sufi, Robert Stevens, Paul Fisher, Peter Li, Ian Dunlop, Finn Bacall, Mannie Tags, Niall Beard, Rob Haines, Christian Brenninkmeijer, Alasdair Gray, Tim Clark, Pinar Alper, Paolo Missier, Khalid Belhajjame, Duncan Hull, Sean Bechhofer, David De Roure, Don Cruickshank, Wolfgang Mueller, Olga Krebs, Franco Du Preez, Quyen Nguyen, Jacky Snoep.

• The members of Wf4ever, SysMO, BioVeL, HELIO, SCAPE, OMII, SSI, NeiSS, Obesity e-Lab and anyone else I forgot

Page 58: Bosc2012 goble

Further Information
• myGrid – http://www.mygrid.org.uk
• Taverna – http://www.taverna.org.uk
• myExperiment – http://www.myexperiment.org
• BioCatalogue – http://www.biocatalogue.org
• SysMO-SEEK – http://www.sysmo-db.org
• MethodBox – http://www.methodbox.org.uk
• Rightfield – http://www.rightfield.org.uk
• Wf4ever – http://www.wf4ever-project.org
• BioVeL – http://www.biovel.eu
• Software Sustainability Institute – http://www.software.ac.uk
• Software Carpentry – http://software-carpentry.org/

Page 59: Bosc2012 goble

Keep your Friends Close

Embed

Favours will Favour you

Know your Users

Anticipate Change

Skeptic

Champions

Coalface users

Patrons

End Users

Developers

Service Providers

System Administrators

Keep Sight of the Bigger Picture

Friends and Family

Fit in

Jam Today, Jam Tomorrow

Just Enough, Just in Time

Act Local, Think Global

Enable Users to Add Value

Design for Network Effects

SUMMARY (De Roure and Goble, IEEE Software 2009)