Data-Intensive Research: Actions to make better use of Data for Research
Malcolm Atkinson & David De Roure
mpa@nesc.ac.uk & dder@ecs.soton.ac.uk
12 January 2010
Report to e-Science Forum @ Leeds from a fact-finding mission
Mission goal: learn how researchers use data
Acknowledgements: the UK e-Science Directors CIR authors, our teams, the EPSRC, the RCUK USA office and all of our hosts in the USA, for their good ideas; all the opinions, observations and recommendations are our own.
Outline
• Cornucopia of data
• Yet to learn how to use it well
• Hot topic
• Research to politics
• Concepts
• Datascopes, Intellectual Ramps, Going the last mile
• Co-*
• Alignment
• Digital ecosystem
• Principles
• Recommendations
• Actions
• Survival in the Digital Revolution
Data-Intensive Research Events
Bermuda agreement 1996, 97 & 98
SDSS Archive DB 1999
Human Genome 2001
DI Comp. Environm’s 2001
BaBar@SLAC 2002
Fort Lauderdale 2003
Hey & Trefethen D.Del. 2003
Digital Curation Cen. 2004
NSF DataNet call 2007
XLDB series starts 2007
SciDB starts 2008
Yahoo DI workshop 2008
Harnessing data 2009
Beyond data del. 2009
Gov’s use Linked D. 2009
NSF CISE DI call 2009
4th Paradigm book 2009
JISC Research DM 2009
e-IRG DMTF report 2009
DIEW Japan 2010
Sir Tim Berners-Lee
http://www.w3.org/DesignIssues/GovData.html
How will Linked Data benefit researchers?
Datascopes for the naked mind
NRAO/AUI/NSF
To reveal evidence in data you could never see before
Data to Information to Knowledge to Wisdom
Changed our place in the universe
Datascopes Summary
• Better methods for extracting information from data
  • better algorithms for discovery, selection, fusion, distillation, aggregation, presentation
  • algorithms transformed to run incrementally
• Better strategies for using the algorithms
• Better data/metadata and semantics
• Better platforms supporting the strategies
• Data centres hosting data and computation
• Coping with more complexity, more users & more questions
• Knowledge, questions & datascopes co-evolve
Rally cross-disciplinary effort
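As an illustration of "algorithms transformed to run incrementally", here is a minimal sketch, not taken from the slides: a mean and variance maintained one observation at a time (Welford's method), so the data never needs to be held in memory at once. The class and function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Mean and variance updated one observation at a time (Welford's method)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        # Fold a single new observation into the running summary.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance of everything seen so far.
        return self.m2 / self.count if self.count else 0.0


def summarise(stream: Iterable[float]) -> RunningStats:
    """Consume any iterable (file, socket, generator) in a single pass."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats
```

Because `update` touches only three numbers, the same summary can be refreshed as new data arrives instead of being recomputed from scratch.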
Intellectual ramps
Easy and low risk to start
Progress to advanced skills
For research data users
No obligation
Go as far as you want
Find a service & relax
Dropbox as a Ramp
Local folder synchronised and shared via cloud
Condor job submitted by drag and drop
Ian Cottam
Results appear in Dropbox
Slide from David De Roure
ramp 1: exploiting familiar tools
http://wiki.myexperiment.org/index.php/DropAndCompute
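The drop-folder pattern on this slide can be sketched roughly as follows. This is an assumption about the general idea, not the DropAndCompute implementation: the slide does not say how jobs are detected or handed to Condor. The `.job` and `.result` suffixes, the function name and the `run_job` callable (a stand-in for real job submission) are all hypothetical.

```python
from pathlib import Path
from typing import Callable, List


def process_dropped_jobs(inbox: Path, outbox: Path,
                         run_job: Callable[[str], str]) -> List[str]:
    """Run each unprocessed *.job file found in inbox; write results to outbox.

    run_job is a placeholder for whatever actually executes the work
    (for example, submission to a batch system); here it maps the job
    text to a result text.
    """
    outbox.mkdir(parents=True, exist_ok=True)
    processed = []
    for job_file in sorted(inbox.glob("*.job")):
        result_file = outbox / (job_file.stem + ".result")
        if result_file.exists():
            continue  # already handled on an earlier pass
        result_file.write_text(run_job(job_file.read_text()))
        processed.append(job_file.name)
    return processed
```

If `inbox` and `outbox` sit inside a synchronised folder, a researcher only ever drags a file in and waits for a result file to appear; calling the function periodically is enough, since finished jobs are skipped.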
Intuitive interfaces
e-Science Research http://research.nesc.ac.uk/rapid/
Slide from Jano van Hemert
Engineering economic ramps
Replace Portal Building Cottage Industry
ramp 2 - hiding complexity; minimal input
Ramps: Summary
• An easy path to use a data analysis method
• An opportunity
• Not an obligation
• Engage as far as you want
• Use a service for routine tasks
• Types of ramp
  • in browser - now powerful - can reach the GPU
  • in familiar tools
  • support from centres and crowd-sourced
• Strongly linked with education
• Removes distracting technical clutter
• Rescues educators & students
• Ramp & education co-evolve
Boost investment here
Technology & Researchers
Co-evolution
Tech. display
Researchers choose?
Niches?
Fastest at adaptation wins
Actions
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
http://tinyurl.com/ye8x4bw
Phase 1
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
In Edinburgh 15-19 March 2010
http://tinyurl.com/ye8x4bw
http://www.nesc.ac.uk/esi/events/1047/
Phase 1
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
What will your organisation do?
http://tinyurl.com/ye8x4bw
Phase 1
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
Immersive + mix => launch projects
http://tinyurl.com/ye8x4bw
Actions Phase 1
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
Import, Experiment, Engage & Choose existing D-I methods and technology for pressing existing research
Actions Phase 1
1. Workshops on DIR
2. DIR education
3. Ideas factory project launch
4. Test best practice
5. Immediate research challenges
Which ones? Cross-cutting challenges, methods, facilities and capabilities
Actions phase 1
Actions A1 to A5 will build capacity for larger and more demanding projects, help researchers build teams with effective mixes of skills and experience, and provide performance information for the selection of strategies, technologies and teams for larger commitments to follow.
Phase 2
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
S/W, H/W & support: what services do researchers need? How much? How soon?
More software; Less hardware
More bandwidth; Fewer FLOPS
Phase 2
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
What can we do to help UK reference-data services? What international agreements are needed?
http://tinyurl.com/ye8x4bw
Phase 2
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
Computing science + Mathematical & Information sciences + Field experience + long-term commitment to building foundations of DIR
UKCRC should espouse this cause
Phase 2
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
What should the UK or your organisation do? Where should it do it?
Phase 2
6. DIR facilities pool
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
Framework for interdisciplinarity & pooled effort/resources + UK’s international presence?
This should be set up immediately
Applied science + last mile
We seek solutions. We don’t seek - dare I say this? - just scientific papers anymore
Quoting US Secretary of Energy & Nobel laureate Steven Chu
Survival in the digital-data revolution depends on speed and appropriateness of adaptation
Summary
• Much research is data intensive
• More of it will be
• Exploiting the opportunity is urgent for the UK (you/your org.)
• This requires changes
  • In facility provision
  • In research investment
  • In research behaviour (incentives)
  • In education
• These changes are part of the digital revolution
  • Understand, engage and ride the wave
• Investing in data-intensive research
  • Will accelerate research
  • Deliver more applicable research
  • Provide a better return on investment
And Next ...
• Messages from the panel
  • To EPSRC Research Facilities SAT
  • To ESRC & BBSRC & ...
  • To Carole’s e-Infrastructure group
  • To e-IRG & ESFRI
• Develop ideas & plan campaign
  • In your university, institute, research council
  • Feed into Spending Reviews
• Meeting at the e-Science Institute
  • 15 to 19 March 2010
  • http://www.nesc.ac.uk/esi/events/1047/
ADMIRE – Framework 7 ICT 215024
?
Picture composition by Luke Humphry based on prior art by Frans Hals
www.omii.ac.uk
www.admire-project.eu
www.ogsadai.org.uk
www.nesc.ac.uk