Comb-e-Chem Jeremy Frey Sept 2003 From e-Science to Publication@Source Jeremy Frey School of...
Jeremy Frey, Sept 2003
Comb-e-Chem
From e-Science to Publication@Source
Jeremy Frey, School of Chemistry
University of Southampton, UK
[Figure labels: X-ray, single Mol, STM, Raman, Ocean, Monolayer]
19 Feb 2004 OAI Meeting, Jeremy G. Frey & Mike Hursthouse
e-Science
• ‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
• ‘e-Science will change the dynamic of the way science is undertaken.’
John Taylor, DG of UK OST
• ‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’
Tony Blair, 2002
The Collaboratory Concept
• In 1989, William Wulf, then with the U.S. National Science Foundation, defined a collaboratory as "a center without walls, in which the nation's researchers can perform their research without regard to geographical location, interacting with colleagues, accessing instrumentation, sharing data and computational resources, and accessing information in digital libraries."
The Comb-e-Chem Project
• The exponential world of Combinatorial Synthesis and High throughput analysis meets the exponentially growing power of computing
Comb-e-Chem Partners
• Bristol: Chemistry
• Southampton: ECS, Stats, Chemistry, Combi Centre, NCS
• IT Innovation
• IUPAC
• RSC
• IBM
• CCDC
• Pfizer
• GSK
• AZ
The Comb-e-Chem Vision
• Structures DB
• Properties DB
• Structure + Properties, Knowledge + Prediction
• Automation & Remote interaction
• Co-Laboratory: Interaction between users & “Dark Labs”
• Simulation and calculation
Comb-e-Chem Project - Automation
[Diagram labels: X-Ray e-Lab, Properties e-Lab, Analysis, Properties, Simulation, Video, Diffractometer, Grid, Structures Database]
Scientist at the Centre of an Information Web
[Diagram: the Scientist surrounded by Experiment, Computing, HPC, Storage and Analysis]
Access variable and difficult
The Future
The Grid Model - Information Utilities
[Diagram: the Scientist connected through MIDDLEWARE (uniform access) to Experiment, Computing, Storage and Analysis resources]
Remember that you contribute to other people’s information web
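One way to picture "uniform access" is a single entry point that hides the differences between resources. The sketch below is illustrative only: the names `Resource`, `Middleware`, `register` and `request` are assumptions made for this example and do not come from Comb-e-Chem or from any Grid toolkit.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical common interface that every grid resource exposes."""

    kind: str

    @abstractmethod
    def handle(self, task: str) -> str:
        """Carry out a task and return a short description of the result."""


class Experiment(Resource):
    kind = "experiment"

    def handle(self, task: str) -> str:
        return f"experiment ran: {task}"


class Storage(Resource):
    kind = "storage"

    def handle(self, task: str) -> str:
        return f"storage served: {task}"


class Middleware:
    """Routes every request through one entry point, whatever the resource."""

    def __init__(self) -> None:
        self._resources = {}

    def register(self, resource: Resource) -> None:
        self._resources[resource.kind] = resource

    def request(self, kind: str, task: str) -> str:
        if kind not in self._resources:
            raise KeyError(f"no resource of kind {kind!r} registered")
        return self._resources[kind].handle(task)


grid = Middleware()
grid.register(Experiment())
grid.register(Storage())
```

The scientist's code only ever calls `grid.request(...)`; adding a Computing or Analysis resource would mean registering one more class rather than learning a new access route.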
End-to-end connectivity
• Provide the smooth connection between the sources of data & information
• From literature to the laboratory bench and back via all stages of analysis and discussion
• Thus the need for a Data Grid or Grids
• All steps need to be Grid aware
Plan & COSHH
Digital Model
Information Integration
Report
Knowledge
Goal
Literature
Synthesis
Smart Laboratory
Analysis
Generate information within & for the grid context
Variety of data
[Figure: a line plot and a chemical structure diagram]
The Grid
• Grid is needed because
– Complexity of data
– Volume of data (real time data, images, video)
– Scale of computation (analysis, simulation)
– Complexity of process (automation)
– Variable demands on computation
– Provenance (audit trails, timestamps, process)
Dissemination & Publication
•A different approach is required to provide data to the community
•The grid provides the necessary medium
•What & how do we want to make available?
Journals: Publication @ source
Journal
Materials
Database
Multimedia
Laboratory Data
Paper
“Full” record
Data Trail
• Drill down through the analysis path
• Look at increasingly raw data
• Often large expansion in quantity and variety at each stage
• Need URIs for everything
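The drill-down idea can be sketched as a chain of records, each with its own URI and a pointer to the rawer item it was derived from. This is a minimal illustration, not the project's data model: `TrailItem`, `drill_down` and the `example.org` URIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrailItem:
    uri: str                      # every item gets its own identifier
    description: str
    derived_from: Optional[str]   # URI of the rawer item this came from


def drill_down(catalogue: dict, start_uri: str) -> list:
    """Follow derived_from links from a result back to the rawest data."""
    path = []
    seen = set()
    uri = start_uri
    while uri is not None:
        if uri in seen:
            raise ValueError(f"cycle in data trail at {uri}")
        seen.add(uri)
        item = catalogue[uri]
        path.append(item.uri)
        uri = item.derived_from
    return path


# Hypothetical URIs under example.org; none of these come from the talk.
items = [
    TrailItem("http://example.org/data/raw-images/1", "raw images", None),
    TrailItem("http://example.org/data/pattern/1",
              "processed diffraction pattern",
              "http://example.org/data/raw-images/1"),
    TrailItem("http://example.org/data/structure/1", "structure",
              "http://example.org/data/pattern/1"),
]
catalogue = {item.uri: item for item in items}
trail = drill_down(catalogue, "http://example.org/data/structure/1")
```

Here `trail` lists the structure first and the raw images last, so a reader starting from a published result can step back one stage at a time.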
Publication@Source
• Must be able to track back to the original data
• Primary reason is to allow new analysis in the future by other researchers.
• In a university environment this may be viewed as a public responsibility; in a business environment, as ensuring maximum value from investment.
• Does have implications for provenance and even fraud!
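One common way to make a record of original data tamper-evident is a hash-chained audit trail, in which each entry's hash covers both its own content and the previous entry. The sketch below uses that technique as an illustration; it is not described in the talk, and the function names, field names and timestamps are assumptions made for this example.

```python
import hashlib
import json


def add_entry(trail: list, timestamp: str, process: str, data: bytes) -> dict:
    """Append an entry whose hash covers its content and the previous entry."""
    previous_hash = trail[-1]["entry_hash"] if trail else ""
    body = {
        "timestamp": timestamp,
        "process": process,
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "previous_hash": previous_hash,
    }
    serialised = json.dumps(body, sort_keys=True).encode("utf-8")
    entry = dict(body, entry_hash=hashlib.sha256(serialised).hexdigest())
    trail.append(entry)
    return entry


def verify(trail: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    previous_hash = ""
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["previous_hash"] != previous_hash:
            return False
        serialised = json.dumps(body, sort_keys=True).encode("utf-8")
        if hashlib.sha256(serialised).hexdigest() != entry["entry_hash"]:
            return False
        previous_hash = entry["entry_hash"]
    return True


# Illustrative entries only; the timestamps and byte strings are made up.
audit_trail = []
add_entry(audit_trail, "2004-02-19T10:00:00Z", "data collection", b"raw frame bytes")
add_entry(audit_trail, "2004-02-19T11:30:00Z", "data reduction", b"processed pattern")
```

`verify(audit_trail)` checks the chain's internal consistency; comparing each `data_sha256` against a fresh hash of the stored file would additionally show whether the data itself has changed since it was logged.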
Publication Chain
Institution Laboratory
Student
Journal
Bibliography
Professional Body Archive
Sample
Raw images
Processed diffraction pattern
Structure
CIF Database
Validation
Journal
Synthesis
Smart Labs NCS Archive
CCDC
metadata
Automated structure determination
Chemical Crystallography: A Suitable Case for OA Therapy
Mike Hursthouse
Department of Chemistry and Combinatorial Centre of Excellence,
EPSRC National Service for Crystallography
University of Southampton, UK
• Characterisation technique for Chemical Structure.
• Use XRD.
• Provides high level of chem knowledge
• Structure – molecular or crystal
• Previously focussed on molecular structure – chemical props
• Now focus on crystal structure – physical props
• Change in interest facilitated by availability of database archive.
• However, woefully incomplete
ChemCryst
• Database Archive – ca 300000 entries – all published structures
• >10M chemical compounds known
• Probably 1.5M structures known
• Why shortfall? Archaic publishing methods.
• Solution?
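The size of the shortfall follows directly from the figures above. The short calculation below only restates the slide's numbers; the variable names are chosen for this example, and the 1.5M figure is the slide's own estimate ("probably").

```python
# Figures quoted on the slide.
archived_structures = 300_000      # "ca 300000 entries"
known_structures = 1_500_000       # "probably 1.5M structures known"
known_compounds = 10_000_000       # lower bound: the slide says ">10M"

# Share of known structures that have reached the archive.
archived_share = archived_structures / known_structures
# Structures that are known but not in the archive.
missing_structures = known_structures - archived_structures
# Share of known compounds with a known structure (an upper bound,
# because the compound count is itself a lower bound).
structure_share = known_structures / known_compounds

print(f"archived: {archived_share:.0%} of known structures")
print(f"not in the archive: {missing_structures:,}")
print(f"compounds with a known structure: at most {structure_share:.0%}")
```

On these figures roughly one known structure in five is in the archive, leaving about 1.2 million unpublished.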
• ChemCryst results: new dissemination strategy
• E-Prints of “Structure Reports”
• Can be created automatically.
• Work can be validated automatically.
• All data (raw, processed, meta…) included.
• Hence bypass Journal sponsored “refereeing”
• Still need to decide on “publication” of “science”
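As a rough illustration of "created automatically" and "validated automatically", the sketch below reads a few single-line items from CIF-style text, runs some basic checks and bundles the result with pointers to the data files. It is a toy under stated assumptions, not the e-print system's workflow: the function names, the chosen checks, the file names and the example values are all invented for this example, and real crystallographic validation is far more thorough. Only the CIF data names (`_chemical_formula_sum`, `_cell_length_a` and so on) are standard.

```python
REQUIRED_ITEMS = (
    "_chemical_formula_sum",
    "_cell_length_a",
    "_cell_length_b",
    "_cell_length_c",
)


def read_simple_items(cif_text: str) -> dict:
    """Collect single-line 'name value' items; loops and text blocks are skipped."""
    items = {}
    for line in cif_text.splitlines():
        line = line.strip()
        if line.startswith("_") and " " in line:
            name, value = line.split(None, 1)
            items[name] = value.strip().strip("'\"")
    return items


def validate(items: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing {name}" for name in REQUIRED_ITEMS if name not in items]
    for name in ("_cell_length_a", "_cell_length_b", "_cell_length_c"):
        if name in items:
            # CIF numbers may carry an uncertainty in brackets, e.g. 7.512(2)
            number = items[name].split("(")[0]
            try:
                if float(number) <= 0:
                    problems.append(f"{name} must be positive")
            except ValueError:
                problems.append(f"{name} is not a number")
    return problems


def build_report(items: dict, data_files: list) -> dict:
    """Bundle the checked items with pointers to all the underlying data."""
    problems = validate(items)
    formula = items.get("_chemical_formula_sum", "unknown")
    return {
        "title": f"Structure report: {formula}",
        "valid": not problems,
        "problems": problems,
        "data_files": list(data_files),
    }


# Made-up example values, for illustration only.
example_cif = """
data_example
_chemical_formula_sum 'C6 H12 O6'
_cell_length_a 7.512(2)
_cell_length_b 9.104(3)
_cell_length_c 11.230(4)
"""
report = build_report(read_simple_items(example_cif),
                      ["raw_images.tar", "example.cif"])
```

Because the report carries its own list of problems and data files, a failed check can hold back a deposit without any manual refereeing step.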
e-Bank Project: JISC project with UKOLN
• Link Comb-e-Chem and other semantic grid science projects to the e-print system at Southampton
• Provide dissemination and provenance
Changing the way we work
[Diagram labels:]
Data Provenance
Quantum Mechanical Analysis
Properties Prediction
Data Mining, QSAR, etc
Design of Experiment
E-Lab: Combinatorial Synthesis
E-Lab: Properties Measurement
E-Lab: X-Ray Crystallography
Laboratory Processes
Structures DB
Properties DB
Data Streaming
Authorship/Submission
Visualisation
Agent Assistant
Samples