Semantic Web Cyberinfrastructure for Virtual Observatories
Deborah L. McGuinness
Acting Director and Senior Research Scientist
Knowledge Systems, AI Laboratory
Stanford University
[email protected]
http://www.ksl.stanford.edu/people/dlm
CEO McGuinness Associates
Peter Fox*, Luca Cinquini%, James Benedict$, Patrick West*, Jose Garcia*, Tony Darnell*, and Don Middleton%
*HAO/ESSL/National Center for Atmospheric Research
%SCD/CISL/National Center for Atmospheric Research
$McGuinness Associates
Work funded by NSF and NASA
December 14, 2006 Deborah L. McGuinness 2
Virtual Observatories
Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed
Virtual Observatories Definitions
• Workshop: A Virtual Observatory (VO) is a suite of software applications on a set of computers that allows users to uniformly find, access, and use resources (data, software, document, and image products and services using these) from a collection of distributed product repositories and service providers. A VO is a service that unites services and/or multiple repositories.
• VxyOs – x and y are two distinct disciplines
• Semantically-Enhanced VOs use semantic technologies to enhance query formation, data access, and resource usage.
Motivating Example
• General: Find data subject to certain constraints and plot appropriately
• Specific: Plot the observed/measured Neutral Temperature as recorded by the Millstone Hill Fabry-Perot interferometer while looking in the vertical direction at any time of high geomagnetic activity in a way that makes sense for the data.
Partial exposure of Instrument class hierarchy - users seem to LIKE THIS
Inferred plot type and return formats for data products
Inferred plot type and return required axes data
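The inference named in these captions can be sketched as a rule lookup: given the parameter a user selects, derive the plot type and the axes the plot needs. This is only a minimal illustration, not the VSTO implementation. The tables `PRODUCT_KIND` and `PLOT_RULES`, the function `infer_plot`, and the choice of a time-series plot for neutral temperature are all assumptions made for the example.

```python
# Hypothetical vocabulary: the kind of data product each parameter yields.
# These names are illustrative assumptions, not taken from the VSTO ontology.
PRODUCT_KIND = {
    "NeutralTemperature": "TimeSeries",
}

# Hypothetical rules: plot type and required axes for each product kind.
PLOT_RULES = {
    "TimeSeries": {"plot_type": "time_series", "axes": ("time", "value")},
}


def infer_plot(parameter):
    """Look up the plot type and required axes implied by a parameter."""
    kind = PRODUCT_KIND.get(parameter)
    if kind is None:
        raise KeyError(f"no product kind known for parameter {parameter!r}")
    return PLOT_RULES[kind]


# The user only names the parameter; the plot specification is derived.
spec = infer_plot("NeutralTemperature")
```

In a real system the two tables would presumably be replaced by class and property definitions in an ontology, with a reasoner doing the lookup.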
Leveraging Semantic Technologies
• Reduced query formation from 8 steps to 3 and reduced choices at each stage
• Allowed scientists to get data from instruments they never knew of before (e.g., photometers in example)
• Supported augmentation and validation of data
• Useful and related data provided without having to be an expert to ask for it
• Integration and use (e.g., plotting) based on inference
• Ask and answer questions not possible before
Discussion
• Our technical directions include:
– Provenance
– Broader and deeper ontology-based applications
– More use cases
– Leveraging our infrastructure in other scientific domains
• Broader directions include:
– Building the GeoInformatics community (e.g., AGU town hall, scientific informatics journals, …)
– Reuse and outreach – other science disciplines – volcano, plate tectonics, …, broader community – educational users, less trained public, …
– Spreading the changing science theme using semantic technologies to:
- use your data and tools but also remote colleagues’ data and tools
- understand assumptions, constraints, etc. and evaluate applicability of work
- find research that can benefit from your results
- find other results that are consistent (or inconsistent) with yours
More info: [email protected] , [email protected]
Content: Coupling Energetics and Dynamics of Atmospheric Regions WEB
Community data archive for observations and models of Earth's upper atmosphere and geophysical indices and parameters needed to interpret them. Includes browsing capabilities by periods, instruments, models, …
Content: Mauna Loa Solar Observatory
Near real-time data from Hawaii from a variety of solar instruments.
Source for space weather, solar variability, and basic solar physics
Other content used too – CISM – Center for Integrated Space Weather Modeling
Partial exposure of Instrument class hierarchy - users seem to LIKE THIS
Semantic filtering by domain or instrument hierarchy
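Filtering by instrument hierarchy can be sketched as a transitive subclass check. The fragment below is a minimal illustration under stated assumptions, not the VSTO ontology or portal code: the class names in `SUBCLASS_OF`, the instrument entries, and the helper names `is_a` and `filter_by_class` are all hypothetical.

```python
# Hypothetical fragment of an instrument class hierarchy: child -> parent.
# The class names are illustrative assumptions, not the VSTO ontology.
SUBCLASS_OF = {
    "OpticalInstrument": "Instrument",
    "Interferometer": "OpticalInstrument",
    "FabryPerotInterferometer": "Interferometer",
    "Photometer": "OpticalInstrument",
    "Radar": "Instrument",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a transitive subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_class(instruments, ancestor):
    """Keep the instruments whose class falls under the given ancestor class."""
    return [name for name, cls in instruments if is_a(cls, ancestor)]


# Hypothetical instrument instances as (name, class) pairs.
INSTRUMENTS = [
    ("Millstone Hill FPI", "FabryPerotInterferometer"),
    ("Example photometer", "Photometer"),
    ("Example radar", "Radar"),
]

# Selecting a broad class also returns instruments of its subclasses.
optical = filter_by_class(INSTRUMENTS, "OpticalInstrument")
```

Selecting a broad class returns instruments of every subclass, which is one way a user could be offered instruments they did not know to ask for.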
General Design Experience
• Many controlled vocabularies and taxonomies (few ontologies) as starting points.
• Strive for compatibility with “best practice” controlled vocabularies, taxonomies, and ontologies.
• Designed our own ontologies as dictated by use-case needs, always with the goal of reusability and extensibility. (Provided VSTO modules back to at least one ontology suite with a much broader scope.)
• Early work was HIGHLY collaborative in design and implementation. The team included a KR expert, domain experts, and SW engineers. Critical and continued contributions from domain scientists and knowledge representation experts.
• Resulting ontology is fairly extensible by entire team.
• Prototype and initial deployment completed within 1st year (by small, cohesive, carefully chosen team)
• Currently expanding to include more use cases that are being used to drive ontology expansion.
• Evaluating ontologies for broader use (volcanoes, climate, …)
December 14, 2006 Deborah L. McGuinness 19
VSTO Status
• Conceptual model and architecture developed by combined team; KR experts, domain experts, and software engineers
• Semantic framework developed and built with a small, cohesive, carefully chosen team in a relatively short time (deployments in 1st year)
• Production portal released, includes security, etc. with community migration (and so far endorsement)
• VSTO ontology version 0.4 available
• Web Services encapsulation of semantic interfaces being documented
• Currently expanding to include more use cases that are being used to drive ontology expansion.
• Evaluating ontologies for broader use (volcanoes, climate, …)
December 14, 2006 Deborah L. McGuinness 20
Virtual Observatories in Practice
Make data and tools quickly and easily accessible to a wide audience.
They are likely to provide controlled vocabularies that may be used for interoperation in appropriate domains along with database interfaces for access and storage and “smart” search functions and tools for evolution and maintenance.
Our initial focus is on ontology-enhanced query formation, data access, and presentation over data, model, and tool archives.
December 14, 2006 Deborah L. McGuinness 21
Issues for Virtual Observatories
• Scaling to large numbers of data providers
• Crossing disciplines
• Security, access to resources, policies
• Branding and attribution (where did this data come from and who gets the credit, is it the correct version, is this an authoritative source?)
• Provenance/derivation (propagating key information as it passes through a variety of services, copies of processing algorithms, …)
• Data quality, preservation, stewardship, rescue
• Interoperability at a variety of levels (~3)
December 14, 2006 Deborah L. McGuinness 22
Final remarks
• Many geoscience VOs are in production
• Informatics efforts in Geosciences are exploding
– GeoInformatics Town Hall at Fall AGU meeting Dec. 11 2006 in San Francisco, many cyberinfrastructure sessions
– VO conference - April 2007 in Denver, CO
– e-monograph to document state of VOs
– NEW Journal of Earth Science Informatics
– Special issue of Computers and Geosciences: “Knowledge Representation in Earth and Space Science Cyberinfrastructure”
• Ongoing activities for VOs through 2008 under the auspices of the Electronic Geophysical Year (eGY; www.egy.org)
• Contact [email protected] , [email protected]
December 14, 2006 Deborah L. McGuinness 23
Some Observations about the Virtual Solar-Terrestrial Observatory
• Datasets alone are not sufficient to build a virtual observatory: VSTO integrates tools, models, and data
• VSTO (and all VOs) need to work with interdisciplinary metadata, multiple controlled vocabularies, and multiple interfaces
• VSTO leverages the development of schema that adequately describe the syntax (name of a variable, its type, dimensions, etc. or the procedure name and argument list, etc.), semantics (what the variable physically is, its units, etc.) and pragmatics (or what the procedure does and returns, etc.) of the datasets and tools.
• VSTO provides a basis for a framework for building and distributing advanced data assimilation tools
• Just gone live in two communities: CEDAR & Mauna Loa
• Recent papers at ISWC ’06, OWL-ED ’06, AGU spring and fall ’06, EGU ’06, Intl Astronomical Union ’06
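The syntax / semantics / pragmatics split in the list above can be sketched as a small record type. This is a hedged illustration rather than the VSTO schema: the class `VariableDescription`, its field names, the helper `same_quantity`, and the sample variables are all assumptions, and the `usage` field is a simplified stand-in for pragmatics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableDescription:
    """Hypothetical description of one variable in a dataset."""
    # Syntax: how the variable appears in a file.
    name: str
    dtype: str
    dimensions: tuple
    # Semantics: what the variable physically is.
    quantity: str
    units: str
    # Pragmatics (simplified): how the variable is meant to be used.
    usage: str


def same_quantity(a, b):
    """Two variables match semantically when quantity and units agree,
    regardless of the names or types used in their files."""
    return (a.quantity, a.units) == (b.quantity, b.units)


# Hypothetical variables from different archives.
a = VariableDescription("temp", "float32", ("time",),
                        "neutral temperature", "K", "plot against time")
b = VariableDescription("T_neutral", "float64", ("time", "altitude"),
                        "neutral temperature", "K", "plot against time")
c = VariableDescription("temp", "float32", ("time",),
                        "ion temperature", "K", "plot against time")

# Different names, same physical quantity; same name, different quantity.
matches = (same_quantity(a, b), same_quantity(a, c))
```

The point of the sketch is that matching on names alone would pair `a` with `c`, while matching on the semantic fields pairs `a` with `b`.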
December 14, 2006 Deborah L. McGuinness 24
Why we were led to semantics
• When we integrate, we integrate concepts, terms
• In the past we would ask, guess, research a lot, or give up
• It’s pretty much about meaning
• Semantics can really help find, access, integrate, use, explain, trust…
• What if you…
- could not only use your data and tools but remote colleagues’ data and tools?
- understood their assumptions, constraints, etc. and could evaluate applicability?
- knew whose research currently (or in the future) would benefit from your results?
- knew whose results were consistent (or inconsistent) with yours?
…
December 14, 2006 Deborah L. McGuinness 25
Compilation of distribution of volcanic ash associated with large eruptions. Note the continental scale ash fall associated with Yellowstone eruption ~600,000 years ago.
Geologic databases provide information about the magnitude of the eruption. Assessing its impact on atmospheric chemistry, and on reflectance associated with particulate matter, requires integrating concepts that bridge terrestrial and atmospheric ontologies.
Courtesy: Krishna Sinha
December 14, 2006 Deborah L. McGuinness 26
NASA Application
• One trend in science: moving from instrument-based to measurement-based
• Requires: ‘bridging the discipline data divide’
• Overall vision for SESDI: To integrate information technology in support of advancing measurement-based processing systems for NASA by integrating existing diverse science discipline and mission-specific data sources.
[Diagram labels: SWEET, Volcano, Climate, SESDI]
December 14, 2006 Deborah L. McGuinness 30
Impact: Virtual Observatories Changing Science
Scientists: What if you…
- could not only use your data and tools but remote colleagues’ data and tools?
- understood their assumptions, constraints, etc. and could evaluate applicability?
- knew whose research currently (or in the future) would benefit from your results?
- knew whose results were consistent (or inconsistent) with yours?
…
Funders/Managers: What if you…
- could identify how one research effort would support other efforts?
- (and your fundees/employees) could reuse previous results?
- (and your fundees/employees) could really interoperate?
CS: What if you…
- could apply your techniques across very large distributed teams of people with related but different apps?
- could compare your techniques with colleagues trying to solve similar problems?
December 14, 2006 Deborah L. McGuinness 31
Conclusion
Semantic Web languages and tools are evolving and currently are enabling next generation collaboration and applications
Some examples here include support for
- explainable question answering systems
- semantic integration of information
- trustable applications
- usable, integrated, explainable virtual observatories
For more info on talk topics:
- Inference Web - iw.stanford.edu (OWL - www.w3.org/TR/owl-features/ )
- Virtual Solar Terrestrial Observatory - www.vsto.org
- AGU Session on Earth and Space Science Cyberinfrastructure www.agu.org/meetings/fm06/?content=search&show=detail&sessid=392
- AGU Town Hall on Cyberinfrastructure http://www.agu.org/meetings/fm06/
December 14, 2006 Deborah L. McGuinness 32
Virtual Solar Terrestrial Observatory (VSTO)
• a distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental, and model databases.
• subject matter covers the fields of solar, solar-terrestrial and space physics
• it provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use
• 3 year NSF-funded project just beginning the second year
December 14, 2006 Deborah L. McGuinness 33
Virtual Observatories in Practice
Make data and tools quickly and easily accessible to a wide audience.
Operationally, virtual observatories need to find the right balance of data/model holdings, portals, and client software that a researcher can use without effort or interference, as if all the materials were available on his/her local computer, using the user’s preferred language.
They are likely to provide controlled vocabularies that may be used for interoperation in appropriate domains along with database interfaces for access and storage and “smart” search functions and tools for evolution and maintenance.