Post on 24-Dec-2015
The Astronomer Vermeer 1632-1675
Data Storage and Retrieval
The Library of Alexandria 3rd Century BC
The Data Avalanche
Immense amounts of data are being produced by large telescopes using large area detectors.
Terabytes of data are now available, and Petabytes will soon be available from frequent all sky imaging.
Vast databases are also being produced through simulations.
Wavelength Coverage
The data spans the electromagnetic spectrum from the radio to the gamma-ray region.
Obtaining, analysing and interpreting the data in different wavebands involves highly specialised instruments and techniques.
The astronomer needs new tools for using this wealth of data in multiwavelength studies.
Virtual Observatories
• Provide tools for data analysis, visualization and mining.
• Develop interoperability concepts to make different databases seamless.
• Manage vast data resources and provide these on-line to astronomers and other users.
• Empower astronomers by providing sophisticated query and computational tools, and computing grids for producing new science.
IVOA Technology Initiatives
The IVOA has identified six major technical initiatives to fulfill the scientific goal of the VO concept.
IVOA-LISTS
REGISTRIES: These collect metadata about data resources and information services into a queryable database. The registry is distributed. A variety of industry standards are being investigated.
DATA MODELS: This initiative aims to define the common elements of astronomical data structures and to provide a framework to describe their relationships.
UNIFORM CONTENT DESCRIPTORS: These will provide the common language for metadata definitions for the VO.
DATA ACCESS LAYER: This provides standardized access mechanisms to distributed data objects. Initial prototypes are a Cone Search Protocol and a Simple Image Access Protocol.
VO QUERY LANGUAGE: This will provide a standard query language which will go beyond the limitations of SQL.
VOTable: This is an XML mark-up standard for astronomical tables.
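The Cone Search idea named under the Data Access Layer can be illustrated with a minimal Python sketch. It assumes the simple convention of a cone centre (RA, DEC) and a search radius (SR), all in decimal degrees. The service address and the function names are hypothetical and are used only for illustration.

```python
import math
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a query URL carrying the cone centre and radius in degrees."""
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return f"{base_url}?{query}"


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def in_cone(rows, ra_deg, dec_deg, radius_deg):
    """Keep the rows whose position lies within the cone."""
    return [row for row in rows
            if angular_separation_deg(ra_deg, dec_deg, row["ra"], row["dec"])
            <= radius_deg]
```

A real service would apply the cone test on the server side; the in_cone helper only shows what the selection means geometrically.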
Science Initiatives
• Many IVOA projects have active Science Working Groups consisting of astronomers from a broad cross-section of the community representing all wavelengths.
• The focus here is to develop a clear perception of the scientific requirements of a VO.
• Projects within the working groups will develop new capabilities for VO based analysis.
• This will enable the community to create new research programs and to publish their data and research in a more pervasive and scientifically useful manner.
A collaboration between IUCAA and PSPL,
with a grant from the Ministry of Communications and Information Technology
Virtual Observatory - India
Fast Computing
Four alpha server ES-45 nodes, each with 4 processors, each node with 8 GB RAM
Fast, Low latency interconnect Memory Channel Architecture
Trucluster clustering environment (Tru64 Unix, DecMPI, openMP)
VO-India Software Projects
• VOPlot visualizer for catalogue data
• VOTable C++ Parser
• VOTable Streaming Writer
• Data Converters
• FITS Browser
• User interfaces and query tools
• Applications beyond astronomy
All tools have web-based and stand-alone versions
The VOPlot Collaboration
Visualization and simple statistics of catalogue data. Integration with sky atlases.
The VOPlot Tool
VOPlot
A VO-I + CDS collaboration
First conceived as a web-based tool for Vizier
Then integrated with Aladin
VOPlot is now also a stand-alone system
It has been integrated with many databases
Sonali Kale, K.D. Balaji et al.
Catalog Data Interface Tool
A tool to query catalog data.
Simple, customizable, graphic interface.
Not specific to type of data or catalogue.
SQL queries for expert users.
VO tools available for analysis:
VOPlot, Aladin, VOStat, SIMBAD, NED...
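As a rough illustration of the "SQL queries for expert users" option, the sketch below runs a parameterised SQL query against a small in-memory SQLite table that stands in for a catalogue. The table name, column names and sample rows are hypothetical and do not come from the tool itself.

```python
import sqlite3


def build_demo_catalogue():
    """Create an in-memory table standing in for a catalogue (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalogue (name TEXT, ra REAL, dec REAL, mag REAL)")
    conn.executemany(
        "INSERT INTO catalogue VALUES (?, ?, ?, ?)",
        [("obj-a", 10.0, -5.0, 14.2),
         ("obj-b", 11.5, 2.0, 16.8),
         ("obj-c", 12.0, 7.5, 13.1)])
    return conn


def brighter_than(conn, mag_limit):
    """Return (name, mag) for objects brighter than the magnitude limit."""
    cursor = conn.execute(
        "SELECT name, mag FROM catalogue WHERE mag < ? ORDER BY mag",
        (mag_limit,))
    return cursor.fetchall()
```

Passing the limit as a bound parameter rather than pasting it into the query string keeps user-supplied values from being interpreted as SQL.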
SDSS J125637-022452
High proper motion L-subdwarf
Optical spectra of mixed late M and mid L type
Only the third L subdwarf known
FITS Manager
View, create and add to FITS files
Convert to other formats
Pallavi Kulkarni
Fits-manager
VOTable Java Streaming Writer
Acts on a data array in memory to convert it to the VOTable form, which is streamed row by row to an output file. Very large VOTables can be written without excessive memory.
Pallavi Kulkarni
VOTable-Java
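The row-by-row streaming idea can be sketched in a few lines of Python. This is not the Java writer described above. It is a minimal illustration that writes a simplified, namespace-free VOTable with only name and datatype attributes on each FIELD, and the function name write_votable is hypothetical.

```python
import io
from xml.sax.saxutils import escape, quoteattr


def write_votable(out, fields, rows):
    """Write a simplified VOTable to `out`, one <TR> at a time.

    `fields` is a list of (name, datatype) pairs; `rows` is any iterable
    of row tuples. Returns the number of rows written.
    """
    out.write('<?xml version="1.0"?>\n<VOTABLE>\n<RESOURCE>\n<TABLE>\n')
    for name, datatype in fields:
        out.write(f"<FIELD name={quoteattr(name)} "
                  f"datatype={quoteattr(datatype)}/>\n")
    out.write("<DATA>\n<TABLEDATA>\n")
    count = 0
    for row in rows:
        # Only the current row is held as text before being written out.
        cells = "".join(f"<TD>{escape(str(value))}</TD>" for value in row)
        out.write(f"<TR>{cells}</TR>\n")
        count += 1
    out.write("</TABLEDATA>\n</DATA>\n</TABLE>\n</RESOURCE>\n</VOTABLE>\n")
    return count


# Usage: rows come from a generator, so the full table is never built first.
buffer = io.StringIO()
written = write_votable(buffer, [("id", "int")], ((i,) for i in range(3)))
```

Because each row is formatted and written before the next one is read, memory use stays roughly constant however many rows the iterable produces.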
VOTable
• This is a new data exchange standard produced through efforts led by François Ochsenbein of CDS, Strasbourg and Roy Williams of Caltech.
• VOTable is in XML format. Physical quantities come with sophisticated semantic information.
VOTable
• The format enables computers to easily parse the information and communicate it to other computers.
• Federation and joining of information become possible and Grid computing is easier.
• VOTable parsers have been developed in Perl, Java and C++.
• Enhancements and extensions are being considered.
Streaming Parser Non-streaming Parser
VOTable Data
The data part in a VOTable may be represented using one of three different formats:
– FITS : VOTable can be used either to encapsulate FITS files, or to re-encode the metadata.
– BINARY : Supported for efficiency and ease of programming: no FITS library is required, and the streaming paradigm is supported.
– TABLEDATA : Pure XML format for small tables.
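To make the TABLEDATA form concrete, here is a minimal Python sketch that reads a small, hand-written table with the standard library XML parser. The sample document is a simplified, namespace-free VOTable invented for illustration, and it assumes every cell holds a number.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two numeric columns, two rows.
SAMPLE = """<VOTABLE>
 <RESOURCE>
  <TABLE name="demo">
   <FIELD name="RA" datatype="double" unit="deg"/>
   <FIELD name="DEC" datatype="double" unit="deg"/>
   <DATA><TABLEDATA>
    <TR><TD>10.5</TD><TD>-3.25</TD></TR>
    <TR><TD>11.0</TD><TD>4.0</TD></TR>
   </TABLEDATA></DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>"""


def read_tabledata(xml_text):
    """Return the rows of the first table as dicts keyed by FIELD name."""
    root = ET.fromstring(xml_text)
    table = root.find("./RESOURCE/TABLE")
    names = [field.get("name") for field in table.findall("FIELD")]
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        values = [float(td.text) for td in tr.findall("TD")]
        rows.append(dict(zip(names, values)))
    return rows
```

The FIELD elements carry the metadata (name, datatype, unit), while each TR holds one row and each TD one cell, in the same order as the fields.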
C++ VOTable Parser
Motivation:
– Provide a library for API based access to VOTable files.
– APIs can be directly used to develop VOTable applications without having to do raw VOTable processing.
– Streaming and Non-streaming versions are available.
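The difference between streaming and non-streaming parsing can be sketched in Python, which is not the C++ library itself. The non-streaming version would load the whole document into memory, whereas the streaming sketch below hands back one row at a time. The sample bytes and the function name iter_rows are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified VOTable fragment used only for this sketch.
SAMPLE_BYTES = (b"<VOTABLE><RESOURCE><TABLE><DATA><TABLEDATA>"
                b"<TR><TD>1</TD><TD>2</TD></TR>"
                b"<TR><TD>3</TD><TD>4</TD></TR>"
                b"</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>")


def iter_rows(stream):
    """Yield each table row as a list of cell strings while parsing."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "TR":
            yield [td.text for td in elem.findall("TD")]
            # Drop the cells of a row once the caller has consumed it.
            elem.clear()
```

A caller can loop over iter_rows(io.BytesIO(SAMPLE_BYTES)) or over an open file, and process each row before the next one is parsed.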
Sonali Kale, Sudip Khanna
C++ VOTable Parser
Salient Features:
• Implemented as a wrapper over XALAN-C++.
• XALAN-C++ is a robust implementation of the W3C recommendations for XSL Transformations (XSLT) and the XML Path language (XPath).
• XPath queries can be used to access the VOTable data.
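The kind of XPath access mentioned above can be sketched with Python's standard library, which supports only a limited subset of XPath and is used here as a stand-in for XALAN-C++. The sample table and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified VOTable used only for this sketch.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
  <FIELD name="RA" unit="deg"/>
  <FIELD name="DEC" unit="deg"/>
  <FIELD name="MAG" unit="mag"/>
  <DATA><TABLEDATA>
   <TR><TD>10.5</TD><TD>-3.25</TD><TD>14.2</TD></TR>
   <TR><TD>11.0</TD><TD>4.0</TD><TD>15.1</TD></TR>
  </TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""


def fields_with_unit(root, unit):
    """Names of the fields whose unit attribute matches, via a predicate."""
    return [f.get("name") for f in root.findall(f".//FIELD[@unit='{unit}']")]


def column_values(root, field_name):
    """Cell text of one column, selected by a positional path expression."""
    names = [f.get("name") for f in root.findall(".//TABLE/FIELD")]
    position = names.index(field_name) + 1  # XPath positions start at 1
    return [td.text for td in root.findall(f".//TABLEDATA/TR/TD[{position}]")]


root = ET.fromstring(SAMPLE)
```

For example, fields_with_unit(root, "deg") selects metadata by attribute, and column_values(root, "MAG") pulls one column out of every row with a single path expression.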
Project Design
[Figure: class diagram of the parser design. Labels: VTable, Metadata, Field, Field Collection, Link, Link Collection, Values (minimum, maximum), Options, Option Collection, Table Data, Row, Row Collection, Column Collection]
IUCAA HPC Facility Hercules
• Four Alpha Server ES-45 machines
• Each with 4 Alpha (21264C) processors
• 1.25 GHz clock speed
• Cache on chip: 64 KB I, 64 KB D
• Cache: 16 MB ECC DDR
• RAM: 3 x 8 GB + 12 GB
• Fast, low latency interconnect
• Memory Channel Architecture (MCA)
• High volume storage
• 1 terabyte SCSI
• Trucluster clustering environment (Tru64 Unix, DecMPI, openMP)
ES-45 SPECfp2000: 1327
Linpack 1000x1000: 6847
Co-proposed by :
Ajit Kembhavi
T. Padmanabhan
Tarun Souradeep
HPC Team :
Sarah Ponthratnam
Sunu Engineer
Rajesh Nayak
Anand Sengupta
> 30 Gflops
Preliminary HPL benchmark
NVO-People
Caltech, Fermilab, JHU, NASA/HEASARC, Microsoft, NCSA/UIUC, NOAO, NRAO, Raytheon ITS, SDSC/UCSD, SAO/CXC, STScI, UPenn, UPitts/CMU, UWis, USC, USNO, USRA, CVO
Ajit Kembhavi
Inter-University Centre for Astronomy and Astrophysics
Pune, India
Virtual Observatory - India
CVO Collaborations
• There are three major projects at the CVO involving collaborations with other VOs.
• CVO is collaborating with the German Astrophysical VO to incorporate ROSAT X-ray data and catalogues into the CVO system.
• CVO is collaborating with the Australian VO to incorporate 2QZ and 2dF galaxy spectra into the CVO database.
• CVO is an associate member of NVO and has put in place some components of the NVO galaxy morphology demo.
Australian-VO Collaborations
• The distributed volume renderer (dvr) software is a tool for rendering large volumetric data sets using the combined memory and processing resources of Beowulf-like clusters.
• A collaboration between the Melbourne site of Aus-VO and AstroGrid aims to develop the existing dvr software into a grid-based volume rendering service.
• Users will be able to select FITS-format cubes from a number of "Data Centres", have the data transferred to a chosen rendering cluster, and then proceed to visualise the volume of data remotely (See Demo).
C++ VOTable Parser
• Initial version
- Released on May 31st, 2002.
- Support only for reading of tables.
- Support only for pure-XML TABLEDATA and not for BINARY or FITS data streams.
- Runs on Windows NT 4.0, Windows 2000 and RedHat Linux 7.1.
• Future enhancements
- Can be incorporated quickly and efficiently.
Parser Design
Class Details
• VTable: In-memory representation of a single <TABLE> from the <RESOURCE> element in VOTable
• TableMetaData: Contains MetaData (Fields, Links and Description)
• Resource: Represents the <RESOURCE> element in the VOTable
• TableData: Contains Rows
• Field: Representation of <FIELD> from VOTable
• Row: Representation of <TR> from VOTable
• Column: Representation of <TD> from VOTable
Parser Design
API – Typical Operations
• File Level I/O Routines
– Open VOTable file
– Close VOTable file
• Table I/O Operations
– Get number of rows
– Get number of columns
– Get column (field) information (column name, column number, etc.)
– Access table data
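The typical operations listed above can be mirrored in a short Python sketch. It is not the C++ API. The class name VOTableFile, its method names and the sample table are all hypothetical, chosen only to show the shape of file-level and table-level calls.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified VOTable used only for this sketch.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
  <FIELD name="RA" datatype="double"/>
  <FIELD name="DEC" datatype="double"/>
  <DATA><TABLEDATA>
   <TR><TD>10.5</TD><TD>-3.25</TD></TR>
   <TR><TD>11.0</TD><TD>4.0</TD></TR>
  </TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""


class VOTableFile:
    """Illustrative reader for the first table of a VOTable document."""

    def __init__(self, source):
        self._source = source  # a file path or a file-like object
        self._table = None

    def open(self):
        root = ET.parse(self._source).getroot()
        self._table = root.find("./RESOURCE/TABLE")

    def close(self):
        self._table = None

    def num_rows(self):
        return len(self._table.findall("./DATA/TABLEDATA/TR"))

    def num_columns(self):
        return len(self._table.findall("FIELD"))

    def field_info(self, column):
        """Column number (0-based), name and datatype of one field."""
        field = self._table.findall("FIELD")[column]
        return {"number": column, "name": field.get("name"),
                "datatype": field.get("datatype")}

    def cell(self, row, column):
        """Text of one cell, addressed by 0-based row and column."""
        tr = self._table.findall("./DATA/TABLEDATA/TR")[row]
        return tr.findall("TD")[column].text
```

A caller would create VOTableFile(io.StringIO(SAMPLE)) or pass a file path, call open(), query the table, and call close() when done.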
Parser Implementation
• Development on Windows NT 4.0 platform using VC++. Ported to RedHat Linux 7.1/gcc-2.96 with zero effort.
• 18 C++ classes representing various elements of the VOTable format.
• 8500 lines of C++ code written for V1.1 release
• Project start date: April 7th 2002
• V1.1 Release: May 31st 2002
• Current status: V1.2 design in progress
What is in Release V1.1
Parser to serve as a building block for developing VOTable based applications.
Can be easily used by users of CFITSIO library.
Supports powerful XPath queries against VOTable files.
The first version of the VOTable parser can now be downloaded:
http://vo.iucaa.ernet.in/~voi/html/infopage.html
VOTable Parser Demo
Serves as a tutorial to help understand the basic APIs provided by the VOTable parser.
Demonstrates how to access the data and metadata elements of a VOTable file.
Future Work
• Develop APIs for writing data in VOTable format.
• Develop APIs for supporting IMAGE data and FITS files in VOTable.
• Enhance existing API set to allow more elaborate and flexible operations on VOTable files.
• Support future VOTable versions.
• Develop applications for conversion between FITS and VOTable formats.
References
• The first version of the C++ parser can now be downloaded from the VO-India website: http://vo.iucaa.ernet.in/~voi
• VOTable details: http://vizier.u-strasbg.fr/doc/VOTable/
• XALAN: http://xml.apache.org/xalan-c/index.html
• XPath: http://www.w3.org/TR/xpath
SDSS Data Features
Size : 900 GB
DBMS : Microsoft SQL (MS-SQL)
Data Contains : 1) Spectroscopic data 2) Tiling data
SDSS Query Architecture
[Diagram labels: User; User Interface Client; Submit Query/Request; Process Query; Search MS-SQL Database; MS-SQL Server; Output]
Output : 1) HTML 2) XML 3) CSV
Data Catalogs & Web Services at IUCAA
Catalogs
Catalog       Description
2dfQSO        Size : 4 MB
2dfGRS        Size : 4 GB, organized as mSQL
2MASS         Size : 42 GB
Sky Survey    Size : 13 GB
FIRST         Size : 192 GB
Web Services
1) VizieR Services
The most complete library of astronomical catalogues (e.g. Guide Star catalogues, USNO-B1).
Tools to select, extract and format records matching certain criteria.
2) Anglo-Australian 2dF System
Query tool to select records from the 2dF catalogue. Displays sky map and spectrum (FITS) of objects in the 2dF catalogue.