Hello!. Ajit Kembhavi IUCAA Pune, India Virtual Observatories.

Hello!

Ajit Kembhavi

IUCAA

Pune, India

Virtual Observatories

The Astronomer, Vermeer (1632-1675)

Data Storage and Retrieval

The Library of Alexandria 3rd Century BC

The Data Avalanche

Immense amounts of data are being produced by large telescopes using large area detectors.

Terabytes of data are now available, and Petabytes will soon be available from frequent all sky imaging.

Vast databases are also being produced through simulations.

Astronomical Data Explosion

P. Quinn

~ 100 Gb/night

Data Explosion

Peter Quinn

Wavelength Coverage

The data spans the electromagnetic spectrum from the radio to the gamma-ray region.

Obtaining, analysing and interpreting the data in different wavebands involves highly specialised instruments and techniques.

The astronomer needs new tools for using this wealth of data in multiwavelength studies.

Stars in the Milky Way

The Hertzsprung-Russell Diagram

The Alliance

Members of the IVOA

Interactions

Virtual Observatories

• Provide tools for data analysis, visualization and mining.

• Develop interoperability concepts to make different databases seamless.

• Manage vast data resources and provide these on-line to astronomers and other users.

• Empower astronomers by providing sophisticated query and computational tools, and computing grids for producing new science.

IVOA Technology Initiatives

The IVOA has identified six major technical initiatives to fulfill the scientific goal of the VO concept.

IVOA-LISTS

REGISTRIES: These collect metadata about data resources and information services into a queryable database. The registry is distributed. A variety of industry standards are being investigated.

DATA MODELS: This initiative aims to define the common elements of astronomical data structures and to provide a framework to describe their relationships.

UNIFORM CONTENT DESCRIPTORS: These will provide the common language for metadata definitions for the VO.

DATA ACCESS LAYER: This provides standardized access mechanisms to distributed data objects. Initial prototypes are a Cone Search Protocol and a Simple Image Access Protocol.
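As a rough illustration of the Data Access Layer idea, a cone search request can be thought of as an HTTP GET carrying a sky position and a search radius. The sketch below builds such a request URL. The parameter names RA, DEC and SR (all in decimal degrees) follow the Cone Search convention; the base URL, the function name and the coordinates are placeholders invented for this example, not a real service.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a cone search request URL (sketch; base_url is hypothetical)."""
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("RA must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("Dec must be in [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    # RA, DEC and SR are the cone centre and radius in decimal degrees.
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


# Illustrative position and radius; example.org is a placeholder host.
url = cone_search_url("http://example.org/cone", 194.15, -2.41, 0.05)
print(url)
```

A service answering such a request would normally return its matches as a VOTable, which is what lets the same client code talk to many different archives.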

VO QUERY LANGUAGE: This will provide a standard query language which will go beyond the limitations of SQL.

VOTable: This is an XML mark-up standard for astronomical tables.

Science Initiatives

• Many IVOA projects have active Science Working Groups consisting of astronomers from a broad cross-section of the community representing all wavelengths.

• The focus here is to develop a clear perception of the scientific requirements of a VO.

• Projects within the working groups will develop new capabilities for VO based analysis.

• This will enable the community to create new research programs and to publish their data and research in a more pervasive and scientifically useful manner.

A collaboration between IUCAA and PSPL, with a grant from the Ministry of Communications and Information Technology

Virtual Observatory - India

IUCAA

Persistent Systems Pvt. Ltd., Pune

Virtual Observatory - India

Data Archives and Mirrors at VO-I

SDSS

2MASS

2dFGRS

2QZ

FIRST

NVSS

Vizier, Aladin, ADS

Chandra

Fast Computing

Four AlphaServer ES-45 nodes, each with 4 processors and 8 GB RAM

Fast, low-latency interconnect: Memory Channel Architecture

TruCluster clustering environment (Tru64 UNIX, DecMPI, OpenMP)

VO-India Software Projects

• VOPlot: visualizer for catalogue data

• VOTable C++ Parser

• VOTable Streaming Writer

• Data Converters

• FITS Browser

• User interfaces and query tools

• Applications beyond astronomy

All tools have web-based and stand alone versions

The VOPlot Collaboration

Visualization and simple statistics of catalogue data. Integration with sky atlases.

The VOPlot Tool

VOPlot

A VO-I + CDS collaboration

First conceived as a web-based tool for Vizier

Then integrated with Aladin

VOPlot is now also a stand alone system

It has been integrated with many databases

Sonali Kale, K.D. Balaji et al.

Colour-magnitude diagram

parallax

Catalog Data Interface Tool

A tool to query catalog data.

Simple, customizable, graphic interface.

Not specific to type of data or catalogue.

SQL queries for expert users.

VO tools available for analysis:

VOPlot, Aladin, VOStat, SIMBAD, NED...
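To give a feel for the "SQL queries for expert users" mode, here is a minimal sketch that runs a box-plus-magnitude selection against a small in-memory table. The table name, column names and the three rows are invented for illustration, and SQLite stands in for whatever database server the tool actually connects to.

```python
import sqlite3

# Hypothetical catalogue table; names and values are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalogue (name TEXT, ra_deg REAL, dec_deg REAL, mag REAL)"
)
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?, ?)",
    [
        ("obj_a", 10.0, -5.0, 14.2),
        ("obj_b", 10.5, -4.8, 17.9),
        ("obj_c", 200.0, 30.0, 12.1),
    ],
)


def bright_objects_in_box(conn, ra_min, ra_max, dec_min, dec_max, mag_limit):
    """Return objects inside an RA/Dec box that are brighter than mag_limit."""
    sql = """
        SELECT name, ra_deg, dec_deg, mag
        FROM catalogue
        WHERE ra_deg BETWEEN ? AND ?
          AND dec_deg BETWEEN ? AND ?
          AND mag < ?
        ORDER BY mag
    """
    params = (ra_min, ra_max, dec_min, dec_max, mag_limit)
    return conn.execute(sql, params).fetchall()


result = bright_objects_in_box(conn, 9.0, 11.0, -6.0, -4.0, 16.0)
print(result)
```

The query uses bound parameters rather than string formatting, which is also how a form-based interface could safely translate user input into the same SQL.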

Data Organization and Architecture

Browse Server Database

Create Views

On-the-fly GUI

Query using a Form

Query using SQL Directly

Results in VOPlot

Results in Aladin

Himalayan Chandra Telescope Data Archives

SDSS J125637-022452

High proper motion L-subdwarf

Optical spectra of mixed late M and mid L type

Only the third L subdwarf known

Positions 1986-2000

Proper motion

0.617 arcsec / yr
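The quoted proper motion can be turned into a total shift on the sky with one multiplication. The sketch below assumes, for illustration only, that the motion is linear and that the positions span the full 1986-2000 interval; the function name is invented.

```python
def displacement_arcsec(proper_motion_arcsec_per_yr, epoch_start, epoch_end):
    """Total angular shift for a constant proper motion between two epochs."""
    return proper_motion_arcsec_per_yr * (epoch_end - epoch_start)


# 0.617 arcsec/yr over the 14 years between 1986 and 2000.
shift = displacement_arcsec(0.617, 1986.0, 2000.0)
print(f"{shift:.2f} arcsec")  # prints "8.64 arcsec"
```

A shift of several arcseconds is far larger than typical catalogue position errors, which is why comparing archival positions from different epochs picks out such an object.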

Thank You

AVO Prototype Demo

Astrogrid:

Astronomy Catalogue Extractor

AVO: Aladin+SED

VO-India:VOPlot

FITS Manager

View, create and add to FITS files

Convert to other formats

Pallavi Kulkarni

Fits-manager

VOTable Java Streaming Writer

Acts on a data array in memory to convert it to the VOTable form, which is streamed row by row to an output file. Very large VOTables can be written without excessive memory.
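The row-by-row idea can be sketched in a few lines. This is not the VO-India Java writer, only a Python illustration of the same streaming pattern: the header is written once, then each row is serialised and written immediately, so the full XML document is never held in memory. The function name is hypothetical, and every column is declared as a string to keep the example short.

```python
import io
from xml.sax.saxutils import escape, quoteattr


def write_votable_rows(out, field_names, rows):
    """Stream rows to `out` as VOTable TABLEDATA, one <TR> at a time."""
    out.write('<?xml version="1.0"?>\n<VOTABLE>\n<RESOURCE>\n<TABLE>\n')
    for name in field_names:
        # Simplification: all columns are declared as variable-length strings.
        out.write(f'<FIELD name={quoteattr(name)} datatype="char" arraysize="*"/>\n')
    out.write("<DATA>\n<TABLEDATA>\n")
    written = 0
    for row in rows:  # rows may be a list or a lazy generator
        cells = "".join(f"<TD>{escape(str(value))}</TD>" for value in row)
        out.write(f"<TR>{cells}</TR>\n")
        written += 1
    out.write("</TABLEDATA>\n</DATA>\n</TABLE>\n</RESOURCE>\n</VOTABLE>\n")
    return written


# Usage: the generator yields rows on demand; StringIO stands in for a file.
buffer = io.StringIO()
source = ((f"obj{i}", 10 + i) for i in range(3))
count = write_votable_rows(buffer, ["name", "mag"], source)
print(count)
```

Because only one row is formatted at a time, the memory needed for the XML output stays roughly constant however many rows are written.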

Pallavi Kulkarni

VOTable-Java

VOTable

• This is a new data exchange standard produced through efforts led by Francois Ochsenbein of CDS, Strasbourg and Roy Williams of Caltech.

• VOTable is in XML format. Physical quantities come with sophisticated semantic information.

VOTable

• The format enables computers to easily parse the information and communicate it to other computers.

• Federation and joining of information become possible and Grid computing is easier.

• VOTable parsers have been developed in Perl, Java and C++.

• Enhancements and extensions are being considered.

Streaming Parser Non-streaming Parser

VOTable Data

The data part in a VOTable may be represented using one of three different formats:

– FITS : VOTable can be used either to encapsulate FITS files, or to re-encode the metadata.

– BINARY : Supported for efficiency and ease of programming: no FITS library is required, and the streaming paradigm is supported.

– TABLEDATA : Pure XML format for small tables.
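For the TABLEDATA case, a small hand-written document shows how metadata (FIELD elements with names and units) and data (TR/TD cells) sit together. The sketch below reads it with Python's standard XML library. The sample table and its values are invented, and a real VOTable may also carry an XML namespace and extra attributes that this minimal reader ignores.

```python
import xml.etree.ElementTree as ET

# Invented sample; real files may declare a namespace and more metadata.
SAMPLE = """<?xml version="1.0"?>
<VOTABLE>
  <RESOURCE>
    <TABLE name="demo">
      <FIELD name="RA" datatype="double" unit="deg"/>
      <FIELD name="DEC" datatype="double" unit="deg"/>
      <FIELD name="Vmag" datatype="float" unit="mag"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.5</TD><TD>-4.8</TD><TD>17.9</TD></TR>
          <TR><TD>200.0</TD><TD>30.0</TD><TD>12.1</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def read_tabledata(xml_text):
    """Return (field names, list of row dicts) for the first TABLE."""
    table = ET.fromstring(xml_text).find("./RESOURCE/TABLE")
    names = [f.get("name") for f in table.findall("FIELD")]
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        # All three sample columns are numeric, so float() is safe here.
        values = [float(td.text) for td in tr.findall("TD")]
        rows.append(dict(zip(names, values)))
    return names, rows


names, rows = read_tabledata(SAMPLE)
print(names, rows)
```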

C++ VOTable Parser

Motivation:

– Provide a library for API based access to VOTable files.

– APIs can be directly used to develop VOTable applications without having to do raw VOTable processing.

– Streaming and Non-streaming versions are available.

Sonali Kale, Sudip Khanna

C++ VOTable Parser

Salient Features:

• Implemented as a wrapper over XALAN-C++.

• XALAN-C++ is a robust implementation of the W3C recommendations for XSL Transformations (XSLT) and the XML Path language (XPath).

• XPath queries can be used to access the VOTable data.
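To show what XPath access to a VOTable looks like in practice, the sketch below uses Python's ElementTree, which supports only a subset of XPath 1.0, as a stand-in for XALAN-C++. It is not the C++ parser's API; the tiny document and its values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented two-column table, kept on few lines for brevity.
doc = ET.fromstring(
    "<VOTABLE><RESOURCE><TABLE>"
    "<FIELD name='RA' unit='deg'/><FIELD name='Vmag' unit='mag'/>"
    "<DATA><TABLEDATA>"
    "<TR><TD>10.5</TD><TD>17.9</TD></TR>"
    "<TR><TD>200.0</TD><TD>12.1</TD></TR>"
    "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>"
)

# Metadata lookup: the FIELD element whose name attribute is 'Vmag'.
vmag_field = doc.find(".//FIELD[@name='Vmag']")

# Data lookup: the second <TD> of every row (XPath positions are 1-based).
vmag_values = [float(td.text) for td in doc.findall(".//TR/TD[2]")]

print(vmag_field.get("unit"), vmag_values)
```

The attraction of this style is that a caller can address any element or attribute declaratively, without writing a traversal loop for each kind of lookup.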

Project Design

[Class diagram with nodes: VTable, Metadata, Field Collection, Field, Link Collection, Link, Values (minimum, maximum), Option Collection, Options, Table Data, Row Collection, Row, Column Collection]

IUCAA HPC Facility Hercules

• Four AlphaServer ES-45 machines

• Each with 4 Alpha (21264C) processors

• 1.25 GHz clock speed

• Cache on chip: 64 KB I, 64 KB D

• Cache: 16 MB ECC DDR

• RAM: 3 x 8 GB + 12 GB

• Fast, low-latency interconnect

• Memory Channel Architecture (MCA)

• High-volume storage

• 1 Terabyte SCSI

• TruCluster clustering environment (Tru64 UNIX, DecMPI, OpenMP)

ES-45 SPECfp2000: 1327

Linpack 1000x1000: 6847

Co-proposed by :

Ajit Kembhavi

T. Padmanabhan

Tarun Souradeep

HPC Team :

Sarah Ponthratnam

Sunu Engineer

Rajesh Nayak

Anand Sengupta

> 30 Gflops

Preliminary HPL benchmark

Virtual Observatory - India

IUCAA

Persistent Systems

NVO-People

Caltech, Fermilab, JHU, NASA/HEASARC, Microsoft, NCSA/UIUC, NOAO, NRAO, Raytheon ITS, SDSC/UCSD, SAO/CXC, STScI, UPenn, UPitts/CMU, UWis, USC, USNO, USRA, CVO

Ajit Kembhavi

Inter-University Centre for Astronomy and Astrophysics

Pune, India

Virtual Observatory - India

Terapix

Jodrell Bank

Registry and DIS

High Volume Storage

RAID 5, 4 Terabytes

CVO Collaborations

• There are three major projects at the CVO involving collaborations with other VOs.

• CVO is collaborating with the German Astrophysical VO to incorporate ROSAT X-ray data and catalogues into the CVO system.

• CVO is collaborating with the Australian VO to incorporate 2QZ and 2dF galaxy spectra into the CVO database.

• CVO is an associate member of NVO and has put in place some components of the NVO galaxy morphology demo.

Australian-VO Collaborations

• The distributed volume renderer (dvr) software is a tool for rendering large volumetric data sets using the combined memory and processing resources of Beowulf-like clusters.

• A collaboration between the Melbourne site of Aus-VO and AstroGrid aims to develop the existing dvr software into a grid-based volume rendering service.

• Users will be able to select FITS-format cubes from a number of "Data Centres", have the data transferred to a chosen rendering cluster, and then proceed to visualise the volume of data remotely (See Demo).

C++ VOTable Parser

• Initial version

- Released on May 31st, 2002.

- Support only for reading of tables.

- Support only for pure-XML TABLEDATA and not for BINARY or FITS data streams.

- Runs on Windows NT 4.0, Windows 2000 and RedHat Linux 7.1.

• Future enhancements

- Can be incorporated quickly and efficiently.

Parser Design

Class Details

• VTable: In-memory representation of a single <TABLE> from the <RESOURCE> element in VOTable

• TableMetaData: Contains metadata (Fields, Links and Description)

• Resource: Represents the <RESOURCE> element in the VOTable

• TableData: Contains Rows

• Field: Representation of <FIELD> from VOTable

• Row: Representation of <TR> from VOTable

• Column: Representation of <TD> from VOTable

Parser Design

API – Typical Operations

• File Level I/O Routines

– Open VOTable file

– Close VOTable file

• Table I/O Operations

– Get number of rows

– Get number of columns

– Get column (field) information (column name, column number, etc.)

– Access table data
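The class list and the typical table operations above can be mirrored in a compact in-memory model. The sketch below is a Python illustration of that design, not the C++ library itself: the class names follow the slides, while the attribute and method names are assumptions made for this example, and Links and file I/O are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:  # one <FIELD>: column metadata
    name: str
    datatype: str = "char"
    unit: str = ""


@dataclass
class Column:  # one <TD> cell
    value: str


@dataclass
class Row:  # one <TR>
    columns: List[Column] = field(default_factory=list)


@dataclass
class TableMetaData:  # description and fields (links omitted here)
    description: str = ""
    fields: List[Field] = field(default_factory=list)


@dataclass
class TableData:
    rows: List[Row] = field(default_factory=list)


@dataclass
class VTable:  # one <TABLE> inside a <RESOURCE>
    metadata: TableMetaData
    data: TableData

    def num_rows(self):
        return len(self.data.rows)

    def num_columns(self):
        return len(self.metadata.fields)

    def column_info(self, index):
        return self.metadata.fields[index]

    def cell(self, row_index, column_index):
        return self.data.rows[row_index].columns[column_index].value


# Usage with invented values.
table = VTable(
    TableMetaData("demo", [Field("RA", "double", "deg"), Field("Vmag", "float", "mag")]),
    TableData([Row([Column("10.5"), Column("17.9")])]),
)
print(table.num_rows(), table.num_columns(), table.column_info(1).name, table.cell(0, 1))
```

Keeping metadata and data in separate objects means an application can inspect column names and units before deciding whether to touch the rows at all.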

Parser Implementation

• Development on Windows NT 4.0 platform using VC++. Ported to RedHat Linux 7.1/gcc-2.96 with zero effort.

• 18 C++ classes representing various elements of the VOTable format.

• 8500 lines of C++ code written for V1.1 release

• Project start date: April 7th 2002

• V1.1 Release: May 31st 2002

• Current status: V1.2 design in progress

What is in Release V1.1

Parser to serve as a building block for developing VOTable based applications.

Can be easily used by users of CFITSIO library.

Supports powerful XPath queries against VOTable files.

The first version of the VOTable parser can now be downloaded:

http://vo.iucaa.ernet.in/~voi/html/infopage.html

VOTable Parser Demo

Serves as a tutorial to help understand the basic APIs provided by the VOTable parser.

Demonstrates how to access the data and metadata elements of a VOTable file.

Future Work

• Develop APIs for writing data in VOTable format.

• Develop APIs for supporting IMAGE data and FITS files in VOTable.

• Enhance existing API set to allow more elaborate and flexible operations on VOTable files.

• Support future VOTable versions.

• Develop applications for conversion between FITS and VOTable formats.

References

• The first version of the C++ parser can now be downloaded from the VO-India website: http://vo.iucaa.ernet.in/~voi

• VOTable Details: http://vizier.u-strasbg.fr/doc/VOTable/

• XALAN: http://xml.apache.org/xalan-c/index.html

• XPATH: http://www.w3.org/TR/xpath

Virtual Observatory - India

SDSS Data Features

Size : 900 GB

DBMS : Microsoft SQL (MS-SQL)

Data Contains : 1) Spectroscopic data 2) Tiling data

SDSS Query Architecture

[Flow diagram with nodes: User, User Interface Client, Submit Query/Request, Process Query, Search MS-SQL Database, MS-SQL Server, Output]

Output : 1) HTML 2) XML 3) CSV
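The output stage of this query flow, where one result set is returned as HTML, XML or CSV, might look like the following sketch. Everything here is illustrative: the function name, the XML element names and the sample row are assumptions, not the layout produced by the actual SDSS interface.

```python
import csv
import io
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape, quoteattr


def render(columns, rows, fmt):
    """Serialise one result set as 'csv', 'html' or 'xml' (sketch)."""
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(columns)
        writer.writerows(rows)
        return buf.getvalue()
    if fmt == "html":
        head = "".join(f"<th>{html_escape(c)}</th>" for c in columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{html_escape(str(v))}</td>" for v in row) + "</tr>"
            for row in rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"
    if fmt == "xml":
        # Hypothetical element names; a VO service would more likely emit VOTable.
        body = "".join(
            "<row>"
            + "".join(
                f"<col name={quoteattr(c)}>{xml_escape(str(v))}</col>"
                for c, v in zip(columns, row)
            )
            + "</row>"
            for row in rows
        )
        return f"<result>{body}</result>"
    raise ValueError(f"unsupported output format: {fmt}")


# Invented column names and values.
columns = ["plate", "z"]
rows = [(266, 0.021)]
for fmt in ("csv", "html", "xml"):
    print(render(columns, rows, fmt))
```

Separating the query from the rendering step is what allows the same database search to feed a browser, a script or another tool.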

Data Catalogs & Web Services at IUCAA

Catalogs

Catalog       Description
2dFQSO        Size : 4 MB
2dFGRS        Size : 4 GB, organized as mSQL
2MASS         Size : 42 GB
Sky Survey    Size : 13 GB
FIRST         Size : 192 GB

Web Services

1) VizieR Services

The most complete library of astronomical catalogues (e.g. Guide Star catalogues, USNO-B1).

Tools to select, extract and format records matching certain criteria.

2) Anglo-Australian 2dF System

Query tool to select records from the 2dF catalogue. Displays sky map and spectrum (FITS) of objects in the 2dF catalogue.

Star Positions

VO Schema