Grid Astronomy with Image Federation
Roy Williams
Michael Feldmann
California Institute of Technology
This Presentation
Why (Roy Williams)
– Image Federation
– Standard protocols, standard data types
– Grid Services: e.g. Simple Image Access (SIA)
– Image Mosaicking with Atlasmaker
How (Mike Feldmann)
– Condor
– Queues
– SRB
– ????
1. NVO Image Service: Federation Needs Standards
NVO Image Protocol: SIAP
• Specify box by position and size
• SIAP server returns relevant images
• Footprint
• Logical Name
• URL
Can choose:
standard URL: http://.......
SRB URL: srb://nvo.npaci.edu/…..
Simple Image Access Service
• Query is sky region
• May query on image type, image geometry
• Response is VOTable of images
• Each has WCS (geometry) parameters
• Plus a URL to fetch the image
• Designed for
• Set of pointed observations (e.g. Hubble)
• Wide-area survey (e.g. Sloan)
• Image service
– Mosaicking
– Reprojection
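A minimal sketch of the query side of Simple Image Access: the sky region is sent as query parameters, and the VOTable comes back from the server. POS, SIZE and FORMAT are the parameter names of the SIA 1.0 protocol; the service address and the helper name siap_query_url are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def siap_query_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Build a SIAP-style query URL for a box centred on (ra, dec), in degrees."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",   # box centre: right ascension, declination
        "SIZE": f"{size_deg}",          # box size in degrees
        "FORMAT": image_format,         # requested image type
    }
    separator = "&" if "?" in base_url else "?"
    # Keep ',' and '/' readable instead of percent-encoding them.
    return base_url + separator + urlencode(params, safe=",/")


# Hypothetical service address, used only to show the shape of the request.
url = siap_query_url("http://example.org/siap", 83.633, 22.014, 0.25)
print(url)
```

Fetching that URL from a real SIAP service would return the VOTable described above, one row per image, each with its WCS parameters and access URL.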
2. Atlasmaker: Grid-based Image Federation
Image Federation
Multispectral Imagery
Crab Nebula. 3 channels: X-ray in blue, optical in green, and radio in red.
Moffett Field, California. 224 channels from 400 nm to 2500 nm
Image Federation: detection
Stacking allows detection of faint sources. A 1-sigma detection in each of many bands becomes a 3-sigma detection.
Images of the same galaxy taken several days apart are automatically subtracted from one another, and remaining bright spots may be supernova candidates. (NEAT project)
Image subtraction allows detection of narrow-line features that are not also wide-band (eg Hα but not R-band)
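The stacking claim can be checked with a short calculation. This sketch assumes the images have independent noise of equal size, so that signal-to-noise grows as the square root of the number of images stacked; under that assumption "many" works out to nine. The function names are illustrative only.

```python
import math


def stacked_snr(single_snr, n_images):
    """Signal-to-noise after co-adding n images with equal, independent noise."""
    return single_snr * math.sqrt(n_images)


def images_needed(single_snr, target_snr):
    """Smallest number of images whose stack reaches the target signal-to-noise."""
    return math.ceil((target_snr / single_snr) ** 2)


print(stacked_snr(1.0, 9))      # 1-sigma in each of 9 bands
print(images_needed(1.0, 3.0))  # bands needed for a 3-sigma detection
```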
Multi-Wavelength Image Morphology
DPOSS-2MASS Image Mosaics
Panels: J, F, N (DPOSS) and J, H, K (2MASS)
Galaxy identification, galaxy clusters
Pattern matching with shape AND color
Galaxy cluster in Xray, DPOSS, Sloan
This is M77 & has Xray and Radio too
Coma
Virtual Sky
http://virtualsky.org
Virtual Sky: Image Federation
Xray (ROSAT) theme
Change scale
Change theme
http://virtualsky.org/
from Caltech CACR, Caltech Astronomy, Microsoft Research
Optical (DPOSS)
Coma cluster
Virtual Sky has 14,000,000 tiles
140 Gbyte
Image Federation
Virtual Sky
Roy Williams, Caltech CACR
Alex Szalay, Johns Hopkins University
Ashish Mahabal, Caltech Astronomy
Jim Gray, Microsoft Research
George Djorgovski, Caltech Astronomy
Julian Bunn, Caltech CACR
Robert Brunner, Caltech Astronomy
Multichannel Images
L is 23 cm wavelength
C is 10 cm wavelength
H is horizontal polarization
V is vertical polarization
A color image is 3 channels
Principal components – information concentration
Principal Components
SDSS (5 channel) SDSS+2MASS (8 channel)
Mosaicking and Federation
Every astronomical image has a different projection
• different pointing of the telescope
• We want to mosaic different images
• We want to federate different information
Compute intensive: flux in each pixel is carefully distributed into a new pixel grid
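To show what "distributing flux into a new pixel grid" involves, here is a one-dimensional sketch. It is a simplified stand-in, not Montage's actual algorithm: each input pixel's flux is shared among the output pixels it overlaps, in proportion to the overlap length, so the total flux is conserved wherever the output grid covers the input. All names are illustrative.

```python
def regrid_flux(in_edges, in_flux, out_edges):
    """Redistribute per-pixel flux from one 1-D pixel grid onto another.

    in_edges / out_edges are increasing lists of pixel boundaries;
    in_flux[i] is the flux between in_edges[i] and in_edges[i + 1].
    """
    out_flux = [0.0] * (len(out_edges) - 1)
    for i, flux in enumerate(in_flux):
        lo, hi = in_edges[i], in_edges[i + 1]
        width = hi - lo
        for j in range(len(out_flux)):
            # Length of the intersection of input pixel i and output pixel j.
            overlap = min(hi, out_edges[j + 1]) - max(lo, out_edges[j])
            if overlap > 0:
                out_flux[j] += flux * overlap / width
    return out_flux


in_edges = [0.0, 1.0, 2.0, 3.0]
in_flux = [10.0, 20.0, 30.0]
out_edges = [0.0, 1.5, 3.0]   # coarser grid covering the same interval
out_flux = regrid_flux(in_edges, in_flux, out_edges)
print(out_flux, sum(out_flux))
```

The real problem is two-dimensional and on the sphere, which is why every pixel of every image costs real computation.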
Mosaicking
Federation
Infrared map
Xray map today
Xray map last year
Atlasmaker
Uses Montage, Yoursky
[Diagram labels: Data, Chart, Project, Estimate & correct Background, Co-Add]
[Image: David Hockney, Pearblossom Highway, 1986]
Background Correction
Uncorrected Corrected
Montage Background Correction
Project pixels to output chart
Fit ramps on overlap regions
Fit ramps on projected images
Subtract from Pixel values
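The ramp-fitting idea can be sketched in one dimension. This is an illustration under simplifying assumptions, not Montage's implementation: two images have already been projected onto the same output pixels, a straight-line ramp is fitted by least squares to their difference on the overlap, and the ramp is subtracted from the second image. All names are hypothetical.

```python
def fit_ramp(xs, ys):
    """Least-squares straight line y = intercept + slope * x (needs 2+ distinct xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def match_background(reference, image):
    """Subtract from `image` the ramp that best fits (image - reference) on the overlap.

    Both inputs are lists on the same output grid, with None where there is no data.
    """
    overlap = [i for i, (r, v) in enumerate(zip(reference, image))
               if r is not None and v is not None]
    diffs = [image[i] - reference[i] for i in overlap]
    intercept, slope = fit_ramp(overlap, diffs)
    return [None if v is None else v - (intercept + slope * i)
            for i, v in enumerate(image)]


# True sky is flat at 5.0; the second image carries a background ramp 2.0 + 0.5 * x.
reference = [5.0] * 7 + [None] * 3
image = [None] * 4 + [5.0 + 2.0 + 0.5 * i for i in range(4, 10)]
corrected = match_background(reference, image)
print(corrected)
```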
Images and Charts
Image
• Big data
Chart
• Map: sphere → plane
• FITS-WCS header
• small data
An atlas is a collection of charts
Hyperatlas is an attempt to standardize atlases
Hyperatlas
Standard naming for atlases and vcharts
TM-5-SIN-20
Vchart TM-5-SIN-20-1589
Standard Scales: scale s means 2^(20-s) arcseconds per pixel
Standard Projections
• SIN projection
• TAN projection
Standard Layouts
• TM-5 layout
• HV-4 layout
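A small sketch of how the standard names could be taken apart. It rests on two assumptions that are not spelled out on the slide: that a name reads layout, projection, scale, then an optional page number, and that the scale rule is 2^(20-s) arcseconds per pixel (the exponent appears to have been flattened in the slide text). The function names are hypothetical.

```python
def parse_hyperatlas_name(name):
    """Split a name such as 'TM-5-SIN-20' or 'TM-5-SIN-20-1589' into its fields."""
    parts = name.split("-")
    if len(parts) not in (4, 5):
        raise ValueError(f"unexpected name: {name!r}")
    return {
        "layout": "-".join(parts[0:2]),      # e.g. TM-5 or HV-4
        "projection": parts[2],              # e.g. SIN or TAN
        "scale": int(parts[3]),
        "page": int(parts[4]) if len(parts) == 5 else None,
    }


def arcsec_per_pixel(scale):
    """Pixel size for a standard scale, assuming 2**(20 - scale) arcseconds."""
    return 2.0 ** (20 - scale)


vchart = parse_hyperatlas_name("TM-5-SIN-20-1589")
print(vchart, arcsec_per_pixel(vchart["scale"]))
```

Under this reading, scale 20 is one arcsecond per pixel and each step down in scale doubles the pixel size.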
Charts and Pages
1. Chart – a metadata entity specifying a map from sphere to plane
2. Page – a sized chart chosen from a standard set – an Atlas
The virtual disk is 400,000 pixels wide
SIN projection
3. A Tiling of a Page
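The 400,000-pixel figure for the virtual disk can be checked with a short calculation. In a SIN (orthographic) projection a hemisphere maps to a disk whose radius is one radian in projected units; the sketch assumes one arcsecond per pixel, which the slide does not state.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi   # about 206,265


def sin_disk_width_pixels(arcsec_per_pixel):
    """Diameter, in pixels, of the disk a hemisphere fills in a SIN projection."""
    return 2.0 * ARCSEC_PER_RADIAN / arcsec_per_pixel


width = sin_disk_width_pixels(1.0)   # assumed pixel size: 1 arcsecond
print(round(width))
```

The result is a little over 412,000 pixels, in line with the rounded 400,000 quoted above.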
Over to Mike
Atlasmaker DAG (tile level)
An Atlasmaker tile is a region of the sky. The images needed for a given page can be retrieved with SIAP services. The images that a SIAP server returns for a page query establish the dependency graph for constructing that page. A tile is simply a subset of the page.
Atlasmaker tile
Survey images
Neighboring Atlasmaker tile
Survey images overlap multiple tiles
Atlasmaker DAG (tile to page)
• An Atlasmaker “page” is made up of many smaller “tiles”.
• The page defines the “projection” (Montage)
• A tile is a simple unit of work to be completed (easy to deal with computationally)
Page
Many tiles
Atlasmaker DAG (tile level)
Tile A
Tile B
survey images
Raw images on page border must be projected twice. Images on the border of two tiles of the same page do not need to be projected twice (this efficiency requires image product archival and good bookkeeping).
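The bookkeeping behind this can be sketched as a footprint test. The sketch assumes tiles and image footprints are axis-aligned boxes in the pixel coordinates of one page (real footprints live on the sky), and every name in it is illustrative. Images listed under more than one tile of the same page are the ones whose projected product is worth archiving and reusing.

```python
def overlaps(a, b):
    """True if two boxes (xmin, ymin, xmax, ymax) share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def tile_dependencies(tiles, footprints):
    """Map each tile to the sorted list of images whose footprint overlaps it."""
    return {
        tile_id: sorted(image_id for image_id, footprint in footprints.items()
                        if overlaps(box, footprint))
        for tile_id, box in tiles.items()
    }


tiles = {"A": (0, 0, 100, 100), "B": (100, 0, 200, 100)}
footprints = {
    "img1": (10, 10, 60, 60),      # inside tile A
    "img2": (80, 20, 130, 70),     # straddles the A/B border
    "img3": (150, 10, 190, 90),    # inside tile B
}
deps = tile_dependencies(tiles, footprints)
shared = set(deps["A"]) & set(deps["B"])   # project once, use in both tiles
print(deps, shared)
```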
Atlasmaker DAG (page and up)
[Diagram (page-level DAG). Labels: coarse resolution; page; many levels resolution; high integrity images; compressed images; web based navigation tool]
Atlasmaker DAG
Coarse resolution
Pages
Tiles
Projected Images
Raw Images
Web based navigation tool
• Raw images archived: large dataset management (SRB)
• SIAP web service retrieval establishes dependencies
• Image compression
• Montage
• Projection defined
• Reduce resolution
Atlasmaker DAG Construction
Coarse resolution
Pages
Tiles
Projected Images
Raw Images
Valid state
Working state
Initial state
Invalid state
Probe top compressed Image
Probe all compressed Images
Complete!!!
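One way to read the construction sequence is as a recursive probe, much like a make-style build: probing a product that is not yet valid first probes everything it depends on, then builds it. The sketch below is an interpretation of the slides, not Atlasmaker's code; the Node class, the probe function and the placeholder build step are all hypothetical.

```python
INITIAL, WORKING, VALID, INVALID = "initial", "working", "valid", "invalid"


class Node:
    """One product in the DAG: raw image, projected image, tile, page, ..."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)
        self.state = INITIAL


def probe(node, build, log):
    """Bring `node` to the valid state, building its dependencies first."""
    if node.state == VALID:
        return
    node.state = WORKING
    for dep in node.deps:
        probe(dep, build, log)
    build(node)               # placeholder for fetch / project / co-add / compress
    node.state = VALID
    log.append(node.name)


raw1, raw2 = Node("raw1"), Node("raw2")
proj1, proj2 = Node("proj1", [raw1]), Node("proj2", [raw2])
tile = Node("tile", [proj1, proj2])
page = Node("page", [tile])
top = Node("compressed", [page])

log = []
probe(top, build=lambda node: None, log=log)
print(log)
```

Probing the top compressed image pulls in everything beneath it; a second probe of the same node does no work because every state is already valid.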
Something breaks!!!
• Some survey images might have been flawed
• A valid state might have been set by a failed computation
• New/better images now exist
• We do not want to recalculate entire atlas!
Oops!
Why Virtual Data?
Atlasmaker DAG Maintenance
Coarse resolution
Pages
Tiles
Projected Images
Raw Images
Valid state
Working state
Initial state
Invalid state
Discover an invalid state!!
Propagate invalid state
Fix invalid raw image
Probe all compressed Images that are not in a “valid” state
Fixed!!!
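The maintenance sequence can be sketched the same way: mark the bad product invalid, push that state up to everything built from it, then re-probe so that only the invalid products are rebuilt. This is an illustrative reading of the slides; the dictionary layout and function names are assumptions, and rebuilding the raw image stands in for "fix invalid raw image".

```python
def dependents_of(deps):
    """Invert {product: [inputs]} into {input: [products built from it]}."""
    rev = {name: [] for name in deps}
    for product, inputs in deps.items():
        for inp in inputs:
            rev[inp].append(product)
    return rev


def propagate_invalid(bad, deps, state):
    """Mark `bad` and everything that depends on it, directly or not, as invalid."""
    rev = dependents_of(deps)
    stack = [bad]
    while stack:
        name = stack.pop()
        state[name] = "invalid"
        stack.extend(p for p in rev[name] if state[p] != "invalid")


def reprobe(name, deps, state, rebuilt):
    """Rebuild `name` and any non-valid inputs; valid products are left alone."""
    if state[name] == "valid":
        return
    for inp in deps[name]:
        reprobe(inp, deps, state, rebuilt)
    state[name] = "valid"
    rebuilt.append(name)


deps = {
    "raw1": [], "raw2": [], "raw3": [],
    "proj1": ["raw1"], "proj2": ["raw2"], "proj3": ["raw3"],
    "tileA": ["proj1", "proj2"], "tileB": ["proj2", "proj3"],
    "page": ["tileA", "tileB"],
}
state = {name: "valid" for name in deps}

propagate_invalid("raw3", deps, state)
invalid = sorted(n for n, s in state.items() if s == "invalid")
rebuilt = []
reprobe("page", deps, state, rebuilt)
print(invalid, rebuilt)
```

Only four of the nine products are touched; tile A and everything under it stay as they were, which is the point of not recalculating the entire atlas.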
Computational Approaches
Coarse resolution
Pages
Tiles
Projected Images
Raw Images
Web based navigation tool
Promising “Bag of Tasks” locations.
MW-based parallelism is very easy!!!
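A minimal master-worker sketch of a bag of tasks, using a thread pool from the Python standard library as a stand-in for whatever workers are available. The build_tile function is a placeholder, and the file names it returns are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_tile(tile_id):
    """Placeholder for the real work of projecting and co-adding one tile."""
    return tile_id, f"tile-{tile_id}.fits"


def run_bag_of_tasks(tile_ids, workers=4):
    """Master hands independent tiles to a pool of workers and gathers the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(build_tile, tile_ids))


results = run_bag_of_tasks(range(6))
print(results)
```

Because the tasks within one level of the DAG do not depend on each other, the master needs no logic beyond handing out work and collecting results.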
Serial Version
• First stab at Atlasmaker
• Allowed us to build a base of Python modules used throughout this project
• We ran small sections of sky on a single machine
• Assumed common filesystem
• We did our own bookkeeping
• Can run on any machine
• Slow!!!
MPI Version
• Take advantage of simple MW parallelism
• Machine must have MPI
• Assumed common filesystem
• Typical run on tightly-coupled machines
• MPI jobs needed to be small enough to avoid being “brittle” (hardware failures?)
• Generally limited to doing only a small part of the DAG
PBS/MPI Version
• Typical run on tightly-coupled machines
• Many smaller MPI jobs submitted to PBS
• “Job-Manager” parallelism
• Assumed common filesystem
• Requires fair amount of bookkeeping
• “Tile” is smallest unit of work
• Typical to submit a “Tile” or “Page” as a single MPI job
PBS/Serial Version
• Not dependent on MPI
• Assumed common filesystem
• As efficient as (or even more efficient than) PBS/MPI parallelism
• Can make a single image projection the smallest unit of work (*still requires tile assembly*)
• Typically make a “Tile” a unit of work
• Roughly the same amount of bookkeeping as PBS/MPI version
• Many small PBS jobs often get picked up quickly
• System administrators feel uncomfortable with many small jobs from a single user
• Queue policy often set against this usage
Condor/Serial Version
• Just like PBS/Serial version, but all jobs are submitted to the Condor job manager instead of the PBS job manager
• Need machines with Condor
• Requires learning Condor
• Does not assume common filesystem
• Condor can leverage a huge number of idle cycles (we are doing our part to keep very loosely-coupled applications off over-burdened tightly-coupled machines)
DAGMAN/Condor Version
• Requires using DAGMAN/Condor
• Need machines with DAGMAN/Condor
• We are using Condor discovered cycles
• Requires learning DAGMAN/Condor
• Does not assume common filesystem
• All the bookkeeping is done for us!
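A sketch of how the dependency graph could be handed over: a short Python function that writes a DAG input file. The JOB and PARENT ... CHILD lines are DAGMan's own keywords; the job names and the per-job submit file names are hypothetical, and the submit files themselves are not shown.

```python
def write_dag(deps, submit_file_for):
    """Return DAG input-file text for {job: [parent jobs]}.

    submit_file_for(job) names the Condor submit description file for that job.
    """
    lines = [f"JOB {job} {submit_file_for(job)}" for job in deps]
    for job, parents in deps.items():
        if parents:
            lines.append(f"PARENT {' '.join(parents)} CHILD {job}")
    return "\n".join(lines) + "\n"


deps = {
    "proj1": [],
    "proj2": [],
    "tileA": ["proj1", "proj2"],   # the tile waits for both projections
}
dag_text = write_dag(deps, lambda job: f"{job}.sub")
print(dag_text)
```

A file like this would be submitted with condor_submit_dag, after which the ordering of jobs and the tracking of their completion are no longer the application's bookkeeping.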
Pros and Cons
Serial
• Pros: Simple to code
• Cons: Slow; Assumes local filesystem
MPI
• Pros: Parallel
• Cons: Limit to job submission size; Limit to MPI supported machines; Uses tightly-coupled resource; Assumes common filesystem
PBS/Serial
• Pros: Simple to code; “Job manager” parallelism
• Cons: Bookkeeping; Interactions with PBS not natural; Assumes common filesystem; Uses tightly-coupled resource
PBS/MPI
• Pros: Roughly the same as “PBS/Serial”
• Cons: Bookkeeping; Limit to MPI supported machines; Assumes common filesystem; Uses tightly-coupled resource
Condor/Serial
• Pros: Condor discovered cpu cycles; Does not assume common filesystem
• Cons: Bookkeeping; Limit to Condor supported machines
Condor/DAGMAN
• Pros: Almost no bookkeeping; Condor discovered cpu cycles; Does not assume common filesystem
• Cons: Limit to Condor supported machines
Atlasmaker “to-do” list
• Complete/test Condor version
• Write DAGMAN version
• Continued refinement of data retrieval
• Establish useful meta-data for previously projected images and image products
• Get all surveys in reliable/redundant locations
• Construct entire atlas
Atlasmaker “wish list”
• Wider Condor/DAGMAN usage
• 100% reliable low latency access to raw data and constructed image products
• Resource broker to tell me where to submit my bag of tasks (in spirit of PBS/serial and MPI/serial versions)
Conclusions
• Computational approach depends on resources you can use
• Atlasmaker is still a work in progress