Bayesian Cross-Identification in Astronomyaws/Budavari_xid-2012-mpe_presented.pdfIN ASTRONOMY...
Transcript of Bayesian Cross-Identification in Astronomyaws/Budavari_xid-2012-mpe_presented.pdfIN ASTRONOMY...
BAYESIAN CROSS-IDENTIFICATION IN ASTRONOMY
Tamás Budavári / The Johns Hopkins University10/7/2012
Tamás Budavári
Recording Observations
Whirlpool Galaxy – M51
Discovered by Charles Messier (1773)
Sket
ch b
y W
illia
m P
ars
on
s (1
84
5)
10/7/2012
2
Tamás Budavári
Multicolor Universe
10/7/2012
3
Tamás Budavári
Eventful Universe
10/7/2012
4
Tamás Budavári
Trends in Astronomy
Exponential growth of data
Moore’s law in detectors
19701975
19801985
19901995
2000
0.1
1
10
100
1000
CCDs Glass
10/7/2012
5
Tamás Budavári
Sloan Digital Sky Survey
Cosmic Genome Project 2001-2010
500M rows, 400+ cols 18TB
30TB of images from 30 CCDs
Software revolution in astro
Astronomers learn SQL
Cannot look at the data anymore
10/7/2012
6
Tamás Budavári
The New Generations
Large Synoptic Survey Telescope [OPTICAL]
3 trillion rows, 200+ attributes, 100+ tables ~ 30PB
60PB of images, 3.2 Gpix cam
Square Kilometer Array [RADIO]
Processing limited
10/7/2012
7
10/7/20128
Tamás Budavári
Keeping Up?
Image processing
Catalog extraction O(n)
What is difficult? O(n log n)
O(n2), …
10/7/2012
9
Tamás Budavári
More is Different
10/7/2012
Lots of opportunities
Lots of challenges
Lots of problems
New approaches
New algorithms
New tools
New computers
10
Tamás Budavári
Do More at the Data
10/7/2012
Statistics on remote resources Fine-tune SQL Server for astronomy
Analyses in and driven by SQL
SDSS catalog archive and more GALEX, HLA, UKIDSS, PanSTARRS, …
11
Tamás Budavári
10/7/2012
12
Tamás Budavári
10/7/2012
13
Tamás Budavári
Storing Simulations
Millennium Run (MPA)
10 billion particles, 64 snapshots
FoF groups and merger trees
Millennium XXL 300 billion particles
MultiDark – Bolshoi
Turbulence simulations (JHU)
10244 grid, 27TB
10/7/2012
14
Tamás Budavári
Storing Simulations
Millennium Run (MPA)
10 billion particles, 64 snapshots
FoF groups and merger trees
Millennium XXL 300 billion particles
MultiDark – Bolshoi
Turbulence simulations (JHU)
10244 grid, 27TB
10/7/2012
15
Tamás Budavári
Observing Simulations
Comparison to real observations
Lots of spatial searches
In the database!Lemson, TB & Szalay (2011)
10/7/2012
16
Tamás Budavári
Fast Searches
10/7/2012
Map space-filling curves to database indices
Hierarchical Triangular Mesh – on the unit sphere
Peano-Hilbert / Morton curves – in 3D
Combine with query shapes
Build from primitives
17
Tamás Budavári
Fast Searches
10/7/2012
Map space-filling curves to database indices
Hierarchical Triangular Mesh – on the unit sphere
Peano-Hilbert / Morton curves – in 3D
Combine with query shapes
Build from primitives
18
Tamás Budavári
Mil
len
niu
m X
XL
19
Tamás Budavári
Understanding Subtleties
Interactive visualization of 27TB of turbulence sim
10/7/2012
Kai Bürger (TUM, JHU)
20
Tamás Budavári
The SDSS Genealogy
VO Services
Life Under Your Feet
OncoSpace CASJobs MyDB
SDSS SkyServer
Turbulence DB
Milky Way Laboratory
INDRA Simulation
SkyQuery
Open SkyQuery
MHD DB
JHU 1K Genomes
Pan-STARRS
HubbleLegacy Arch
VO Footprint VO Spectrum
Super COSMOS
Millennium
Potsdam
Palomar QUESTGALEX
GalaxyZoo
UKIDDS
TerraServer
By Alex Szalay
10/7/2012
21
Tamás Budavári
SkyQuery
10/7/2012
Dynamical federation of archives
22
One of the most fundamental analysis steps
Cross-Identification
10/7/2012
23
Tamás Budavári
What is the Right Question?
Cross-identification is a hard problem Computationally, Scientifically & Statistically
Need symmetric n-way solution
Need reliable quality measure
Same or not? Distance threshold? Maximum likelihood?
10/7/2012
24
Tamás Budavári
Modeling the Astrometry
Astrometric precision
A simple function
Where on the sky?
Anywhere really…
10/7/2012
25
Tamás Budavári
The Bayes factor
H: all observations of the same object at m
K: might be from separate objects at {mi}
Same or Not?
10/7/2012
SAM
EN
OT
OR
26
Tamás Budavári
The Bayes factor
H: all observations of the same object at m
K: might be from separate objects at {mi}
Same or Not?
10/7/2012
SAM
EN
OT
OR
27
Tamás Budavári
The Bayes factor
H: all observations of the same object at m
K: might be from separate objects at {mi}
Same or Not?
10/7/2012
SAM
EN
OT
OR
28
Tamás Budavári
The Bayes factor
H: all observations of the same object at m
K: might be from separate objects at {mi}
Same or Not?
On the skyAstrometry
10/7/2012
SAM
EN
OT
OR
29
Tamás Budavári
The Bayes factor
H: all observations of the same object at m
K: might be from separate objects at {mi}
Same or Not?
On the skyAstrometry
10/7/2012
SAM
EN
OT
OR
30
Tamás Budavári
Normal Distribution
Astrometric precision:
Fisher distribution:
Analytic results:
For high accuracies:
10/7/2012
31
Tamás Budavári
Analytic Results
Normal distribution
Flat and spherical
Gauss and Fisher
2-way results
10/7/2012
32
Tamás Budavári
Wikipedia: Interpretation
10/7/2012
33
Tamás Budavári
Bayes factor is the connection
From Priors to Posteriors
10/7/2012
34
Tamás Budavári
10/7/2012
From Priors to Posteriors
Posterior probability from prior & Bayes factor
Prior probability of a match Like dice in a bag: 1/N and N1 n
In general?
35
Tamás Budavári
From Priors to Posteriors
Different selections Nearby / Distant
Red / Blue
But only 1 number
36
Tamás Budavári
Prior has an unknown fudge-factor
Educated guess
Or solve for it:
Self-Consistent EstimatesTB
& Sza
lay
(20
08
)
10/7/2012
37
Tamás Budavári
Simulations
Mock objects With correct clustering U01 values as properties
Simulated sources Subsets: N1 N2
Overlap: N★
0 1
38
10/7/2012
Tamás Budavári
Simulations
Mock objects With correct clustering U01 values as properties
Simulated sources Subsets: N1 N2
Overlap: N★
0 1
10/7/2012
39
Tamás Budavári
Simulations
Quality Multiple matches
Explained by simple model of point sources!
Heinis, TB, Szalay (2009)
10/7/2012
40
Tamás Budavári
SkyQuery
10/7/2012
Almost purestandard SQL
41
Tamás Budavári
SkyQuery
10/7/2012
Almost purestandard SQL
42
Tamás Budavári
SkyQuery
10/7/2012
Almost purestandard SQL
Added XMATCH
Verifiable
Flexible
43
Tamás Budavári
Matching Events
10/7/2012
(1)
(2)
(x)
Streams of events in time and space
E.g., thresholded peaks in signal-to-noise
44
TB (2011)
Tamás Budavári
Matching Events
10/7/2012
(1)
(2)
(x)
Streams of events in time and space
E.g., thresholded peaks in signal-to-noise
45
TB (2011)
Tamás Budavári
Same hypotheses but different parameters
Just need prior to integrate
Proper MotionSo
urces fro
m SD
SS
10/7/2012
46
Tamás Budavári
Same hypotheses but different parameters
Just need prior to integrate
Proper Motion
Kerekes, TB+ (2010)
Sou
rces from
SDSS
10/7/2012
47
Radio Morphology
10/7/2012
48
Tamás Budavári
Example
Core match
But lobes?
overlay0113.png
49
10/7/2012
Tamás Budavári
Simplest Model for Radio Sources
A core and two symmetric lobes Core direction parameter m
Lobe direction parameter m’
Other lobe is at 2m m’
The prior is some p(m,m’) = p(m) p(m’|m) Parameter m anywhere on the sky: p(m)
But m’ is seen certain separations away: p(m’|m)
10/7/2012
50
Tamás Budavári
Simplest Model for Radio Sources
A core and two symmetric lobes Core direction parameter m
Lobe direction parameter m’
Other lobe is at 2m m’
The prior is some p(m,m’) = p(m) p(m’|m) Parameter m anywhere on the sky: p(m)
But m’ is seen certain separations away: p(m’|m)
10/7/2012
51
Tamás Budavári
Likelihood of “Same”
Data: { x0 ; y0 y1 y2 } directions
2-way matching using the new model
10/7/2012
52
Tamás Budavári
Likelihood of “Same”
Data: { x0 ; y0 y1 y2 } directions
2-way matching using the new model
10/7/2012
53
Tamás Budavári
Example
Core match
But lobes?
overlay0113.png
54
10/7/2012
Tamás Budavári
Example
Core match
But lobes?
overlay0113.png
55
10/7/2012
Tamás Budavári
Example
Core match
But lobes?
overlay0113.png
56
10/7/2012
Tamás Budavári
Competing Hypotheses
Many to choose from
Which radio detection is the core? (closest ;-)
Which are separate objects?
10/7/2012
57
Tamás Budavári
Example
Competing hypotheses w/ 2 different tophat priors
10/7/2012
58
None None None : 0.0 0.0 (F)Core None None : 11.1 11.1 (F)Core Lobe None : 19.0 18.3 (F)Core Lobe Lobe : 24.2 23.5 (F)None Lobe Lobe : 13.2 12.5 (F)None None Lobe : 7.9 7.2 (F)
Crossmatching the Hubble Legacy Archive
Hubble Source Catalog
10/7/2012
59
Tamás Budavári
Uncertain Positioning
10/7/2012
60
Exceptional accuracy with images
Large uncertainty in positioning
Due to few standards in the tiny field of view
Solve for relative corrections
Across overlapping observations
Tamás Budavári
Rotation in 3D
Describes both translation and rotation locally
Need constrained numerical optimization: RTR=I
Astrometric Correction
10/7/2012
61
Tamás Budavári
Optimization without constraints Displacement by the cross product operator
Fast computation for the rotation vector ω Just sum up a 3×3 matrix and vector
Solve the linear equation:
Infinitesimal Rotation62
10/7/2012
Tamás Budavári
SQL pipeline
Astrometriccorrection
Subpixelprecision
Available now!
Hubble Source Catalog
10/7/2012
63
RELEASE ON MASTTB & Lubow (2012)
Tamás Budavári
Summary
10/7/2012
64
Bayesian approach to cross-identification
Places former heuristics on a firm statistical basis
Enables us to properly include
Physics, geometry, etc…
Naturally extends to time-domain
Events, proper motion, lightcurves
Opens the door for next-generation methods