Post on 22-Dec-2015
Introduction to Power Spectrum Estimation
Lloyd Knox (UC Davis)
CCAPP, 23 June 2010
Goal of Talk
Take someone who is starting from zero in power spectrum estimation to where they have some intuition for what the issues are, and they know where to go in the literature to begin estimating power spectra in practice.
Outline
• Motivating the use of the power spectrum
• Estimation Under Ideal Conditions
• Impact of various non-idealities
• Estimation under non-ideal conditions
Power Spectra: Useful for studying statistical properties of statistically homogeneous random fields
• Statistical homogeneity: statistical properties of the field are independent of location.
• Examples: CMB temperature maps, cosmic shear maps, galaxy number count maps*, …
*cosmological evolution actually breaks homogeneity in radial direction, but one can study 2-D slices, or try to correct 3-D map for evolution
Power Spectra Examples
[Figure: maps at 150 GHz and 220 GHz (random fields) and the corresponding power spectra]
Data plus modeled contributions from four different statistically isotropic components (Hall et al. 2010)
Nolta et al. (2009)
Power Spectrum Example
[Figure: a random field (map) and its power spectrum]
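$T(\theta,\phi) = \sum_{lm} a_{lm} Y_{lm}(\theta,\phi)$
$C_l \, \delta_{ll'} \delta_{mm'} = \langle a_{lm} a^*_{l'm'} \rangle$
$C_l$ is the power spectrum; the $\delta_{ll'} \delta_{mm'}$ structure is a consequence of statistical homogeneity (isotropy in this case).
$\langle \ldots \rangle$ = ensemble average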
Nolta et al. (2009)
Power Spectrum Interpretation
<== Large angular scales small angular scales ==>
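$T(\theta,\phi) = \sum_{lm} a_{lm} Y_{lm}(\theta,\phi)$
$C(\theta) = \langle T(\theta,\phi) \, T(\theta',\phi') \rangle = \sum_l \frac{l+1/2}{2\pi} C_l P_l(\cos\theta)$, where the argument of $C$ and $P_l$ is the angle between the two directions
$\sigma^2 = C(0) = \sum_l \frac{l+1/2}{2\pi} C_l \approx \int d(\ln l) \, \frac{l(l+1/2)}{2\pi} C_l$
$l(l+1/2) C_l / (2\pi)$ is the contribution to the variance from a logarithmic interval in $l$.
$C_l = \langle a_{lm} a^*_{lm} \rangle$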
Nolta et al. (2009)
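As a minimal numerical sketch of the relation between $C(\theta)$ and $C_l$ on this slide (not part of the talk), the Legendre sum can be evaluated with NumPy's `legval`, and $C(0)$ should reproduce the variance sum. The toy spectrum `cl` and the cutoff `l_max` are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def correlation_function(theta, cl):
    """C(theta) = sum_l (l+1/2)/(2 pi) C_l P_l(cos theta); cl[l] holds C_l for l = 0..l_max."""
    ell = np.arange(len(cl))
    coeffs = (ell + 0.5) / (2.0 * np.pi) * cl  # Legendre-series coefficients
    return legval(np.cos(theta), coeffs)

# Toy spectrum (an assumption for illustration): C_l = 1/(l(l+1)) for l >= 2.
l_max = 200
ell = np.arange(l_max + 1)
cl = np.zeros(l_max + 1)
cl[2:] = 1.0 / (ell[2:] * (ell[2:] + 1.0))

variance = correlation_function(0.0, cl)             # C(0)
direct_sum = np.sum((ell + 0.5) / (2.0 * np.pi) * cl)  # sum_l (l+1/2)/(2 pi) C_l
```

Since $P_l(1) = 1$ for every $l$, `variance` and `direct_sum` agree, which is the statement $\sigma^2 = C(0)$.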
Why is the power spectrum useful?
• For Gaussian homogeneous random fields, it captures all the information not in the mean.
• Even for non-Gaussian fields, it can be a highly informative statistic. There will be additional information in other statistics, but the power spectrum is usually a sensible place to start.
Why $C_l$ Instead of the Correlation Function, $C(\theta)$?
• They are linear transformations of each other, carrying the same information.
• For Gaussian fields, the covariance structure of power spectrum estimates is much simpler.
• For linear perturbation theory, time evolution of a single Fourier mode is simple and decoupled from other modes ==> simple physical interpretation of the power spectrum.
• Nonlinearity of evolution, and/or non-Gaussianity, weakens these two advantages.
$C(\theta) = \sum_l \frac{l+1/2}{2\pi} C_l P_l(\cos\theta)$
PS Estimation: Simplest Case of Uniform Full-sky Coverage with no noise
$a_{lm} = \int d\Omega \, T \, Y_{lm}$
$a_{lm} = a^s_{lm}$ (signal only)
Each $|a_{lm}|^2$ provides an unbiased estimate of $C_l$. For each $l$ there are $2l+1$ values of $m$, so we can average them all together to get
$\hat{C}_l = \sum_m |a_{lm}|^2 / (2l+1)$
This is both the minimum-variance and maximum-likelihood estimator.
Note that despite no noise, there is uncertainty in the true value of $C_l$:
$\langle (C_l - \hat{C}_l)^2 \rangle = \frac{2}{2l+1} C_l^2$
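The estimator and its variance can be checked with a small Monte Carlo. This is a sketch under stated assumptions, not the talk's code: each multipole is represented by $2l+1$ independent real Gaussian coefficients of variance $C_l$ (a real-harmonic basis), and the values of `ell`, `cl_true`, and `n_sky` are arbitrary.

```python
import numpy as np

def estimate_cl(alm):
    """Average |a_lm|^2 over the 2l+1 values of m (last axis)."""
    return np.mean(alm**2, axis=-1)

def cosmic_variance_check(ell, cl_true, n_sky=20000, seed=0):
    """Simulate n_sky noiseless full skies at one multipole and compare with theory."""
    rng = np.random.default_rng(seed)
    # Assumption: 2l+1 independent real coefficients, each with variance C_l.
    alm = rng.normal(0.0, np.sqrt(cl_true), size=(n_sky, 2 * ell + 1))
    cl_hat = estimate_cl(alm)
    var_pred = 2.0 / (2 * ell + 1) * cl_true**2
    return cl_hat.mean(), cl_hat.var(), var_pred

mean_hat, var_hat, var_pred = cosmic_variance_check(ell=10, cl_true=3.0)
```

The scatter of `cl_hat` across simulated skies is the sample ("cosmic") variance: it is present even without noise because only $2l+1$ modes are available at each $l$.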
PS Estimation: Uniform Full-sky Coverage With Noise
$a_{lm} = \int d\Omega \, T \, Y_{lm}$
$a_{lm} = a^s_{lm} + a^n_{lm}$ (signal + noise)
If noise is uncorrelated from pixel to pixel and homogeneous, then $\langle |a^n_{lm}|^2 \rangle = w^{-1}$, where $w$ is the statistical weight per solid angle, $w = (1/\sigma_{pix}^2)/\Omega_{pix}$, and this “noise bias” needs to be subtracted from our estimate:
$\hat{C}_l = \sum_m |a_{lm}|^2 / (2l+1) - w^{-1}$
$\langle (\hat{C}_l - C^s_l)^2 \rangle = \frac{2}{2l+1} (C^s_l + w^{-1})^2$
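A short sketch of the noise-bias subtraction, again using independent real Gaussian coefficients as a stand-in for the $a_{lm}$ (an assumption), with made-up values for the pixel noise `sigma_pix` and pixel solid angle `omega_pix`:

```python
import numpy as np

def noise_power(sigma_pix, omega_pix):
    """w^{-1} = sigma_pix^2 * Omega_pix for white, homogeneous noise."""
    return sigma_pix**2 * omega_pix

def estimate_cl_debiased(alm, w_inv):
    """Average |a_lm|^2 over m, then subtract the noise bias w^{-1}."""
    return np.mean(alm**2, axis=-1) - w_inv

rng = np.random.default_rng(1)
ell, cl_signal, n_sky = 20, 2.0, 20000
w_inv = noise_power(sigma_pix=10.0, omega_pix=0.01)  # illustrative numbers

shape = (n_sky, 2 * ell + 1)
signal = rng.normal(0.0, np.sqrt(cl_signal), shape)
noise = rng.normal(0.0, np.sqrt(w_inv), shape)

cl_hat = estimate_cl_debiased(signal + noise, w_inv)
var_pred = 2.0 / (2 * ell + 1) * (cl_signal + w_inv) ** 2
```

The mean of `cl_hat` comes out close to the signal spectrum, but the noise still enters the variance through the $(C^s_l + w^{-1})^2$ factor.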
PS Estimation: Uniform Full-sky Coverage With Noise and Finite Resolution
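$a_{lm} = \int d\Omega \, T \, Y_{lm}$
$a_{lm} = B_l a^s_{lm} + a^n_{lm}$ (beam-smoothed signal + noise)
Convolution of the sky signal with the response function of the telescope, $B(\theta,\phi)$, is a multiplication in the spherical harmonic domain by $B_l = \int d\Omega \, Y_{l0} B(\theta,\phi)$. We need to compensate by dividing the map $a_{lm}$ by $B_l$, so that
$\hat{C}_l = B_l^{-2} \sum_m |a_{lm}|^2 / (2l+1) - B_l^{-2} w^{-1}$
$\langle (\hat{C}_l - C^s_l)^2 \rangle = \frac{2}{2l+1} (C^s_l + B_l^{-2} w^{-1})^2$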
WMAP Power Spectrum Errors
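$\langle (\hat{C}_l - C^s_l)^2 \rangle^{1/2} = \left[\frac{2}{2l+1}\right]^{1/2} (C^s_l + B_l^{-2} w^{-1})$
At low $l$: few samples per $l$ value; i.e., the $[2/(2l+1)]$ factor is large.
At high $l$: the beam-deconvolved noise, $B_l^{-2} w^{-1}$, is large.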
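The two regimes can be seen by evaluating the error formula directly. This sketch assumes a Gaussian beam, $B_l = \exp[-l(l+1)\sigma_b^2/2]$ with $\sigma_b = \mathrm{FWHM}/\sqrt{8 \ln 2}$, plus a toy signal spectrum and noise level; none of these numbers are WMAP values.

```python
import numpy as np

def gaussian_beam(ell, fwhm_rad):
    """B_l for a Gaussian beam (an assumed beam shape, not taken from the talk)."""
    sigma_b = fwhm_rad / np.sqrt(8.0 * np.log(2.0))
    return np.exp(-0.5 * ell * (ell + 1.0) * sigma_b**2)

def cl_error(ell, cl_signal, w_inv, fwhm_rad):
    """[2/(2l+1)]^{1/2} (C^s_l + B_l^{-2} w^{-1})."""
    bl = gaussian_beam(ell, fwhm_rad)
    return np.sqrt(2.0 / (2.0 * ell + 1.0)) * (cl_signal + w_inv / bl**2)

ell = np.arange(2, 1001)
cl_signal = 1.0 / (ell * (ell + 1.0))  # toy spectrum, illustrative only
err = cl_error(ell, cl_signal, w_inv=1e-6, fwhm_rad=np.radians(0.2))

# Split the error into its sample-variance and noise contributions.
sample_term = np.sqrt(2.0 / (2.0 * ell + 1.0)) * cl_signal
noise_term = err - sample_term
```

With these toy numbers the sample-variance term dominates at the lowest multipoles and the beam-deconvolved noise term dominates near `l = 1000`.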
PS Estimation with Partial Sky Coverage, Finite Resolution and Inhomogeneous Correlated Noise
One approach: Optimal methods (assuming Gaussian random field)
$P(T | C_l) \propto |M|^{-1/2} \exp(-T_i M^{-1}_{ij} T_j / 2)$ with $M_{ij} = S_{ij}(C_l) + N_{ij}$
By Bayes’ Theorem (with a uniform prior on $C_l$)
$P(C_l | T) \propto P(T | C_l) \propto |M|^{-1/2} \exp(-T_i M^{-1}_{ij} T_j / 2)$
But calculation is computationally intractable for maps greater than tens to hundreds of thousands of pixels
Quadratic estimator, likelihood approximations, Gibbs sampling + Blackwell-Rao estimator (see references at end)
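To see where the cost comes from, here is a minimal sketch of a brute-force evaluation of the Gaussian log-likelihood for a pixel-space map. The function name and the tiny two-pixel covariances are illustrative assumptions. The Cholesky factorization of the dense $M$ scales as the cube of the number of pixels, which is what makes this approach impractical for large maps.

```python
import numpy as np

def gaussian_loglike(t, signal_cov, noise_cov):
    """ln P(T | C_l) up to an additive constant, with M = S(C_l) + N."""
    m = signal_cov + noise_cov
    chol = np.linalg.cholesky(m)       # M = L L^T; cost grows as n_pix^3
    z = np.linalg.solve(chol, t)       # z = L^{-1} T, so z.z = T^T M^{-1} T
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (z @ z) - 0.5 * log_det

# Two-pixel toy case with diagonal covariances, so the answer is easy to verify by hand.
t = np.array([1.0, 2.0])
signal_cov = np.diag([1.0, 3.0])
noise_cov = np.eye(2)
loglike = gaussian_loglike(t, signal_cov, noise_cov)
```

Here $M = \mathrm{diag}(2, 4)$, so the result should equal $-\frac{1}{2}(1/2 + 4/4) - \frac{1}{2}\ln 8$.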
PS Estimation with Partial Sky Coverage, Finite Resolution and Inhomogeneous Correlated Noise
Another approach: Pseudo-Cl methods
Sub-optimal, but good enough and fast
Basic idea is to use the simple estimator, and then a combination of analytic and Monte Carlo methods to estimate the offset and gain relating the simple estimator (the pseudo-Cl) and the real Cl.
Pseudo-Cl
[Figure: a random field (map) and its power spectrum]
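$T(\theta,\phi) = \sum_{lm} a_{lm} Y_{lm}(\theta,\phi)$
$\tilde{a}_{lm} = \int d\Omega \, Y_{lm}(\theta,\phi) \, [W(\theta,\phi) \, T(\theta,\phi)]$
$W$ = mask that’s zero in the galactic plane, and smoothly goes to one outside of it
Multiplication in real space is convolution in Fourier space
$\tilde{a}_{lm}$ will have contributions from $a_{l'm'}$ for $l'$ near $l$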
Pseudo-Cl
That convolution has an analytically calculable effect on the ensemble average of the pseudo-Cl
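$\langle \tilde{C}_l \rangle = \sum_{l'} M_{ll'} B_{l'}^2 C_{l'} + \langle \tilde{N}_l \rangle$
$M_{ll'}$: effect of mask; $B_{l'}^2$: beam; $\langle \tilde{N}_l \rangle$: noise bias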
Noise bias can be calculated via noise-only Monte-Carlo simulations
Estimate Cl by subtracting noise-bias and then deconvolving.
Estimate Cl errors by noise + signal Monte-Carlo simulation
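The "subtract the noise bias, then deconvolve" step amounts to solving a linear system. The sketch below uses a made-up banded coupling matrix, a toy beam, and a constant noise bias purely for illustration; in a real analysis $M_{ll'}$ is computed from the mask (Hivon et al. 2002), the noise bias comes from simulations, and the spectrum is often binned before inversion.

```python
import numpy as np

def pseudo_cl_forward(cl, coupling, bl, noise_bias):
    """<C~_l> = sum_l' M_ll' B_l'^2 C_l' + <N~_l>."""
    return coupling @ (bl**2 * cl) + noise_bias

def deconvolve_pseudo_cl(pseudo_cl, coupling, bl, noise_bias):
    """Subtract the noise bias, then invert the mask+beam kernel."""
    kernel = coupling * bl[np.newaxis, :] ** 2  # K_ll' = M_ll' B_l'^2
    return np.linalg.solve(kernel, pseudo_cl - noise_bias)

n_ell = 50
ell = np.arange(n_ell)
# Hypothetical coupling matrix: couples each l to nearby l' only.
coupling = 0.7 * np.exp(-0.5 * (ell[:, None] - ell[None, :]) ** 2)
bl = np.exp(-0.5 * (ell / 40.0) ** 2)   # toy beam
cl_true = 1.0 / (1.0 + ell) ** 2        # toy spectrum
noise_bias = np.full(n_ell, 1e-3)       # e.g. from noise-only Monte Carlo

pseudo = pseudo_cl_forward(cl_true, coupling, bl, noise_bias)
cl_recovered = deconvolve_pseudo_cl(pseudo, coupling, bl, noise_bias)
```

This round trip only checks the algebra of the ensemble-average relation; the scatter of a single realization about that average is what the signal + noise Monte Carlo simulations are for.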
Eliminating Noise Bias
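$\langle \tilde{C}_l \rangle = \sum_{l'} M_{ll'} B_{l'}^2 C_{l'} + \langle \tilde{N}_l \rangle$
Form the $\tilde{a}_{lm}$ from two different maps, each with noise, but noise that is not correlated from one map to the next, and take their cross-spectrum. The noise-bias term then vanishes: $\langle \tilde{N}_l \rangle = 0$.
This reduces sensitivity to knowing the noise level imperfectly.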
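A small simulation illustrates the point. As a simplifying assumption, the sky is full and unmasked and each multipole is represented by $2l+1$ independent real Gaussian coefficients; the signal and noise levels are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
ell, cl_signal, w_inv, n_sky = 30, 1.0, 4.0, 20000
shape = (n_sky, 2 * ell + 1)

# Same sky signal in both maps, independent noise in each.
signal = rng.normal(0.0, np.sqrt(cl_signal), shape)
map_a = signal + rng.normal(0.0, np.sqrt(w_inv), shape)
map_b = signal + rng.normal(0.0, np.sqrt(w_inv), shape)

auto = np.mean(map_a**2, axis=-1)        # biased high by w^{-1}
cross = np.mean(map_a * map_b, axis=-1)  # no noise bias to subtract
```

The auto-spectrum averages to $C_l + w^{-1}$, while the cross-spectrum averages to $C_l$ without any knowledge of the noise level. The noise still contributes to the variance of the cross-spectrum.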
Zoom in on 2 mm map: ~4 deg² of actual SPT data
In addition to large-scale masks (due to partial sky coverage, or the galaxy), we need to mask point sources too!
[Figure: 2 mm map, with annotations:]
• Lots of bright emissive sources
• ~15-sigma SZ cluster detection
• All these “large-scale” fluctuations are primary CMB.
Point-source Masking
$T(\theta,\phi) = \sum_{lm} a_{lm} Y_{lm}(\theta,\phi)$
$\tilde{a}_{lm} = \int d\Omega \, Y_{lm}(\theta,\phi) \, [W(\theta,\phi) \, T(\theta,\phi)]$
$W$ = mask that’s zero near a point source and smoothly goes to 1 away from the point source
Multiplication in real space is convolution in Fourier space
If the mask is over a very small area, $\tilde{a}_{lm}$ will have contributions from $a_{l'm'}$ for $l'$ far from $l$
The resulting transfer of power over large l can cause problems.
Use fat masks (very simple) or prewhiten your data
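Why fat masks help can be illustrated with a 1-D, flat-space analog (an assumption made for simplicity; the real problem lives on the sphere). The Fourier transform of the mask sets how far power is transferred, and a narrow hole has a broad transform. The grid size, hole widths, and `k_cut` below are arbitrary.

```python
import numpy as np

def hole_mask(n, half_width):
    """1-D mask on a periodic grid: zero within half_width pixels of the centre, one elsewhere."""
    x = np.arange(n) - n // 2
    return np.where(np.abs(x) <= half_width, 0.0, 1.0)

def leakage_fraction(mask, k_cut):
    """Fraction of the hole's Fourier power at |k| > k_cut, a proxy for long-range mode coupling."""
    hole = 1.0 - mask
    power = np.abs(np.fft.fft(hole)) ** 2
    k = np.abs(np.fft.fftfreq(mask.size) * mask.size)  # integer wavenumbers
    return power[k > k_cut].sum() / power.sum()

n, k_cut = 4096, 100
frac_narrow = leakage_fraction(hole_mask(n, half_width=4), k_cut)
frac_fat = leakage_fraction(hole_mask(n, half_width=64), k_cut)
```

The narrow hole puts a much larger fraction of its Fourier power at high wavenumber than the fat one, so it couples modes that are far apart. Prewhitening attacks the same problem from the other side, by flattening the spectrum so that transferred power matters less (Das, Hajian & Spergel 2009).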
References
Quadratic estimator: Bond, Jaffe & Knox (1998)
Approximate likelihoods: Bond, Jaffe & Knox (2000), Verde et al. (2003)
Gibbs sampling: Wandelt et al. (2004), Eriksen et al. (2004), Chu et al. (2005)
Pseudo-Cl method: Hivon et al. (2002)
Point Source Masking and Pre-whitening: Das, Hajian & Spergel (2009)
Summary
• The power spectrum is a very useful summary statistic for comparing data with theory.
• Optimal estimation, assuming Gaussianity, is difficult and for most applications (not all) it is also pointless.
• Approximate and fast schemes exist that handle a variety of non-idealities; in principle, via Monte Carlo, they can handle them all.
Power Spectra Examples
Nine shear maps: 8 from galaxies in eight photometric redshift bins and one reconstructed from the CMB
Song & Knox 2003
Some of the 9*(9+1)/2 = 45 power spectra
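The count of 45 is the number of distinct auto- and cross-spectra among 9 maps. A quick way to enumerate them (map indices 0 to 8 are just labels chosen for this sketch):

```python
from itertools import combinations_with_replacement

def spectrum_pairs(n_maps):
    """All unordered map pairs (i, j) with i <= j: auto-spectra (i == j) and cross-spectra (i < j)."""
    return list(combinations_with_replacement(range(n_maps), 2))

pairs = spectrum_pairs(9)
n_auto = sum(i == j for i, j in pairs)
n_cross = len(pairs) - n_auto
```

That gives 9 auto-spectra plus 36 cross-spectra, matching $9 \times (9+1)/2 = 45$.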
[Figure: random fields and power spectra; panel labels 0.2-0.2, 3.0-3.0, 1100-1100]