Jul 9, 2008 - ãX, Ylmã2,. (1) where (Ylm) denote the usual spherical harmonics, also is the maximum likelihood estimator of the power spectrum of ...

1 downloads 15 Views 1MB Size

No documents

arXiv:0807.1113v2 [astro-ph] 9 Jul 2008

Laboratoire AstroParticule et Cosmologie, UMR 7164, Universit´e Paris 7 - Denis Diderot and CNRS, 10, rue A. Domon et L. Duquet, 75205 Paris Cedex 13, France (Dated: July 9, 2008) Observations of the Cosmic Microwave Background (CMB) provide increasingly accurate information about the structure of the Universe at the recombination epoch. Most of this information is encoded in the angular power spectrum of the CMB. The aim of this work is to propose a versatile and powerful method for spectral estimation on the sphere which can easily deal with non-stationarity, foregrounds and multiple experiments with various specifications. In this paper, we use needlets (wavelets) on the sphere to construct natural and efficient spectral estimators for partially observed and beamed CMB with non stationary noise. In the case of a single experiment, we compare this method with Pseudo-C` methods. The performance of the needlet spectral estimators (NSE) compares very favorably to the best Pseudo–C` estimators, over the whole multipole range. On simulations with a simple model (CMB + uncorrelated noise with known variance per pixel + mask), they perform uniformly better. Their distinctive ability to aggregate many different experiments, to control the propagation of errors and to produce a single wide-band error bars is highlighted. The needlet spectral estimator is a powerful, tunable tool which is very well suited to angular power spectrum estimation of spherical data such as incomplete and noisy CMB maps.

Introduction

The estimation of the temperature and polarization angular power spectra of the Cosmic Microwave Background (CMB) is a key step for estimating the cosmological parameters. Cosmological information is encoded in the huge data sets (time order scanning data or high resolution maps) provided by ground-based, balloon-borne or satellite experiments. In the ideal case of noiseless and full sky experiments, angular power spectrum estimation is a straightforward task. The empirical spectrum of the outcome of a Gaussian stationary field X, given by b` = C

` X 1 hX, Y`m i2 , 2` + 1

(1)

m=−`

where (Y`m ) denote the usual spherical harmonics, also is the maximum likelihood estimator of the power spectrum of X. It is efficient in the sense that its variance reaches the Cram´er-Rao lower bound. CMB maps are however more or less strongly contaminated by foregrounds and instrumental noises, depending

∗ Laboratoire

Paul Painlev´ e, UMR 8524, Universit´ e Lille 1 and CNRS, 59 655 Villeneuve d’Ascq Cedex, France; Electronic address: [email protected] † MODAL’X, Universit´ e Paris Ouest – Nanterre La D´ efense, 200 avenue de la R´ epublique, 92001 Nanterre Cedex, France and Laboratoire de Probabilit´ es et Mod` eles Al´ eatoires, UMR 7599, Universit´ e Paris 7 - Denis Diderot and CNRS, 175 rue du Chevaleret, 75013 Paris, France ‡ Laboratoire de Traitement et Communication de l’Information , UMR 5141, T´ el´ ecom ParisTech and CNRS, 46 rue Barrault, 75634 Paris Cedex, France

on the wavelength, angular frequency ` and the direction of observation. Ground-based experiments cover small parts of the sky while space missions (COBE, W-MAP and, in the near future, Planck) provide full sky maps of the CMB, but still contaminated with galactic residuals. Then, the plain estimate (1) is no longer efficient nor even unbiased. To circumvent the non-stationarity of actual observations, the main ingredients for the spectral estimation used by, for instance, the W-MAP collaboration [14, 7] and also in most other analysis, are broadly the following ones. Usually, some part of the covered sky is blanked to remove the most emissive foregrounds or the most noisy measurements. This amounts to applying a mask or more generally a weight function to the sky. Most of the emissive foregrounds can be subtracted using some component separation procedure (see e.g. [18] for comparison methods with Planck-like simulated data). Even the best foreground-subtracting maps require masking a small fraction of the sky. Missing or masked data makes the optimal estimation of the power spectrum a much harder task. In particular, it breaks the diagonal structure of the covariance of the multipole moments a`m := hX, Y`m i of any stationary component. Maximum-likelihood estimation of the spectrum in the pixel domain has a numerical complexity that scales as 3 2 Npix and requires the storage of a Npix matrices. This is untractable for high resolution experiments such as WMAP or Planck (Npix ' 13.106 ). Nevertheless, for very low `’s (` ≤ 30), ML estimation in the pixel domain can be performed on downgraded resolution maps; see [5, 30], for instance. At higher `’s, a sub-optimal method based on the Pseudo-C` (PCL) gives quite satisfactory results in terms of complexity and accuracy [17]. It debiases the empirical or (pseudo) spectrum from the noise contribution and deconvolves it from the average mask ef-

2 fect. It works in the spherical harmonic domain, uses fast 3/2 spherical harmonic (SH) transforms and scales as Npix . The available pixels can be weighted according to the signal to noise ratio (SNR) at any given point. For signaldominated frequencies (low `’s), the data are uniformly weighted; it yields the Pseudo-C` estimator with uniform weights (PCLU). At noise-dominated frequencies (high `’s), each pixel is weighted by the inverse of the variance of the noise (PCLW estimator). The W-MAP collaboration used uniform weights for ` ≤ 500, the inverse of the noise variance for ` > 500 (see [14, Section 7.5]) for its three-year release. Efstathiou (2004) showed that the PCLW estimator is statistically equivalent to the ML estimator in the low SNR limit, which is usually the case at high `’s. In the same paper he proposed an hybrid method with a smooth transition between the two PCL regimes. Finally, when several maps are available, it is worth considering cross-power spectra between different channels since noise is usually uncorrelated from channel to channel (see [15, A1.1] or [25]). Other estimation procedures do not fit in any of the two categories above. Among them, the spectral estimation from time ordered data by [31] or Gibbs sampling and Monte Carlo Markov chain methods such as MAGIC or Commander, see [10]. Those last methods try to estimate the complete posterior joint probability distribution of the power spectrum through sampling, which in turn can provide point estimates of the spectrum but also covariance estimates, etc. Recently, the multi-taper approach has been imported from the time series literature to the field of spherical data by [6, 32]. The goal of this approach is to provide an estimation of a localized power spectrum, in a noiseless experiment. In this paper, we focus on spectral estimation of the global power spectrum, in a frequentist framework. We consider spectral estimation at small angular scales, i.e. in the range of multipoles where the cost of ML estimation is prohibitive. We compare our method to PCL methods. We adopt somehow realistic models that include partial coverage of the sky, symmetric beam convolution, inhomogeneous and uncorrelated additive pixel noise and multiple experiments. Localized analysis functions such as wavelets are natural tools to tackle non-stationarity and missing data issues. There are different ways to define wavelets (in the broad sense of space-frequency objects) on the sphere, and our choice is to use the needlets, the statistical properties of which have received a recent rigorous treatment ([1, 2, 3]) and which have already been applied successfully to cosmology ([8, 19, 24]). Needlets benefit from perfect (and freely adjustable) localization in the spherical harmonic domain, which enables their use for spectral estimation. Moreover, the correlation between needlet coefficients centered on two fixed directions of the sky vanishes as the scale goes to infinity, i.e. as the needlet concentrate around those points. The spatial localization is excellent. This property leads to several convergence results and motivates procedures

based on the approximation of decorrelation between coefficients. In this contribution, we define and study a new angular power spectrum estimator that uses the property of localization of the wavelets in both spatial and frequency domains. In the case of a single experiment with partial coverage and inhomogeneous noise, the needlet-based estimator deals straightforwardly with the variation of noise level over the sky, taking advantage of their localization in the pixel domain. Moreover, it allows a joint spectral estimation from multiple experiments with different coverages, different beams and different noise levels. The proposed method mixes observations from all experiments with spatially varying weights to take into account the local noise levels. The resulting spectral estimator somehow mimics the maximum likelihood estimator based on all the experiments. The paper is organized as follows. In Section I, we present the observation model and recall the basics of needlet analysis and the properties of the needlet coefficients which are the most relevant for spectral estimation. In Section II, we define the needlet spectral estimators (NSE) in the single-map and multiple-maps frameworks. In Section III, we present results of Monte Carlo experiments which demonstrate the effectiveness of our approach. In Section IV, we summarize the strong and weak points of our method and outline the remaining difficulties.

I. A.

FRAMEWORK Observation model

Let T denote the temperature anisotropy of the CMB emission. For the sake of simplicity, we consider the following observation model X(ξk ) = W (ξk ) ((B ∗ T )(ξk ) + σ(ξk )Zk ) , k = 1, . . . , Npix

(2)

where (ξk ) is a collection of pixels on the sphere, W denotes a (0-1)-mask or any weight function 0 ≤ W ≤ 1, B denotes the instrumental beam. An additive instrumental noise is modelled by the term σ(ξk )Zk with the assumption that (Zk ) is an independent standard Gaussian sequence. Further, we assume that σ, W and B are known deterministic functions and that B is axisymmetric. Typically, the variance map σ 2 writes σ 2 (ξk ) = σ02 /Nobs (ξk ), where Nobs is referred to as the hit map, that is, Nobs (ξk ) is the number of times a pixel in direction ξk is seen by the instrument. We assume that the observations have been cleaned from foreground emissions or that those emissions are present but negligible outside the masked region. When observations from several

3 experiments are jointly considered, the model becomes Xe (ξk ) = We (ξk ) (Be ∗ T (ξk ) + σe (ξk )Zk,e )) k = 1, . . . , Npix , e = 1, . . . , E

(3)

where e indexes the experiment. The CMB sky temperature T is the same for all experiments but the instrumental characteristics (beam, coverage) differ (see for example Table II), and the respective noises can usually be considered as independent.

Double arrows denotes as many operations (e.g. spherical transforms) as bands. The initial resolution must be fine enough to allow an exact computation of the Spher(j) ical Harmonics Transform up to degree `max . If, say, the HEALPix pixelization is used, the nside parameter of the original map determines the highest available multipole moments and, in turn, the highest available band j.

C. B.

Definition and implementation of a needlet analysis

We recall here the construction and practical computation of the needlet coefficients. Details can be found in Guilloux et al. (2007; see also Narcowich et al., 2006). Needlets are based in a decomposition of the spectral domain in bands or ‘scales’ which are traditionally index (j) by an integer j. Let b` be a collection of window functions in the multipole domain, with maximal frequen(j) cies `max (see Figure 1 below). Consider some pixeliza(j) (j) tion points ξk , k = 1, . . . , Npix , associated with positive (j)

(j)

weights λk , k = 1, . . . , Npix which enable exact discrete integration (quadrature) for spherical harmonics up to (j) degree 2`max , that is, equality

Distribution of the needlet coefficients

A square-integrable random process X on the sphere is said to be centered and stationary (or isotropic) if E(X(ξ))P = 0, E(X(ξ)2 ) < ∞ and E(X(ξ)X(ξ 0 )) = (4π)−1 ` C` L` (ξ · ξ 0 ), with C` referred to as the angular power spectrum of X. The next proposition summarizes the first and second order statistical properties of the needlet coefficients of such a process. They are the building blocks for any subsequent spectral analysis using needlets. Proposition 1 Suppose that X is a stationary and centered random field with power spectrum C` . Then the needlet coefficients are centered random variables and, for any 4-tuple (j, j 0 , k, k 0 ) (j)

(j 0 )

cov[γk , γk0 ] =

(j)

Y`m (ξ)dξ = S

X

(j) (j 0 )

b` b` C` L` (cos θ)

(6)

`≥0

Npix

Z

X

(j)

(j)

λk Y`m (ξk )

where θ = θ(j, k, j 0 , k 0 ) is the angular distance between (j) (j 0 ) ξk and ξk0 . In particular

k=1 (j)

holds for any `, m such that ` ≤ 2`max , |m| ≤ `. Needlets are the axisymmetric functions defined by (j) ψk (ξ)

`(j) q max X (j) (j) (j) = λk b` L` (ξ · ξk ),

C (j) := (4π)−1

where L` denote the Legendre polynomial of order ` normalized according to the condition L` (1) = 2`+1 4π . For (j) proper n o choices of window functions {b` }j , the family (j)

is a frame on the Hilbert space of square-

k,j

integrable functions on the sphere S. In a B-adic scheme, it is even a tight frame [22]. Though redundant, tight frames are complete sets which have many properties reminiscent of orthonormal bases (see e.g. [7], chap.3). (j) For any field X on the sphere, the coefficients γk := (j) (j) (λk )−1/2 hX, ψk i are easily computed in the spherical harmonic domain as made explicit by the following diagram {X(ξk )}k=1,...,Npix n o (j) γk )

(j)

k=1,...,Npix

SHT

X

(j)

b`

2

(2` + 1)C` .

(8)

`≥0 (j)

In other words, the variance of the coefficients γk is the power spectrum of X properly integrated over the j-th band. Remark 1 It also follows from (6) that if the bands j and j 0 are non-overlapping (this is the case for any nonconsecutive filters of Figure 1), all the pairs of needlet (j) (j 0 ) coefficients γk and γk0 are uncorrelated and then independent if the field is moreover Gaussian. Suppose now that X(ξk ) = σ(ξk )Zk , k = 1, · · · , Npix ,

−→

a`m ⇓×

SHT−1

(j) b` a`m

⇐=

(7)

where (4)

`=0

ψk

(j)

var[γk ] = C (j) .

(5)

is a collection of independent random variables with zero mean and variance σ 2 (ξk ), where σ is a band-limited function. This is a convenient and widely used model

4 for residual instrumental noise (uncorrelated, but nonstationary). Needlet coefficients are centered and, moreover, if the quadrature weights are approximately uniform (λk ' 4π/Npix , as is the case of HEALPix) and σ is sufficiently smooth, then Z (j) (j 0 ) (j) (j 0 ) cov[γk , γk0 ] ' σ 2 (ξ)ψk (ξ)ψk0 (ξ)dξ.

In the following, the beam effect for spectral estimation is taken into account in each band. Indeed, with Definition (11), with no noise, no mask and a smooth beam, Eq. (7) translates to " (j) # γk (j) var ' C (j) , k = 1, . . . , Npix . (12) B (j)

S

We denote Z

(j)

nk (σ) :=

(j)

σ 2 (ξ)|ψk (ξ)|2 dξ

1/2 .

(9)

S

the standard deviation of the needlet coefficient of scale j centered on ξk . When the noise is homogeneous (σ is 2 2 P (j) constant), it reduces to Nσpix `≥0 (2` + 1) b` . D.

Mask and beam effects

As already noticed in the Introduction, missing or masked data makes the angular power spectrum estimation a non trivial task. Simple operations in Fourier space such as debeaming become tricky. Needlets are also affected by both the mask and the beam. The effect on needlets of beam and mask can be approximated as described below. These approximations, which lead to simple implementations, are validated in numerical simulations in relatively realistic conditions in Section III. a. Mask. Recalling that the needlets are spatially localized, the needlet coefficients are expected to be insensitive to the application of a mask on the data if they are computed far away from its edges. Numerical and theoretical studies of this property can be found in [2] and [13]. In practice, we choose to quantify the effect of (j) the mask on a single coefficient γk by the loss induced (j) on the L2 -norm of the needlet ψk , i.e. a purely geometrical criterion. More specifically, needlet coefficients at scale j are deemed reliable (at level t(j) ) if they belong to the set ( ) (j) kW ψk k22 (j) (j) (j) Kt(j) := k = 1, . . . , Npix : ≥t (10) (j) kψk k22 Parameter t(j) is typically set to 0.99 or 0.95 for all bands. (j) (j) Note that t(j) 7→ Kt is decreasing, K0 = K (j) and (j) K1+ = ∅. In practice, this set is computed by thresholding the map obtained by the convolution of the mask with P 2 (j) the axisymmetric kernel ξ 7→ b L (cos θ) . This ` ` ` operation is easy to implement in the multipole domain. b. Beam. Consider now the effect of the instrumental beam. Its transfer function B` is assumed smooth enough that it can be approximated in the band j by its mean value B (j) in this band, defined by X (j) (B (j) )2 := (4π)−1 (2` + 1)(b` )2 B`2 . (11) `≥0

In other words, thanks to the relative narrowness of the bands and to the smoothness of the beam and CMB spectrum, the attenuation induced by the beam can be approximated as acting uniformly in each band and not on individual multipoles. Numerically, with typical beam values from WMAP or ACBAR experiments (see Table II), the relative difference (statistical bias) between the goal quantity C (j) and the estimated (j) one var(γk )/(B (j) )2 remains under 1% for bands below j = 27 (`max = 875) for WMAP-W, and below j = 39, `max = 2000 for ACBAR.

II.

THE NEEDLET SPECTRAL ESTIMATORS (NSE)

A.

Smooth spectral estimates from a single map (j)

For any sequence of weights wk such that (j) PNpix (j) = 1 and for a clean (contamination-free), k=1 wk complete (full-sky) and non-convolved (beam-free) observation of the CMB, the quantity (j)

Npix

b (j) := C

X

(j)

wk

(j)

γk

2

k=1

is an unbiased estimate of C (j) , a direct consequence of Proposition 1. Remark 2 For uniform weights, this estimator is nothb` from Eq. (1) binned by the wining but the estimator C (j) 2 dow function (b` ) . Indeed, (see diagram (5)) b (j) =(4π)−1 C

` X (j) X (b` )2 a2`m `≥0

=(4π)−1

(13)

m=−`

X (j) b` (b` )2 (2` + 1)C

(14)

`≥0

This is the uniformly minimum variance unbiased estimator of C (j) . The so-called cosmic variance is the Cram´er-Rao lower bound for estimation of the parameter C` in the full-sky, noise-free case. Its expression simply is 2C`2 /(2` + 1). Its counterpart for the binned estimator C (j) in this ideal context is X (j) (j) Vcosmic = 2(4π)−2 (b` )4 (2` + 1)C`2 (15)

5 Consider now the observation model (2). Up to the approximations of Section I D, one finds that X (j) (j) 2 (j) 2 1 b (j) := wk γk − nk . (16) C 2 B (j) (j) k∈Kt

j

is an unbiased estimate of C (j) as soon as X (j) wk = 1.

(17)

(j) j

k∈Kt

The weights can further be chosen to minimize the mean 2 b (j) − C (j) , under the constraint (17). square error E C It amounts to setting the weights according to the local signal-to-noise ratio, which is non constant for non stationary noise. This is a distinctive advantage of our method that it allows for such a weighting in a straightforward and natural manner. In the case of uncorrelated coefficients, this optimization problem is easy to solve (using Lagrange multipliers) and is equivalent to maximizing the likelihood under the approximation of independent coefficients (see Appendix C for details). It leads to the solution

(j)

in the needlet domain, for indexes k ∈ Ke,tj , where Zk,e are standard Gaussian random variables which are correlated within the same experiment e but independent between experiments. As explained in the single experiment case of Section II A, the coefficients are only slightly correlated. This justifies the use, in the angular power spectrum estimator, of the weights derived by the maximization of the likelihood with independent variables. The correlation between coefficients does not introduce any bias here but only causes loss of efficiency. As in the single-experiment case, only the coefficients sufficiently far away from the mask of at least one experiment are kept. Defining (j)

K(j) = ∪e Ke,tj . the aggregated estimator is implicitly defined (see Appendix C) by 2 2 X (j) (j) (j) M L,(j) M L,(j) b b C = w ˜k C − n γ˜k ˜k k∈K(j)

(20) with, for any k in K(j) (j)

(j)

wk (C) 2 −2 h X (j) := C + nk

(j)

γ˜k =

(j)

C + nk 0

2

−2

i−1

(18) (j)

n ˜k

with C = C (j) . This is the unknown quantity to be estimated but it can be replaced by some preliminary estimate (for example the spectral estimate of [14]). One can also iterate the estimation procedure from any starting point. The robustness of this method with respect to the prior spectrum is demonstrated at Section III A 2 (see Figure 6). Those weights are derived under the simplifying assumption of independence of needlet coefficients. They can be used in practice because needlet coefficients are only weakly dependent. Precisely, for two fixed points on a increasingly fine grid ξk , ξk0 and well-chosen window (j) (j) functions, the needlets coefficients (γk , γk0 ) are asymptotically independent as j → ∞ (see [2]). Note that this property is shared by well-known Mexican Hat wavelets, as proved in [21]. Smooth spectral estimates from multiple experiments

Consider now the observation model described by Eq. (3) with noise independent between experiments. Using the approximations of Section I D, Eq. (3) translates to (j)

γk,e (X) (j)

Be

(j)

(j)

= γk (T ) +

nk (σe ) (j)

Be

Zk,e

(j)

ωk,e

e

(j) k0 ∈Kt j

B.

X

(19)

X :=

(j) Be (j) nk (σe )

e

:=

Be (j)

nk (σe )

(21)

(j)

Be

−1/2

!2

(22)

1k∈K(j) e,tj

!2

(j)

(j) ωk,e

γk,e

1k∈K(j)

e,tj

(j)

n ˜k

2

(23)

and similarly to (18), (j)

w ˜k (C) −1 2 −2 2 −2 X (j) (j) := C + n ˜k C+ n ˜ k0 k0 ∈K(j)

(24) P (j) P (j) Note that e ωk,e = 1 and k w ˜k = 1. An explicit estimator is obtained by plugging some previous, possibly (j) rough, estimate C of C (j) in place of C of Eq. (24). Eventually, the aggregated angular power spectrum estimator is taken as X (j) (j) (j) 2 (j) 2 (j) b C = w ˜k C γ˜k − (˜ nk ) . (25) k∈K(j)

This expression can be interpreted in the following way. For any pixel k in K(j) , that is for any pixel where the needlet coefficient is reasonably uncontaminated by the mask for at least one experiment, compute an aggregated (j) needlet coefficient γ˜k by the convex combination (21) of

6 the debeamed needlet coefficients from all available experiments. Weights of the combination are computed according to the relative local signal to noise ratio (including the beam attenuation). Finally, a spectral estimation is performed on the single map of aggregated coefficients, in the same way as in Section II A. Those co(j) efficients are squared and translated by n ˜ k to provide an unbiased estimate of C (j) . Then all the available squared and debiased coefficients are linearly combined according (j) to their relative reliability w ˜k (C) which is proportional (j) to (C (j) + (˜ nk )2 )−1 . Figures 13 and 14 display the val(j) (j) ues of those weights (maps) w ˜k and ωk,e for a particular mixing of experiments. See Section III B for details.

C.

Parameters of the method

In this section, we discuss various issues raised by the choice of the parameters of the NSE method. Those parameters are: the shape of the spectral window func(j) tion b` in each band (or equivalently the shape of the needlet itself in the spatial domain), the bands themselves (i.e. the spectral support of each needlet) and the values of the thresholds tj that define the regions of the sky where needlets coefficients are trusted in each band; see Eq. (10). See Section III A 2 for a numerical investigation.

1.

(1960) for the PSWF in R and [13, 28] for PSWF on the sphere. In our simulations, we use PSWF needlets since they are well localized and easy to compute. Other criteria and needlets can be investigated and optimized, at least numerically; see [13] for details. The choice of the optimal window function in a given band is a non trivial problem which involves the spectrum itself, the characteristics of the noise and the geometry of the mask. Even if we restrict to PSWF as we do here, it is not clear how to choose the optimal opening θ(j) for each band j. We can use several rules of thumb based on approximate scaling relation between roughly B-adic bands and openings θ(j) that preserve some Heisenberg product or Shannon number. Figure 1 represents three families of PSWF needlets that are numerically compared below. Their spatial concentration is illustrated by Figure 2.

Width and shape of the window functions

For spectral estimation, it is advisable to consider spectral window functions with relatively narrow spectral support, in order to reduce bias in the spectral estimation. The span of the summation in (4) can be fixed to (j) (j) some interval [`min , `max ]. For our illustrations, the interval bands have been chosen to cover the range of available multipoles with more bands around the expected positions of the peaks of the CMB. The bands are described in Table I. It is well known however that perfect spectral and spatial localization cannot be achieved simultaneously (call it the uncertainty principle). In order to reduce the effect of the mask, we have to check that the analysis kernels are well localized. This leads to the optimization of some localization criteria. If we retain the best L2 concentration in a polar cap Ωθ(j) = {ξ : θ ≤ θ(j) }, namely (j)

(b` )`=`min ,··· ,`max 2 P`(j) max(j) b(j) L` (ξ) dξ Ωθ(j) `=`min ` = arg max 2 R P`(j) b (j) max (j) b` L` (ξ) dξ S

2.

The choice of the needlets coefficients (mask)

Practically we want to keep as much information (i.e. as many needlet coefficients) as possible, and to minimize the effect of the mask. Using all the needlet coefficients regardless of the mask would lead to a biased estimate of the spectrum. It is still true if we keep all the coefficients outside but still close to the mask, keeping in mind that the needlets are not perfectly localized. On the other hand, getting rid of unreliable coefficients reduces the bias, but increases the variance. This classical trade-off is taken by choosing the threshold level t(j) in the Definition (10) of the excluding zones. For multiple experiments, a different selection rule can be applied to each experiment, according to the geometry of the mask and the characteristics of the beam and the noise.

III.

MONTE CARLO STUDIES

Recall that NSE spectral estimators are designed based on three approximations: • one can neglect the impact of the mask on the needlet coefficients which are centered far enough from its edges; • one can neglect the variations of the beam and the CMB power spectrum over each band. • the weights, which are optimal under the simplifying assumption of independent needlet coefficients, still provide good estimates for the truly weakly correlated needlet coefficients.

R

(26)

`=`min

we obtain the analogous of prolate spheroidal wave function (PSWF) thoroughly studied in e.g. Slepian & Pollak

We carry out Monte Carlo studies to investigate, first the actual performance of the method on realistic data, and second the sensitivity of the method with respect to its parameters. Stochastic convergence results under appropriate conditions is established in a companion paper [11].

7

Prolate 1

Prolate 3

650

0

200

400 Multipole

750

850

600

650

800

0

Prolate 2

200

400 Multipole

600

850

800

Top Hat

650

0

200

750

400 Multipole

750

850

600

650

800

0

200

750

400 Multipole

600

850

800

0.4

25

FIG. 1: Four families of window functions that are used for the NSE and compared numerically in Section III A 2. There are three families of prolate spheroidal wave functions and one family of top-hat functions. All the families are defined on the same bands. Inset graphs show the window function in the 26th band. Each window function is normalized by the relation P (j) (4π)−1 (b` )2 (2` + 1) = 1. Then, if the angular power spectrum is flat, C` ≡ C0 , then C (j) ≡ C0 for all bands, according to (8).

0.0 −0.6

ψ(cos(θ)) 5 10 15

20

Prolate 1 Prolate 2 Prolate 3 Top Hat

0.26

0.27

0.28

0.29

0.30

−10

−5

0

0.25

0.00

0.05

0.10

0.15 θ

0.20

0.25

0.30

FIG. 2: Angular profile of the four needlets associated to the window P functions at the 26th band (701 ≤ ` ≤ 800) from the four families of Figure 1. We have plotted the axisymmetric profile ` b` L` (cos θ) as a function of θ. Needlets with “smoother” associated window profile (such as Prolate 3) need more room to get well localized, but are less bouncing than needlets with abrupt window function (such as top hat or Prolate 1)

8 Band (j) 1 (j) `min (j) `max

.

2

3

4

5 ···

20

21

22

23

24

25

26

27

28

29

···

35

36

37

38

39

2 11 21 31 41 401 451 501 551 601 651 701 751 801 876 1326 1426 1501 1626 1751 ··· ··· 20 30 40 50 60 500 550 600 650 700 750 800 875 950 1025 1475 1625 1750 1875 2000

nside(j) 16 16 32 32 32 · · · 256 512 512 512 512 512 512 512 512 1024 · · · 1024 1024 1024 1024 1024 (j)

θ0

69 50 41 36 32 · · · 10.7 10.2 9.7 9.3 8.9 8.6 8.3 8.0 7.7

7.4

···

6.1

5.8

5.6

5.4

5.2

(j)

TABLE I: Spectral bands used for the needlet decomposition in this analysis. Depending on `max , the needlet coefficient maps are computed using the HEALPix package, at different values of nside, given in the fourth line. The number K (j) of needlet coefficients in band j is then 12(nside(j) )2 . It is a kind of decimated implementation of the needlet transform. The last line gives the opening θ(j) (in degrees) chosen in Eq. 26 to define the PSWF from the Prolate 2 family (see Section III A 2 for details).

A.

Single map with a mask and inhomogeneous noise

In this section, we first consider model (2). According to Eqs (12) and (16), any beam can be taken into account easily in the procedure. Without loss of generality, we suppose here that there is no beam (or B is the Dirac function). The case of different beams is addressed in Section III B, see Table II. The key elements for this numerical experiment are illustrated by Fig. 3. We simulate CMB from the spectrum C` given by the ΛCDM model that best fits the W-MAP data. We use a Kp0 cut [4] for the mask and we take a simple non homogeneous noise standard deviation map (the SNR per pixel is 1.5 in two small circular patches and 0.4 elsewhere). Using a mean-square error criterion, we first study the dependence of NSE performance on its free parameters. Then we compare NSE with methods based on Spherical Harmonic coefficients, known as pseudo-C` estimation and followed in Hinshaw et al. (2006)). For the reader’s convenience, the PCL procedure is summarized in Appendix A.

1.

Mean-square error

b (j) of We shall measure the quality of any estimator C C by its mean-square error (j)

2 b (j) ) = E C b (j) − C (j) MSE(C . This expectation is estimated using 400 Monte Carlo replications. Roughly speaking, the MSE decomposes as an average estimation error and a sampling variance. The estimation error term is intrinsic to the method. Ideally, it should be used to compare the relative efficiency of concurrent approaches. The sampling variance term is the so-called cosmic variance. It is given by the characteristic of the spectrum and coming from the fact that we only have one CMB sky, and thus 2` + 1 a`m ’s to estimate one C` . It is increased by the negative influence of the noise and the mask. This gives an error term

intrinsic to the whole experiment. When the sky is partially observed (let fsky denote the fraction of available sky) and for high `’s (or j’s), the cosmic variance must be divided by a factor fsky leading to the following approximate Cram´er-Rao lower bound at high frequencies (j)

(j)

−1 Vsample = fsky Vcosmic .

(27)

Including an homogeneous additive uncorrelated pixel noise with variance σ 2 , the sample variance writes −1 2fsky

X

(j) (b` )4 (2`

4π 2 + 1) C` + σ Npix

2

In a non-homogeneous context, no close expression for the sampling variance is available: Eq. (27) will serve as one reference. When comparing different window functions in the same band, it must be kept in mind that different estimators do not estimate the same C (j) so that the sampling variances are not the same. In this case, we use the following normalized MSE b (j) ) MSE(C (j)

−1 fsky Vcosmic

2.

(28)

Robustness with respect to parameter choice

This section looks into the robustness of NSE with respect to its free parameters. First and as expected, the spectral estimation is very sensitive to the choice of the window functions. Even if we restrict to the PSWF, one has the freedom to choose a concentration radius θ(j) for each band. We compare the mean-square error of the estimation for various choices of θ(j) that lead to three of the window function families displayed in Figure 1. The second prolate family is obtained using the “rule of thumb” relation (j) (j) θ(j) = 2((`min + `max )/2)−1/2 . The values of those opening angles are in Table I. The first and third sequences of opening angles are the same with a multiplicative factor of 0.5 and 2, respectively. For the sake of comparison we also consider the top-hat window functions. Figure 4 shows the normalized MSE for those four “families” of

1e+02

9

1e−02

theor. Cl CMB empir. Cl noise

0

(a)Mask

(b)Noise standard deviation

(c)One simulated input map

500

1000

1500

(d)Spectra

Relative difference +0 % +5 % +10 % +15 % +20 %

FIG. 3: Simplified model of partially covered sky and inhomogeneous additive noise. This model is used to compare numerically the NSE estimator with PCL estimators and to assess the robustness or the sensitivity of the method with respect to its parameters. The mask is kp0. In CMB µK units, the standard deviation of the uncorrelated pixel noise is 75 in the two small circular patches and 300 elsewhere.

0

200

400

600 Multipole

800

1000

1200

FIG. 4: Comparison of the normalized MSE (28) of the needlet spectral estimators for the four families of spectral window functions displayed in Figure 1. The smoothness of the window function make the MSE smaller at high multipoles. At low multipoles, taking a too smooth function makes the needlet less localized and there is a loss of variance due to the smaller number K(j) of needlet coefficients that are combined.

needlets as a function of the band index. Notice the poor behavior of a non-optimized window function and the far better performance of the second prolate family in comparison with top-hat and Prolate 1 windows. Thus, in the following, we use this particular needlet family to study the sensitivity of the method with respect to the other parameters, and to compare NSE and PCL estimators. Next, for the second family of PSWF, we compare the influence of the threshold value t(j) ≡ t for t = 0.9, 0.95 or 0.98. Figure 5 shows that this choice within reasonable values is not decisive in the results of the estimation procedure. For very low frequencies (` ≤ 100), the conservative choice t = 0.98 increases the variance since many needlets are contaminated by the mask and discarded. Qualitatively, in such variance dominated regimes, taking more coefficients (e.g. t = 0.9) is adequate. However, we

−15 % −10 %

1

−5 %

5

Normalized MSE 10 50

500

Prolate 1 Prolate 2 Prolate 3 Top Hat

t=0.90 t=0.98 0

200

400

600 Multipole

800

1000

1200

FIG. 5: Relative difference between the normalized MSE (28) of the needlet spectral estimation using thresholds 0.9 and 0.98, and the same with threshold 0.95. It highlights the fact that the estimation is not very sensitive to the value of this parameter, except at low `’s, where we do not advocate the use of the NSE. The window function family is “Prolate 2” from Figure 1.

do not advocate the use of the NSE at low `’s where exact maximum likelihood estimation is doable. At higher `’s, there is roughly no difference between the t = 0.95 and t = 0.98 thresholds. Finally, we check the robustness of the method against an imprecise initial spectrum. We take 0.9C (j) and 1.1C (j) as initial values C¯ (j) and compare the results with the best possible initial value which is C (j) itself. The relative difference between the results, displayed in Figure 6, does not exceed 1%.

10 +1 %

have chosen only overlap with their left and right nearest neighbours. The mask induces a spectral leakage, which is however reduced for the smoothest window function. This leakage is however compensated for by the selection (j) of coefficients in Kt(j) (see Eq (10)).

Relative difference +0 % +0.5 %

0.9C(j) 1.1C(j)

−0.5 %

B.

0

200

400

600 Multipole

800

1000

1200

FIG. 6: Robustness of the NSE with respect to the ini¯ (j) given to the weights formula (18). We have tial value C ¯ (j) = 0.9C (j) and performed the whole estimation with C ¯ (j) = 1.1C (j) . This plot shows the relative difference beC tween the normalized MSE under those initial values and the ¯ (j) = C (j) . normalized MSE under the optimal initial value C

3.

Pseudo-C` versus needlet spectral estimator

We compare the NSE estimator given by Eq. (16) with estimation based on the spherical Harmonic coefficients of the uniformly weighted map and of the 1/σ 2 -weighted map. The result is displayed in Fig. 7. As expected, at low multipoles, where the SNR is higher, the uniform pseudo-C` estimator performs better than the weighted pseudo-C` estimator and, conversely, at high multipoles where the SNR is lower. [9] proved that the equal-weights pseudo-C` estimator is asymptotically Fisher-efficient when ` goes to infinity. The behaviour of the needlet estimator is excellent: its performance is comparable to the best of the two previous methods both at low and high multipoles. Thus, there is no need to choose arbitrary boundaries between frequencies for switching for one weighting to the other. The NSE estimator automatically implements a smooth transition between the two regimes and it does so quasioptimally according to noise and mask characteristics. At low `’s one should optimize the window function (the characteristic angle of opening of the prolates) to broaden the range of optimality of the NSE. Providing a C` estimate with error bars is often not sufficient. Estimate the covariance matrix of the whole vector of spectral estimates is necessary for full error propagation towards, say, estimates of cosmological parameters. Figure 8 shows the values of the correlation matrix between the spectral estimates. In the idealistic case of a full sky noiseless experiment, the theoretical correlation matrix is tridiagonal because window functions we

Aggregation of multiple experiments

Historically in CMB anisotropy observations, no single instrument provides the best measurement everywhere on the sky, and for all possible scales. In the early 90’s, the largest scales have been observed first by COBE-DMR, complemented by many ground-based and balloon-borne measurements at higher `. Similarly, ten years later, WMAP full sky observations on large and intermediate scales have been complemented by small scale, local observations of the sky as those of Boomerang, Maxima, ACBAR or VSA. The joint exploitation of such observations has been so far very basic. The best power spectrum is obtained by choosing, for each scale, the best measurement available, and discarding the others. One could, alternatively, average the measurements in some way, but the handling of errors is complicated in cases where a fraction of the sky is observed in common by more than one experiment. Clearly, the data is best used if some method is devised that allows combining such complementary observations in an optimal way. In this section we present the results of a Monte Carlo study to illustrate the benefits of our method of aggregated spectral estimator. We simulate observations following the model (3), with E = 6 observed maps : 3 Kp0-masked maps with beams and noise-level maps according to W-MAP experiment in bands Q, V and W respectively ; 3 maps with uniform noise, observed in patches the size of which are equivalent to BOOMERanG-Shallow, BOOMERanG-Deep and ACBAR observations respectively, and noise levels representative of the sensitivities of those experiments. Table II gives the key features of these experiments. Further details can be found in [4] and [16] for W-MAP, [20] for BOOMERanG, [27] and [26] for ACBAR. However, we do not intend to produce fully-realistic simulations. Basically, no foregrounds are included in simulations (neither diffuse nor point sources) ; for ACBAR only the 3 sky fields of year 2002 are used ; and for W-MAP only one detector is used for each band. Key elements for this numerical experiment are illustrated in Fig. 9-10-11, where we have displayed respectively one random outcome of each experiment according to those simplified models, the maps of local noise levels and power spectra of the CMB and of the experiment’s noise. (26) Fig. 13 displays the maps of the weights ωk,e (the 26th band is the multipole range 700 < ` ≤ 800). According to Eq. (23), all those weights belong to [0,1] and for any fixed position, the sum of the weights over the six experiments is equal to one. Red regions indicates

500

11

2

5

10

MSE 20

50

100

200

Prolate 2 PCLU PCLW

0

200

400

600 Multipole

800

1000

1200

FIG. 7: Comparison of the relative MSE (28) of the two PCL estimators (PCLU for flat weights, PCLW for inverse variance weights) with the NSE, for Prolate family 2. For 300 ≤ ` ≤ 1200, the NSE is uniformly better than the best of the two PCL methods. It should be noted that the NSE may be improved again by optimizing the window profiles and the thresholds t(j) (e.g. by taking a lower threshold for low bands to reduce the variance, see Figure 5).

1.0

1.0

30

30 0.8

25

20

0.6

Bands

Bands

20

15

0.8

25

0.6

15

0.4 10

0.4 10

0.2 5

0.2 5

0.0 5

10

15

20

25

30

0.0 5

10

15

20

25

30

b (1) , . . . , C b (32) ), which entries are defined by Eq. (16). It has been FIG. 8: Absolute value of the correlation matrix of vector (C estimated in the context of Fig. 3 using two different families of window functions and 400 Monte Carlo replicates. This shows the difference between a family of PSWF (left panel) and a family of top-hat windows (right panel).

needlet coefficients which are far better observed in an experiment that in all others. Blue, light blue and orange region are increasing but moderately low weights, showing that outside the small patches of BOOMERanG and ACBAR, most information on band 26 is provided by the channel W of WMAP. On the patches, needlet coefficients from W-MAP are numerically neglected in the combination (21). The debiased, squared, aggregated coefficients 2 2 (26) (26) γ˜k − n ˜k are displayed on the left map from Fig. 14. All those coefficients are approximately (26) unbiased estimators of C (26) . The map of weights w ˜k is displayed on the right of Figure 14. More weight is given to regions which are covered by lower noise experiments. The final estimate is obtained by averaging

the pixelwise multiplication of these two maps. Figure 15 shows the benefit of the aggregation of different experiments, in comparison with separate estimations. In CMB literature, error bars from different experiments are usually plotted on a same graph with different colors. For easier reading, we plot the output of single experiment NSE in separate panels (a,b and c). Panel (d) shows the output of the aggregated NSE, which improve the best single experiment uniformly over the frequency range, thanks to the locally adaptive combination of informations from all expermiments. Figure 12 highlights the cross-correlation between single experiment estimators and the final aggregated estimator. It provides a complementary insight on the relative weight of each experiment in the spectral domain. The W-MAP-like measures are decisive for lower bands,

12

(a)“True” CMB

(b)WMAP-Q

(c)WMAP-V

(d)WMAP-W

(e)BOOMERanG-S

(f)BOOMERanG-D

(g)ACBAR

FIG. 9: Simulated observations from model (3) for the 6 experiments described in Table II, in a small patch around point (-40,-90). The approximate size of the patch is 38×38 degrees.

(a)WMAP-Q

(b)WMAP-V

(c)WMAP-W

(d)BOOMERanG-S

(e)BOOMERanG-D

(f)ACBAR

FIG. 10: Coverage and local pixel noise levels of the six simple experiments described in Table II.

whereas BOOMERanG and ACBAR ones give estimators very much correlated to the aggregated one at higher bands. The aggregated NSE is eventually almost identical to the estimator obtained from ACBAR alone.

IV.

DISCUSSION Complexity

According to (5), the calculation of all the needlet coefficients takes one SHT and jmax inverse SHT, where (j) (j) jmax is the number of bands. The weights wk , w ˜k and (j) ωk,e are obtained using simple operations on maps, so that the overall cost of the (aggregated) NSE scales as 3/2 Npix operations. This is comparable to the cost of the

13 fsky

Given by the hit map

78.57 %

1024

17.5 µK 5.2 µK 14.5 µK

2.80 % 0.65 % 1.62 %

5’

2048a

0.8

10’

a We used nside=1024 for our Monte Carlo simulations, as going to `max ' 2000 is enough to discuss all the features of our method.

1e+02

TABLE II: Main parameters of the experiments to be aggregated. The beams are given in minutes of arc, nside refers to HEALPix resolution of the simulated maps, noise level is either a map computed from a hitmap and an overall noise level, or a uniform noise level per pixel (in µK CMB). Numbers quoted here are indicative of the typical characteristics of observations as those of W-MAP, BOOMERanG and ACBAR, and are used for illustrative purposes only.

1e−01

WMAP−Q WMAP−V WMAP−W BOOM−S BOOM−D ACBAR

Correlation 0.4 0.6

512

0.2

31’ 21’ 13’

ACBAR WMAP BOOMERanG

0.0

W-MAP Q W-MAP V W-MAP W BOOM S BOOM D ACBAR

Noise level

1.0

Experiment Beam nside

0

500

1500

FIG. 12: Correlation between the aggregated estimator and single experiments estimators. This provides insight on the contribution of each experiment into the final aggregated single spectral estimate.

Thus, an unbiased spectral estimator is given by (j) bcross C =

(j)

X

wk

X

(j)

Be(j) Be0

−1

(j) (j)

γk,e γk,e0

(29)

e6=e0

k∈K(j) 1e−04

1000 Multipole

1e−07

(j)

0

500

1000

1500

Multipole

FIG. 11: Spectra of the beamed CMB (with the BOOMERanG lines overplotted) and noise levels (horizontal lines) seen by the six experiments, as if they were full sky (the fsky effect is not taken into account).

PCL methods.

where the weights wk depend on a preliminary estimate of the spectrum and a possibly imprecise estimate of the local and aggregated noise levels that enter in the vari(j) (j) ance of γk,e γk,e0 . This has not been investigated numerically yet but we can conjecture the qualitative results of this approach: more robustness with respect to noise misspecification but greater error bars than the NSE with perfectly known noise levels. Moreover, adapting the procedure described in [25], one can test for noise misspecification, and for the correct removal of the noise b (j) and by considering the difference between the NSE C (j) bcross . cross-spectrum NSE C V.

Sensitivity to the noise knowledge

To be unbiased, the above described estimators require a perfect knowledge of the noise characteristics, as do Pseudo-C` estimators. In both cases, the uncertainty on the noise can be tackled using cross-spectrum, that removes the noise on the average provided that the noises from each experiment are independent. Indeed, for any pixel k far enough from the masks of experiment e and e0 , e 6= e0 , we have (j) (j)

(j)

E[γk,e γk,e0 ] = Be(j) Be0 C (j) .

CONCLUSION

We have presented some potentialities of the needlets on the sphere for the angular power spectrum estimation. This tool is versatile and allows to treat consistently the estimation from a single map or from multiple maps. There remains many ways of improving or modify the method described in Section II. In the future, it is likely that again complementary data sets will co-exist. This is the case, in particular, for polarisation, for which Planck will measure the large scale CMB power on large scales with moderate sensitivity, while ground-based experiments will measure very accurately polarisation on smaller scales. Extensions to

14

(26)

FIG. 13: Method for aggregating experiments: Weights ωk,e for combining the needlet coefficients from the 26th band (700 < ` ≤ 800) and the six experiments. From left to right and top to bottom: W-MAP-Q, W-MAP-V, W-MAP-W, BOOMERanG-S, BOOMERang-D and ACBAR.

FIG. 14: Method for aggregating experiments: On the left: map of debiased squares of aggregated needlet coefficients, in the (j) 26th band (700 < ` ≤ 800). On the right: map of the weights wk affected to those coefficients to estimate the power spectrum.

polarisation of the approach presented hers will likely be important for the best exploitation of such observations.

Acknowledgments

by the Astro-Map and Cosmostat ACI grants of the French ministry of research, for the development of innovative CMB data analysis methods. The results in this paper have been derived using the HEALPix package [12]. Our pipeline is mostly implemented in octave (www.octave.org).

The ADAMIS team at APC has been partly supported APPENDIX A: PSEUDO-C` ESTIMATORS

Let T be a stationary process with power spectrum (C` )`≥0 , W an arbitrary weight function (or mask) and e` (W) = C

` X 1 |hY`m , WT i|2 2` + 1 m=−`

the so-called pseudo-power spectrum of T with mask W. The ensemble-average of this quantity is related to the true power spectrum by the formula e` ) = M``0 (W)C`0 E(C where M``0 (W) is the doubly-infinite coupling matrix associated with W, see [17, 23]. If U is a unit variance white pixel noise, denote by V` ≡ 4πσ 2 /Npix its “spectrum” (see Appendix B). Consider now the model X = W1 T + W2 U .

15

3-sigma error bars

3-sigma error bars

6000

6000 true SpecFusion

true SpecFusion

5000

5000

4000

4000

3000

3000

2000

2000

1000

1000

0

0 0

500

1000

1500

2000

0

500

(a)ACBAR

1000

1500

2000

(b)BOOMERanG

3-sigma error bars

3-sigma error bars

6000

6000 true SpecFusion

true SpecFusion

5000

5000

4000

4000

3000

3000

2000

2000

1000

1000

0

0 0

500

1000

1500

2000

0

500

(c)WMAP

1000

1500

2000

(d)All aggregated

FIG. 15: Results for the aggregated NSE. Error bars are estimated by 100 Monte Carlo simulations. The ACBAR power spectrum is computed using the single map needlet estimator described in Section II A, whereas the BOOMERanG and WMAP spectra are obtained using the aggregation of needlets coefficients from the two (BOOM-S and BOOM-D) and three (W-MAP Q,V,W) maps respectively. The final spectrum (d) is obtained by aggregating all available needlet coefficients from the six maps.

Then, if M``0 (W1 ) is full-rank,

n o e` (W1 ) − M(W2 )``0 V`0 (M``0 (W1 ))−1 C

is an unbiased estimator of C`0 . It is obtained by deconvolving and debiasing the empirical spectrum. The observation model (2) with no beam coincides with the preceding framework with W1 = W and W2 = σW . This leads to the uniform-weights pseudo-C` estimator (PCLU). One can also divide all the observations by σ 2 , yielding to a similar scheme with W1 = σ −2 W and W2 = σ −1 W . This is the variance-weighted pseudo-C` estimator (PCLW). Both are used by the W-MAP collaboration [14]. The uniform weights lead to better estimates in the high SNR regime (low `’s) whereas the flat weights perform better at low SNR (high `’s). [9] showed that the Pseudo-C` estimator is statistically equivalent to the maximum likelihood estimator asymptotically as ` goes to infinity. He also proposed an implementation of a smooth transition between those two regimes.

16

1e−04

5e−04

Normalized MSE 5e−03

5e−02

Aggregated ACBAR WMAP BOOMERanG

0

500

1000 Multipole

1500

FIG. 16: Mean-square error of the three single expermiment NSE estimators and of the aggregated NSE estimator. The 2-sigmas error bars reflect the imprecision in the Monte Carlo estimation of the MSE of the aggregated NSE. Up to those uncertainties, the aggregated estimator is uniformly better than the best of all experiments. The improvement is decisive in ˆ (j) − C (j) )2 /(C (j) )2 . “crossing” regions, where two expermiments perform comparably. The normalized MSE here is E(C

APPENDIX B: WHAT “NOISE SPECTRUM” MEANS

Let ν denote the noise. It is defined on pixels and supposed centered, Gaussian, independent from pixel to pixel, and of variance σ 2 (ξ), i.e. νk = σ(ξk )Uk , k = 1, . . . Npix , P with U1 , . . . , UNpix ∼ N (0, 1). Define ν`m := k λk νk Y`m (ξk ), and call them (abusively) the “discretized” multipole moments of the noise, which do not have any continuous counterpart ν is not defined on the whole sphere. Pbecause 1 2 Define the corresponding discretized empirical spectrum N` := 2`+1 m ν`m , then X E(ν`m ν`0 m0 ) = λ2k σ 2 (ξk )Y`m (ξk )Y`0 m0 (ξk ) i.i.d

k

1 X N` = λk λk0 σ(ξk )σ(ξk0 )Uk Uk0 L` (hξk , ξk0 i) 2` + 1 0 k,k

and E(N` ) =

1 X 2 2 λk σ (ξk ) =: N` 4π k

This sequence N` can be thought of as the pixel-noise spectrum. Note that if λk = N4π , k = 1, . . . , Npix , then pix R 2 2 1 4πσ N` = Npix σ (ξ)dξ. If the noise is moreover homogeneous, σ(ξ) ≡ σ, then E(ν`m ν`0 m0 ) = Npix δ`,`0 δm,m0 . APPENDIX C: VARIANCE ESTIMATION BY AGGREGATION OF EXPERIMENTS WITH INDEPENDENT HETEROSCEDASTIC NOISE

Consider the model Yk,e = Xk + nk,e Zk,e i.i.d

i.i.d

where X := [Xk ]k∈[1,Npix ] and Z := [Zk,e ](k,e)∈[1,Npix ]×[1,E] are independent, Xk ∼ N (0, C), Zk,e ∼ N (0, 1) and the noise standard deviations nk,e are known. This corresponds to the observation of the same signal X by E independent

17 experiments, the observations being tainted by independent but heteroscedastic errors. Let Yk := [Yk,e ](e∈[1,E] be the vector of observations at point (or index in a general framework) k, and let Y := ([YkT ]k∈[1,Npix ] )T be the full vector of observations. The covariance matrix of Yk is Rk := 11T C + Nk where Nk := diag(n2k,e )e∈[1,E] and 1 is the E × 1 vector of ones. By independence of the Yk ’s, the negative log-likelihood of C given Y thus writes X L(C) := −2 log (P (Y|C)) = −2 log (P (Yk |C)) X k = Yk T R−1 k Yk + log det Rk . k

Denote n ˜ k := 1T N−1 k 1

−1/2

=

X e

−2

(nk,e )

−1/2

.

(C1)

It is immediate to check the following identity which will be used below: R−1 k 1=

N−1 k 1 . 1 + Cn ˜ 2k

b k := Yk Yk T . The derivative of the negative log-likelihood writes Define R X ∂Rk −1 −1 ∂Rk L0 (C) = −Yk T R−1 R Y + tr R k k k k ∂C k ∂C X T −1 T −1 = tr −Yk Rk 11 Rk Yk + tr R−1 11T k k X b k R−1 1 = 1T R−1 R − R k k k k −1 T b k N−1 1 X 1 N k Rk − R k = 2 2 k (1 + C n ˜k ) 2 −1 X 1T N 1 X 1T N−1 Yk 2 − 1T N−1 1 k k k =C 2 − 2 2 2 k (1 + C n k ˜k ) (1 + C n ˜k ) 2 −1 T 2 X X −2 ˜ 2k n ˜ k 1 Nk Yk − n =C C +n ˜ 2k − . 2 k k (C + n ˜ 2k ) It follows that the likelihood is maximized for b C = C(w) :=

X k

wk (C, N)

h

n ˜ 2k 1T N−1 k Yk

2

−n ˜ 2k

i

with wk (C, N) := C + n ˜ 2k

−2 hX i

C +n ˜ 2i

−2 i−1

.

(C2)

As the optimal weights depend on C, this only defines implicitly the ML estimator. For some approximate spectrum b w C 0 , the proposed explicit NSE is given by C( bk ) with w bk = wk (C 0 , N). Particular case of a single experiment

In the particular case of a single experiment (E = 1) with heteroscedastic noise, following the model Yk = Xk + nk Zk , the likelihood is maximized for b C = C(w) :=

X k

wk (C, N) Yk2 − n2k

(C3)

18 b w with wk (C, N) defined by Eq. (C2), and again, assuming that wk is poorly sensitive to C, the NSE is C bk (C 0 ) for some approximate spectrum C 0 .

[1] Baldi, P., Kerkyacharian, G., Marinucci, D., and Picard, D. (2007). Subsampling Needlet Coefficients on the Sphere. ArXiv e-prints, 706. [2] Baldi, P., Kerkyacharian, G., Marinucci, D., and Picard, D. (2008a). Asymptotics for spherical needlets. Ann. Statist. To appear. arxiv.org: math.ST/0606154. [3] Baldi, P., Kerkyacharian, G., Marinucci, D., and Picard, D. (2008b). High frequency asymptotics for wavelet-based tests for Gaussianity and isotropy on the torus. J. Multivariate Analysis, 99:606–636. [4] Bennett, C. L., Halpern, M., Hinshaw, G., Jarosik, N., Kogut, A., Limon, M., Meyer, S. S., Page, L., Spergel, D. N., Tucker, G. S., Wollack, E., Wright, E. L., Barnes, C., Greason, M. R., Hill, R. S., Komatsu, E., Nolta, M. R., Odegard, N., Peiris, H. V., Verde, L., and Weiland, J. L. (2003). First-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Preliminary Maps and Basic Results. ApJS, 148:1–27. [5] Bond, J. R., Jaffe, A. H., and Knox, L. (1998). Estimating the power spectrum of the cosmic microwave background. Phys. Rev. D, 57(4):2117–2137. [6] Dahlen, F. A. and Simons, F. J. (2008). Spectral estimation on a sphere in geophysics and cosmology. Geophys. J. Int., page in press. [7] Daubechies, I. (1992). Ten lectures on wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA. [8] Delabrouille, J., Cardoso, J.-F., Le Jeune, M., Betoule, M., Fa¨ y, G., and Guilloux, F. (2008). A full sky, low foreground, high resolution CMB map from WMAP. Submitted to A&A, arxiv:astro-ph/0807.0773. [9] Efstathiou, G. (2004). Myths and truths concerning estimation of power spectra: the case for a hybrid estimator. Monthly Notices of the Royal Astronomical Society, 349(2):603–626. [10] Eriksen, H. K. et al. (2004). Power spectrum estimation from high-resolution maps by Gibbs sampling. Astrophys. J. Suppl., 155:227–241. [11] Fa¨ y, G. and Guilloux, F. (2008). Consistency of a needlet spectral estimator on the sphere. [12] G´ orski, K., Hivon, E., Banday, A., Wandelt, B., Hansen, F., Reinecke, M., and Bartelmann, M. (2005). HEALPix: A Framework for High-Resolution Discretization and Fast Analysis of Data Distributed on the Sphere. Astrophys. J., 622:759–771. Package available at http://healpix.jpl.nasa.gov. [13] Guilloux, F., Fa¨ y, G., and Cardoso, J.-F. (2008). Practical wavelet design on the sphere. Appl. Comput. Harmon. Anal. (In press) http://dx.doi.org/10.1016/j.acha.2008.03.003. [14] Hinshaw, G., Nolta, M., Bennett, C., Bean, R., Dore, O., Greason, M., Halpern, M., Hill, R., Jarosik, N., Kogut, A., Komatsu, E., Limon, M., Odegard, N., Meyer, S., Page, L., Peiris, H., Spergel, D., Tucker, G., Verde, L., Weiland, J., Wollack, E., and Wright, E. (2007). Three-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Temperature Analysis. Astrophys. J. Supp. Series, 170:288–334. [15] Hinshaw, G., Spergel, D. N., Verde, L., Hill, R. S., Meyer, S. S., Barnes, C., Bennett, C. L., Halpern, M., Jarosik, N., Kogut, A., Komatsu, E., Limon, M., Page, L., Tucker, G. S., Weiland, J. L., Wollack, E., and Wright, E. L. (2003). First-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: The Angular Power Spectrum. Astrophys. J., 148:135–159. [16] Hinshaw, G., Weiland, J. L., Hill, R. S., Odegard, N., Larson, D., Bennett, C. L., Dunkley, J., Gold, B., Greason, M. R., Jarosik, N., Komatsu, E., Nolta, M. R., Page, L., Spergel, D. N., Wollack, E., Halpern, M., Kogut, A., Limon, M., Meyer, S. S., Tucker, G. S., and Wright, E. L. (2008). Five-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Data Processing, Sky Maps, and Basic Results. arXiv:0712.1148. [17] Hivon, E., G´ orski, K. M., Netterfield, C. B., Crill, B. P., Prunet, S., and Hansen, F. (2002). MASTER of the Cosmic Microwave Background anisotropy power spectrum: A fast method for statistical analysis of large and complex Cosmic Microwave Background data sets. The Astrophysical Journal, 567:2–17. [18] Leach, S. M., Cardoso, J. ., Baccigalupi, C., Barreiro, R. B., Betoule, M., Bobin, J., Bonaldi, A., de Zotti, G., Delabrouille, J., Dickinson, C., Eriksen, H. K., Gonz´ alez-Nuevo, J., Hansen, F. K., Herranz, D., LeJeune, M., L´ opez-Caniego, M., Martinez-Gonz´ alez, E., Massardi, M., Melin, J. ., Miville-Deschˆenes, M. ., Patanchon, G., Prunet, S., Ricciardi, S., Salerno, E., Sanz, J. L., Starck, J. ., Stivoli, F., Stolyarov, V., Stompor, R., and Vielva, P. (2008). Component separation methods for the Planck mission. ArXiv e-prints, 805. [19] Marinucci, D., Pietrobon, D., Balbi, A., Baldi, P., Cabella, P., Kerkyacharian, G., Natoli, P., Picard, D., and Vittorio, N. (2008). Spherical Needlets for CMB Data Analysis. M.N.R.A.S., 383(2):539–545. [20] Masi, S., Ade, P. A. R., Bock, J. J., Bond, J. R., Borrill, J., Boscaleri, A., Cabella, P., Contaldi, C. R., Crill, B. P., de Bernardis, P., de Gasperis, G., de Oliveira-Costa, A., de Troia, G., di Stefano, G., Ehlers, P., Hivon, E., Hristov, V., Iacoangeli, A., Jaffe, A. H., Jones, W. C., Kisner, T. S., Lange, A. E., MacTavish, C. J., Marini Bettolo, C., Mason, P., Mauskopf, P. D., Montroy, T. E., Nati, F., Nati, L., Natoli, P., Netterfield, C. B., Pascale, E., Piacentini, F., Pogosyan, D., Polenta, G., Prunet, S., Ricciardi, S., Romeo, G., Ruhl, J. E., Santini, P., Tegmark, M., Torbet, E., Veneziani, M., and Vittorio, N. (2006). Instrument, method, brightness, and polarization maps from the 2003 flight of BOOMERanG. A&A, 458:687–716.

19 [21] [22] [23] [24] [25] [26]

[27]

[28] [29] [30] [31] [32]

Mayeli, A. (2008). Asymptotic uncorrelation for mexican needlets. (Preprint) arxiv : 0806.3009. Narcowich, F., Petrushev, P., and Ward, J. (2006). Localized tight frames on spheres. SIAM J. Math. Anal., 38(2):574–594. Peebles, P. J. E. (1973). Statistical analysis of catalogs of extragalactic objects. I. Theory. Astrophys. J., 185:413–440. Pietrobon, D., Balbi, A., and Marinucci, D. (2006). Integrated Sachs-Wolfe effect from the cross correlation of WMAP3 year and the NRAO VLA sky survey data: New results and constraints on dark energy. Phys. Rev. D, 74. Polenta, G., Marinucci, D., Balbi, A., de Bernardis, P., Hivon, E., Masi, S., Natoli, P., and Vittorio, N. (2005). Unbiased estimation of an angular power spectrum. Journal of Cosmology and Astroparticle Physics, 2005(11):001. Reichardt, C. L., Ade, P. A. R., Bock, J. J., Bond, J. R., Brevik, J. A., Contaldi, C. R., Daub, M. D., Dempsey, J. T., Goldstein, J. H., Holzapfel, W. L., Kuo, C. L., Lange, A. E., Lueker, M., Newcomb, M., Peterson, J. B., Ruhl, J., Runyan, M. C., and Staniszewski, Z. (2008). High resolution CMB power spectrum from the complete ACBAR data set. ArXiv e-prints, 801. Runyan, M. C., Ade, P. A. R., Bhatia, R. S., Bock, J. J., Daub, M. D., Goldstein, J. H., Haynes, C. V., Holzapfel, W. L., Kuo, C. L., Lange, A. E., Leong, J., Lueker, M., Newcomb, M., Peterson, J. B., Reichardt, C., Ruhl, J., Sirbi, G., Torbet, E., Tucker, C., Turner, A. D., and Woolsey, D. (2003). ACBAR: The Arcminute Cosmology Bolometer Array Receiver. ApJS, 149:265–287. Simons, F. J., Dahlen, F. A., and Wieczorek, M. A. (2006). Spatiospectral concentration on a sphere. SIAM Rev., 48(504). Slepian, D. and Pollak, H. (1960). Prolate spheroidal wave functions, Fourier analysis and uncertainty — I. Bell Syst. Tech. J., 40(1):43–63. Tegmark, M. (1997). How to measure cmb power spectra without losing information. Phys. Rev. D, 55(10):5895–5907. Wandelt, B. D., Larson, D. L., and Lakshminarayanan, A. (2004). Global, exact cosmic microwave background data analysis using gibbs sampling. Phys. Rev. D, 70(8):083511. Wieczorek, M. A. and Simons, F. J. (2007). Minimum-variance spectral analysis on the sphere. J. Fourier Anal. Appl., 13(6):665–692, 10.1007/s00041–006–6904–1.