Frode Kristian Hansen - Data Analysis of The Cosmic Microwave Background

Data Analysis
of
the Cosmic Microwave Background.
Dissertation of the Faculty of Physics
of the
Ludwig-Maximilians-Universität München
submitted by: Frode Kristian Hansen
from Tønsberg
Munich, 11.01.2002
1st referee: Prof. S. White
2nd referee: Prof. A. Schenzle
Date of the oral examination: 12.06.02
Contents
Einleitung und Zusammenfassung (Introduction and Summary)
Introduction and Summary
1 The Physics of the CMB Fluctuations
1.1 Basic Concepts
1.1.1 The Friedmann Equations
1.1.2 Problems in Standard Cosmology
1.2 Theories of the Early Universe
1.2.1 Inflation
1.2.2 Cosmic Strings
1.3 The Growth of Perturbations in the Early Universe
1.3.1 Linear Evolution
1.3.2 Nonlinear Evolution
1.3.3 The Matter Power Spectrum
1.4 The Recombination Era and the CMB
1.4.1 The Origin of the CMB Anisotropies
1.4.2 The CMB Power Spectrum
1.4.3 Reionization and Secondary Anisotropies
1.4.4 Polarisation of the CMB and Tensor Perturbations
2 The Analysis of CMB Data
2.1 CMB Experiments, the Past, Present and Future
2.1.1 COBE
2.1.2 BOOMERANG
2.1.3 MAP
2.1.4 Planck
2.1.5 Other Experiments
2.2 The Analysis of CMB Data Sets
2.2.1 Map Making
2.2.2 Foregrounds
2.3 Power Spectrum Estimation
2.3.1 Likelihood Estimation
2.3.2 Quadratic Estimators
2.3.3 Two New Methods for Power Spectrum Estimation
3 Fast Exact Power Spectrum Analysis for a Special Type of Scanning Strategies
3.1 Fast Fourier Space Convolution
3.2 Power Spectrum Estimation Using Scanning Rings
3.2.1 Theory
3.2.2 An Example
3.2.3 Discussion
4 Gabor Transforms on the Sphere and Application to CMB Analysis
4.1 The Gabor Transformation and the Temperature Power Spectrum
4.1.1 The One Dimensional Gabor Transform
4.1.2 Gabor Transform on the Sphere
4.1.3 Rotational Invariance
4.2 Likelihood Analysis
4.2.1 The Form of the Likelihood Function
4.2.2 The Correlation Matrix
4.2.3 Including White Noise
4.2.4 Likelihood Estimation and Results
4.3 Extensions of the Method
4.3.1 Multiple Patches
4.3.2 Monte Carlo Simulations of the Noise Correlations and Extension to Correlated Noise
4.4 Discussion
5 Gabor Transform on the Polarised CMB Sky
5.1 The Gabor Transformation
5.1.1 Polarisation Power Spectra
5.1.2 Rotational Invariance
5.2 Likelihood Analysis
5.2.1 The Form of the Likelihood Function for Polarisation
5.2.2 The Polarisation Correlation Matrix
5.2.3 Polarisation with Noise
5.3 Results of Likelihood Estimations
5.4 Discussion
A Rotation Matrices
B Spin-s Harmonics
C Some Wigner Symbol Relations
D Recurrence Relation
E Extension of the Recurrence Relation to Polarisation
Acknowledgements
Lebenslauf (Curriculum Vitae)
Einleitung und Zusammenfassung (Introduction and Summary)
The early universe consisted of a hot, dense and ionised gas of electrons,
protons, neutrons, a few light atomic nuclei and photons. At that time the
universe was optically thick, because frequent collisions with electrons in the
dense gas prevented the photons from travelling very far. The temperature was so
high that the electrons, protons and neutrons could not combine into atoms. As
the universe expanded, it cooled. About 300 000 years after the Big Bang the
temperature of the gas was about 3000 Kelvin. Below this temperature the first
atoms of the universe could form, and the electrons and protons combined into
atoms. In general, the probability that a photon collides with an atom is much
smaller than the probability that it collides with free electrons or protons.
For this reason the universe is said to have become transparent when the
electrons combined with the protons. The photons then travelled along straight
lines without being deflected. About 12 billion years later the photons hit a
detector on a planet called Earth and gave scientists valuable information about
the beginning of the universe. This radiation, which has hardly changed on its
journey from the earliest times until today, is called the cosmic microwave
background and is the topic of this doctoral thesis.
The dense gas of the early universe was not completely homogeneous. Processes
during the first $10^{-34}$ seconds after the Big Bang caused small fluctuations
in the density. The evolution of these fluctuations under the influence of
gravity and pressure can easily be calculated using hydrodynamics. After the
universe became transparent (recombination), the fluctuations evolved further
and are visible today as stars, galaxies and clusters of galaxies. The formation
of these objects is a very complicated process that is not yet fully understood.
The evolution of the structures of the universe before recombination can be
described with linear physics. If one could observe the universe at this earlier
stage, one could obtain information about the properties and the beginning of
the universe in a simple way. The cosmic microwave background (CMB) offers such
a possibility. This radiation has travelled almost undisturbed from the time of
recombination until today and is therefore a picture of the universe, and of the
distribution of the density fluctuations, at that time.
The CMB was first detected by Penzias and Wilson in 1965 as isotropic black body
radiation with a temperature of about 2.7 Kelvin. As mentioned above, the
temperature of the gas at recombination was about 3000 Kelvin, but since the
universe has grown by a factor of roughly 1000 since then, the temperature of
this black body radiation has decreased by the same factor. The density
fluctuations at recombination caused small irregularities in the CMB
temperature, which were first detected in 1992 by the Cosmic Background Explorer
(COBE) satellite. In recent years several experiments with ever higher spatial
resolution have been carried out from the ground and from balloons. These
experiments have already provided valuable information about the early universe.
Another satellite, MAP, is now observing the CMB, and the Planck satellite,
which will be launched in a few years, will make observations with even higher
spatial and spectral resolution.
As the experiments reach ever higher resolution and observe ever larger parts of
the sky, analysing the data becomes ever more difficult. The analysis of data
from current experiments is already a problem, and the data volumes from MAP and
Planck will be many times larger. The standard method for extracting information
(cosmological parameters) from CMB data is the maximum-likelihood method. This
method requires the inversion of an $N \times N$ matrix, where $N$ is the number
of pixels of an experiment. The inversion requires $N^3$ operations. For the
Planck experiment $N$ will be more than ten million. Inverting a matrix of this
size takes several thousand years on the fastest supercomputers. Other methods
therefore have to be found to analyse the data. In this thesis I present two new
methods that make it possible to analyse CMB data quickly. With these methods
the CMB power spectrum, which is given by the spherical harmonic coefficients of
the CMB temperature data, can be extracted quickly. From the power spectrum the
cosmological parameters can easily be calculated.
Chapter (1) describes the physics of the early universe. A summary of the
formation and evolution of the structures of the universe from the Big Bang
until today is presented. The physical processes that produced the background
radiation are summarised, and it is shown how these processes make the power
spectrum depend on the cosmological parameters. Chapter (2) gives an overview of
past, present and future CMB experiments and explains the standard methods for
analysing CMB data. The chapter also describes how a CMB map is made from the
time ordered data (TOD) of a CMB experiment and how contamination from the
microwave emission of other bodies can be removed. The chapter closes with a
description of the latest methods for extracting the power spectrum from CMB
data.
Chapter (3) describes how the power spectrum can be extracted with a
maximum-likelihood method directly from the TOD, without first making a CMB map.
The method is applicable to experiments such as MAP and Planck that scan the sky
in rings of rings. Symmetries of this scanning strategy make the correlation
matrices for signal and noise block diagonal in Fourier space. For this reason
the correlation matrix can be inverted with $N^2$ operations instead of $N^3$,
which makes the maximum-likelihood method usable for such experiments. Because
real experiments will deviate from the ideal scanning strategy assumed here,
extensions of the method to such experiments are discussed.
Chapter (4) presents another method for extracting the power spectrum from CMB
data quickly. The chapter also explains how windowed Fourier transforms (also
known as Gabor transforms) are extended from one dimension to the
two-dimensional surface of the sphere. It is shown how the power spectrum of a
patch of the sky (the pseudo power spectrum) can be used as input to a
maximum-likelihood analysis. The correlation matrix of the maximum-likelihood
method is in this case much smaller than when all pixels of an experiment are
used, so the matrix inversion is fast. It is shown that assuming a Gaussian
distribution for the pseudo power spectrum coefficients is a good approximation.
It is also shown how CMB data from the whole sky can be analysed by dividing the
sky into many smaller patches and analysing all patches simultaneously. In
chapter (5) the method is extended to the analysis of CMB polarisation data.
Introduction and Summary
The early universe consisted of a dense, hot and ionised gas of electrons,
protons, neutrons, some light atomic nuclei and photons. The universe at that
time was optically thick, as the photons could not travel very far before being
scattered by electrons in the dense gas. The temperature was too high for the
electrons to combine with the nuclei and form atoms. But as the universe
expanded it cooled. About 300 000 years after the Big Bang, the temperature of
the dense gas filling the universe was about 3000 Kelvin. This temperature
allowed the formation of the first atoms in the universe: the electrons combined
with the protons to form atoms. The probability for a photon to scatter off a
neutral atom is much smaller than the probability to scatter off free electrons
and protons. For this reason the universe is said to have become transparent
when the electrons were bound to the protons. The photons continued travelling
in straight lines without being scattered. About 12 billion years later some of
these photons hit a detector on a planet called Earth and provided scientists
with valuable information about the origin of the universe. This radiation,
which has travelled more or less unchanged from the earliest times until today,
is called the cosmic microwave background radiation and is the topic of this
Ph.D. thesis.
The dense gas in the early universe was not completely homogeneous. Processes
during the first $10^{-34}$ seconds after the Big Bang created small density
fluctuations. The evolution of these fluctuations under the influence of
gravitation and pressure in the dense gas can easily be calculated using
hydrodynamics. After the universe became transparent (known as the recombination
era), these small density fluctuations evolved into the structure that can be
seen in the present universe: stars, galaxies and clusters of galaxies. The
formation of these objects is physically very complicated and not completely
understood. If one could observe the universe at an earlier stage, when
structure formation was still governed by known linear physics, it would be
easier to gain information about the properties of the universe and its origin.
The cosmic microwave background (CMB) radiation provides such a possibility. The
radiation has travelled from the recombination epoch until today and gives a
picture of the universe and the distribution of density fluctuations at the
recombination epoch.
The CMB was first detected by Penzias and Wilson in 1965 as black body radiation
with a temperature of 2.7 Kelvin coming from all directions. As explained above,
the temperature of the gas at recombination was about 3000 Kelvin, but since
that time the universe has expanded by a factor of roughly 1000, which has
decreased the temperature of the CMB photons by a similar factor. The density
fluctuations in the universe at the recombination epoch gave rise to small
fluctuations in the CMB temperature. These fluctuations were first detected by
the Cosmic Background Explorer (COBE) satellite in 1992. In recent years several
ground based and balloon borne experiments have made observations of the CMB at
increasing angular resolution. These observations have already provided valuable
information about the early universe. Another satellite, the Microwave
Anisotropy Probe (MAP), is currently observing the CMB, and a satellite called
Planck will make observations with even higher angular and spectral resolution
in a few years. As the experiments reach higher resolution and larger sky
coverage, the task of analysing the data is gradually getting harder. The
analysis of the present datasets has already presented a challenge, and the
datasets from MAP and Planck will be several times larger. The standard method
of extracting information in terms of cosmological parameters from the CMB data
has been the method of maximum likelihood. Unfortunately this method requires
the inversion of an $N \times N$ correlation matrix, where $N$ is the number of
pixels in the experiment. This takes of the order of $N^3$ operations. For
Planck $N$ will be of the order of tens of millions, and the inversion of the
corresponding matrix takes thousands of years on the fastest supercomputers.
Other methods for analysing the data have to be found. The aim of this thesis is
to present two new methods to analyse CMB temperature and polarisation data. The
methods described in the thesis aim at extracting the angular power spectrum of
the CMB, which is obtained from the spherical harmonic transform of the CMB sky.
From this power spectrum the cosmological parameters can be extracted.
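
The $N^3$ scaling quoted above can be turned into a rough time estimate. The
following is only a sketch: the function name, the prefactor of one operation
per $N^3$ term and the sustained rate of $10^{12}$ operations per second are
assumptions made for illustration, not figures from the thesis.

```python
# Rough cost of an O(N^3) matrix inversion.
# Assumptions (not from the thesis): exactly N**3 operations and a
# sustained machine speed given by the caller.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def inversion_years(n_pix, ops_per_second):
    """Wall-clock time in years for n_pix**3 operations."""
    return n_pix ** 3 / ops_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    assumed_rate = 1.0e12  # operations per second (hypothetical machine)
    for n_pix in (1.0e4, 1.0e6, 5.0e7):
        years = inversion_years(n_pix, assumed_rate)
        print(f"N = {n_pix:.0e}: about {years:.3g} years")
```

Under these assumptions a map with $5 \times 10^7$ pixels needs roughly four
thousand years, while doubling $N$ always multiplies the cost by eight.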
In chapter (1) a description of the physics of the early universe is given. A
summary of the formation and evolution of structure from the Big Bang until the
present era is presented. The processes giving rise to the CMB are also
outlined, and it is shown how these processes make the angular power spectrum of
the CMB depend on the cosmological parameters. Then in chapter (2) the past,
present and future CMB experiments are reviewed and the standard methods of data
analysis are presented. The chapter also contains a description of the process
of making a CMB sky map from the time ordered data (TOD) of a CMB experiment and
of the removal of foreground contamination. Finally the latest methods of
extracting the angular power spectrum are reviewed.
In chapter (3) a new method is presented for extracting the CMB power spectrum
using the maximum-likelihood method directly on the TOD instead of making a CMB
map first. The method is applicable to experiments that scan the sky on rings of
rings, like the MAP and Planck experiments. Symmetries of this scanning strategy
make the signal and noise correlation matrix for the likelihood calculation
block diagonal in Fourier space. For this reason the matrix can be inverted in
$N^2$ operations instead of $N^3$. This makes the maximum-likelihood method
feasible for this kind of experiment. The method is exact and naturally takes
into account arbitrary beam shapes and side lobes, which other existing methods
cannot deal with. As realistic experiments can deviate from the ideal scanning
strategy assumed, extensions of the method to these cases are discussed.
In chapter (4) a different approach to estimating the angular power spectrum of
the CMB is shown. The chapter explains how windowed Fourier transforms (known as
Gabor transforms) can be extended from the line to the sphere. It is shown how
one can take a patch on the CMB sky and use the power spectrum of this patch
(the so-called pseudo power spectrum) as input to a maximum-likelihood
procedure. The correlation matrix for this likelihood is much smaller than when
using all the pixels of the experiment, and the matrix inversion is very fast.
The method assumes that the coefficients of the pseudo power spectrum are
Gaussian distributed, which is shown to be a good approximation. It is also
shown how many such patches can be combined. In this way it is possible to
analyse datasets with observations of the full CMB sky by breaking them up into
several patches which are all analysed simultaneously. In chapter (5) the method
is extended to CMB polarisation.
Chapter 1
The Physics of the CMB Fluctuations
In this chapter I will discuss the physical processes that gave rise to the
fluctuations in the Cosmic Microwave Background (CMB). I will start by
introducing some basic concepts in cosmology (sect. 1.1) and discuss some of the
main problems that standard cosmology is facing (sect. 1.1.2). In sect. 1.2.1 I
will briefly discuss the theory of inflation in the early universe and show how
this theory may solve some of the problems in cosmology as well as explain the
origin of structure in the universe. In sect. 1.2.2 an alternative theory of
structure formation will briefly be reviewed.
I will proceed with a description of how the density perturbations produced in
the early universe grew. In the early stages, the evolution of the primordial
density perturbations was linear (sect. 1.3.1). In the later stages nonlinear
effects also became important (sect. 1.3.2). I will discuss the physics of the
recombination era, when light decoupled from matter (sect. 1.4.1) and gave rise
to the CMB. It will be shown that by studying the fluctuations in the CMB, one
actually studies the density fluctuations at the recombination epoch, when
structure formation was still governed by linear evolution. The density
distribution in the universe today has undergone nonlinear evolution through
physical processes that are not completely known. For this reason it is easier
to reconstruct the primordial field of density fluctuations, and to estimate the
values of the cosmological parameters, from the spectrum of density fluctuations
at the last scattering surface. How this can be done will be reviewed in sect.
1.4.2. In sect. 1.4.3 I will discuss how processes that took place after
recombination also contributed to small scale power in the CMB.
The discussion in sect. 1.1 and 1.2 is, when no other references are given,
based on the book (Kolb and Turner 1990) and the reviews (Watson 2000; Narlikar
and Padmanabhan 1991). Section 1.3 mostly follows the books (Kolb and Turner
1990; Coles and Lucchin 1995) and the review (Padmanabhan 1999). The papers
(Hu and Sugiyama 1995) and (Zaldarriaga and Harari 1995) are the main sources
for section 1.4. All the power spectra used in this thesis were obtained using
the publicly available code CMBFAST (Seljak and Zaldarriaga 1996). The code can
be found at the site http://physics.nyu.edu/matiasz/CMBFAST/cmbfast.html
1.1 Basic Concepts
In this section I will review the basic principles of cosmology (sect. 1.1.1).
Then I will outline the main problems that the standard Big Bang scenario is
facing (sect. 1.1.2).
1.1.1 The Friedmann Equations
The Cosmological Principle (CP) is the foundation of almost all modern
cosmology. The CP consists of two parts:
1. Homogeneity: The universe is homogeneous on large scales
2. Isotropy: The universe is isotropic on large scales
The homogeneity principle says that the universe looks the same at every point.
The isotropy principle says that the universe looks the same in all directions.
On small scales this is obviously not the case, but observations indicate that
it holds on larger scales. One of the best probes is the CMB, which has been
observed to be isotropic to one part in $10^5$.
The metric in a homogeneous and isotropic universe is the
Friedmann-Robertson-Walker (FRW) metric,
\[
ds^2 = c^2 dt^2 - a^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 \right]. \tag{1.1}
\]
Here, $(r, \theta, \phi, t)$ are the comoving coordinates (coordinates moving
with the expansion of the universe). The expansion factor of the universe is
$a(t)$, and $k$ is the curvature. By a proper rescaling of the coordinates, $k$
can take the values $-1$ (hyperbolic space), $0$ (flat space) or $1$ (spherical
space).
The goal now is to find the Friedmann equations, which are the equations of
motion of the universe. This can be achieved by inserting the FRW metric into
the Einstein equations. The Einstein equations read
\[
R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \frac{\Lambda g_{\mu\nu}}{c^2} = \frac{8\pi G}{c^4} T_{\mu\nu}, \tag{1.2}
\]
where $R_{\mu\nu}$ is the Ricci tensor, $R$ is the Ricci scalar, $\Lambda$ is
the cosmological constant, $g_{\mu\nu}$ is the metric, $G$ is the gravitational
constant and $T_{\mu\nu}$ is the stress-energy tensor of the universe. These
quantities are described in more detail elsewhere (e.g. Kolb and Turner 1990).
The left side of the equation contains terms which depend purely on geometry,
and the right side contains all the matter and energy. The energy/matter content
of the universe was very different at different epochs, but a quite general form
of the stress-energy tensor is that of a perfect fluid. In a perfect fluid
$T_{\mu\nu}$ is diagonal and takes the form (I have set $c = 1$ for simplicity)
\[
T_{\mu\nu} = \mathrm{diag}(\rho, -p, -p, -p). \tag{1.3}
\]
Here $\rho$ is the energy density and $p$ is the pressure of the fluid.
So inserting the FRW metric (1.1) and the stress-energy tensor for a perfect
fluid (1.3) into the Einstein equations (1.2), one arrives at the Friedmann
equations:
\[
\frac{\ddot{a}}{a} = -\frac{4\pi G}{3} \left( \rho_T + 3 p_T \right), \tag{1.4}
\]
\[
\frac{\dot{a}^2}{a^2} = \frac{8\pi G \rho_T}{3} - \frac{k}{a^2}. \tag{1.5}
\]
Here the dots denote derivatives with respect to time. The cosmological constant
is included here by letting
\[
\rho_T = \rho + \rho_\Lambda, \tag{1.6}
\]
\[
p_T = p - \rho_\Lambda, \tag{1.7}
\]
where $\rho_\Lambda = \Lambda / (8\pi G)$.
Figure 1.1: A sketch of the history of the universe (the time axis is not linear).
Labels in the sketch: Planck era, inflation, radiation dominated era, matter
dominated era, radiation and matter decoupled, reionized universe, vacuum energy
dominated era. Marked times: $t_i$, $t_f$, $t_{eq}$, $t_{rec}$, $t_{io}$, $t_0$.
To show some applications of the Friedmann equations I will now concentrate on
two stages in the history of the universe: the radiation dominated era and the
matter dominated era. At cosmic time $t_{eq} \approx 10^4$ years the energy
densities of matter and radiation in the universe were equal. Well before
$t_{eq}$ the energy density in the universe was dominated by radiation. In that
epoch, the energy density due to radiation was much higher than the energy
density due to matter, and one can use the equation of state for radiation,
$p = \omega\rho$ with $\omega = 1/3$. After $t_{eq}$, matter dominated the
energy density of the universe. In the matter dominated era one can neglect the
pressure ($\omega = 0$). The matter dominated era includes the recombination
epoch ($t_{rec} \approx 10^5$ years) at which matter and radiation decoupled.
Recent observations indicate that there is a cosmological constant (Jaffe et al.
2001) and that the universe at present is dominated by vacuum energy
($\rho_{matter}/\rho_\Lambda \approx 0.4$). The equation of state of vacuum
energy is $\omega = -1$. The history of the universe is sketched in figure (1.1).
First I will show how the energy density in the universe changes with the scale
factor $a$. To do this, one can combine equations (1.4) and (1.5) to get
\[
\frac{d}{da}\left(\rho a^3\right) + 3 a^2 p - \frac{\Lambda a^2}{4\pi G} = 0. \tag{1.8}
\]
Using the equation of state, one can rewrite this as
\[
\frac{d}{da}\left(\rho a^{3(1+\omega)}\right) = \frac{\Lambda a^{2+3\omega}}{4\pi G}. \tag{1.9}
\]
One easily sees that for the matter dominated era $\rho \propto a^{-3}$
(actually $\rho = k_1 a^{-3} + k_2$, where $k_1$ and $k_2$ are constants), for
the radiation dominated era $\rho \propto a^{-4}$, and for the vacuum energy
dominated epoch $\rho \propto \ln a$.
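
The matter and radiation scalings can be checked numerically. The sketch below
assumes $\Lambda = 0$, so that equation (1.9) reduces to
$d\rho/da = -3(1+\omega)\rho/a$, and integrates this with a fourth-order
Runge-Kutta step from $a = 1$, $\rho = 1$. The function name, step count and
normalisation are illustrative choices, not taken from the thesis.

```python
def density_after_expansion(omega, a_end, steps=20000):
    """Integrate d(rho)/da = -3 (1 + omega) rho / a from a = 1, rho = 1.

    Assumes Lambda = 0 (right hand side of equation 1.9 set to zero).
    Returns rho at a = a_end, using a classical RK4 scheme.
    """
    def slope(a, rho):
        return -3.0 * (1.0 + omega) * rho / a

    a, rho = 1.0, 1.0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + 0.5 * h, rho + 0.5 * h * k1)
        k3 = slope(a + 0.5 * h, rho + 0.5 * h * k2)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        a += h
    return rho


if __name__ == "__main__":
    # Expanding by a factor 10 dilutes matter by ~10^3 and radiation by ~10^4.
    print("matter   :", density_after_expansion(0.0, 10.0))
    print("radiation:", density_after_expansion(1.0 / 3.0, 10.0))
```

The extra factor of $a^{-1}$ for radiation reflects the redshifting of each
photon's energy on top of the dilution of the number density.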
Using the Friedmann equations one can find how the expansion parameter varies
with time. As a simple example I will just demonstrate the flat universe case
with $\Lambda = 0$. In this case one can insert $\rho = k_1 a^{-3}$ for a matter
dominated universe into equation (1.5) and integrate the equation. One sees that
$a \propto t^{2/3}$. Similarly, for a radiation dominated universe one has
$a \propto \sqrt{t}$. In a radiation dominated universe, temperature $T$ and
density $\rho$ are related by $\rho \propto T^4$, giving $T \propto t^{-1/2}$.
With this relation one can study how the temperature in the universe varies with
time in the early stages of the history of the universe. A similar argument
yields $T \propto t^{-2/3}$ for the matter dominated universe.
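
As a numerical cross-check of these power laws, the sketch below integrates
equation (1.5) for $k = 0$ and $\Lambda = 0$ with $\rho \propto a^{-3(1+\omega)}$.
In units where $8\pi G k_1 / 3 = 1$ this becomes
$\dot{a} = a^{1 - 3(1+\omega)/2}$. The units, the initial condition $a(0) = 1$
and the function names are assumptions chosen for illustration; the late-time
logarithmic slope $d\ln a / d\ln t$ does not depend on them.

```python
import math


def scale_factor(omega, t_end, dt=0.02):
    """RK4 solution of da/dt = a**(1 - 1.5*(1 + omega)) with a(0) = 1.

    Flat universe, Lambda = 0, units with 8*pi*G*k_1/3 = 1 (assumed).
    """
    power = 1.0 - 1.5 * (1.0 + omega)
    a = 1.0
    for _ in range(int(round(t_end / dt))):
        k1 = a ** power
        k2 = (a + 0.5 * dt * k1) ** power
        k3 = (a + 0.5 * dt * k2) ** power
        k4 = (a + dt * k3) ** power
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def growth_exponent(omega, t1=1000.0, t2=2000.0):
    """Estimate n in a ~ t**n from the scale factor at two late times."""
    ratio = scale_factor(omega, t2) / scale_factor(omega, t1)
    return math.log(ratio) / math.log(t2 / t1)


if __name__ == "__main__":
    print("matter    n ~", round(growth_exponent(0.0), 3))
    print("radiation n ~", round(growth_exponent(1.0 / 3.0), 3))
```

The estimated exponents approach $2/3$ and $1/2$; the small residual comes from
the arbitrary starting value $a(0) = 1$.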
I now define some new parameters, namely the Hubble constant
\[
H = \frac{\dot{a}(t)}{a(t)}, \tag{1.10}
\]
and the redshift that one observes today from a given epoch $t$,
\[
z(t) = \frac{a_0}{a(t)} - 1, \tag{1.11}
\]
where $a_0 = a(t_0)$ and $t_0$ is the present time.
The density parameter is defined as
\[
\Omega = \frac{\rho}{\rho_c}, \tag{1.12}
\]
where $\rho_c$ is the critical density. This is the density at which the
universe is exactly flat ($k = 0$). It can be found directly from equation (1.5)
by setting $k = 0$ and solving for the density $\rho_T$. This gives
\[
\rho_c(t) = \frac{3 H^2(t)}{8\pi G}. \tag{1.13}
\]
The critical density today is $\rho_c \approx 2 \times 10^{-29} h^2$ g cm$^{-3}$
(where I used $H_0 = H(t_0) = h \times 100$ km s$^{-1}$ Mpc$^{-1}$).
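
The quoted value of the critical density follows directly from equation (1.13).
A minimal sketch in cgs units is given below; the numerical values of $G$ and of
the megaparsec are standard constants inserted here, and the function names are
illustrative.

```python
import math

G_CGS = 6.674e-8         # gravitational constant in cm^3 g^-1 s^-2
CM_PER_MPC = 3.0857e24   # one megaparsec in cm
CM_PER_KM = 1.0e5


def hubble_rate_cgs(h):
    """H_0 = h * 100 km/s/Mpc converted to s^-1."""
    return h * 100.0 * CM_PER_KM / CM_PER_MPC


def critical_density_cgs(h):
    """Critical density 3 H_0^2 / (8 pi G) in g cm^-3 (equation 1.13)."""
    return 3.0 * hubble_rate_cgs(h) ** 2 / (8.0 * math.pi * G_CGS)


if __name__ == "__main__":
    print(f"rho_c(h = 1)   = {critical_density_cgs(1.0):.3e} g/cm^3")
    print(f"rho_c(h = 0.7) = {critical_density_cgs(0.7):.3e} g/cm^3")
```

For $h = 1$ this gives about $1.88 \times 10^{-29}$ g cm$^{-3}$, consistent with
the rounded value in the text, and the result scales as $h^2$.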
Finally I will explain some expressions which will be useful. These are the
particle horizon, the event horizon and the Hubble radius.
The particle horizon is the distance which limits the range of causal
communication from the past. It is simply the most distant point from which
one can receive a light signal which was emitted at the beginning of the
universe.
Similarly, the event horizon limits causal communication to the future. A
light signal emitted by us today will never reach beyond the event horizon.
Finally, the Hubble radius, which is given by $r_H = cH^{-1}$, is in a normal
universe a good approximation to the particle horizon. The Hubble radius
is often used as an approximation to the size of a causally connected region
in the universe, and I will do this also in the following.
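
For orientation, the present-day Hubble radius can be evaluated from
$r_H = cH^{-1}$ with $H_0 = h \times 100$ km s$^{-1}$ Mpc$^{-1}$. This is a small
sketch; the function name is illustrative and the speed of light is the standard
value in km/s.

```python
C_KM_PER_S = 299792.458  # speed of light in km/s


def hubble_radius_mpc(h):
    """Hubble radius r_H = c / H_0 in Mpc for H_0 = h * 100 km/s/Mpc."""
    return C_KM_PER_S / (100.0 * h)


if __name__ == "__main__":
    for h in (0.5, 0.7, 1.0):
        print(f"h = {h}: r_H = {hubble_radius_mpc(h):.0f} Mpc")
```

The result is about $3000\,h^{-1}$ Mpc, which sets the scale of a causally
connected region today in the approximation used here.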
1.1.2 Problems in Standard Cosmology
The standard Big Bang scenario suffers from several problems. I will now
describe some of the most important ones, known as the horizon problem, the
flatness problem and the monopole problem. At the end I will also explain the
problem of the formation of structure in the universe.
The horizon problem
The horizon problem is usually illustrated by comparing the size of causally
connected regions at the recombination epoch $t = t_{rec}$ and today $t = t_0$.
One has that
\[
\int_0^{t_{rec}} \frac{dt}{a(t)} \ll \int_{t_{rec}}^{t_0} \frac{dt}{a(t)}, \tag{1.14}
\]
where the left hand side is the distance that a ray of light could have
travelled from the beginning of the universe $t = 0$ to recombination
$t = t_{rec}$, and the right hand side is the distance to the last scattering
surface. (The recombination epoch is often called the last scattering surface,
as it is the most remote era that can be observed: it is where the photons were
last scattered by electrons before they free streamed to the present era.) This
means that any causally connected region at recombination was much smaller than
the size of the observable universe today. Still, the CMB, which was produced in
the recombination epoch, has a temperature which is uniform to 1 part in $10^5$
in all directions. The horizon problem is the problem of explaining how the CMB
can have the same temperature in two opposite directions, when these parts of
the universe have never been causally connected.
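
The size of the inequality (1.14) can be illustrated with a toy model. The
sketch below assumes a purely matter dominated universe, $a(t) = (t/t_0)^{2/3}$,
for which the integral of $dt/a$ has a closed form, and uses the round numbers
from the introduction ($t_{rec} \approx 3 \times 10^5$ years,
$t_0 \approx 1.2 \times 10^{10}$ years). Ignoring the radiation era and the
cosmological constant is an assumption made for simplicity.

```python
def comoving_distance(t_start, t_end, t0):
    """Integral of dt / a(t) for a(t) = (t / t0)**(2/3), with c = 1.

    Closed form: 3 * t0**(2/3) * (t_end**(1/3) - t_start**(1/3)).
    """
    return 3.0 * t0 ** (2.0 / 3.0) * (
        t_end ** (1.0 / 3.0) - t_start ** (1.0 / 3.0)
    )


T_REC = 3.0e5    # years, recombination (round number from the text)
T_0 = 1.2e10     # years, present age (round number from the text)

horizon_at_rec = comoving_distance(0.0, T_REC, T_0)   # left side of (1.14)
distance_to_lss = comoving_distance(T_REC, T_0, T_0)  # right side of (1.14)
ratio = horizon_at_rec / distance_to_lss

if __name__ == "__main__":
    print(f"left / right side of (1.14): {ratio:.4f}")
```

In this toy model the left hand side is about thirty times smaller than the
right hand side, so a causally connected patch at recombination spans only a
couple of degrees on the sky today.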
The flatness problem
By using equations (1.4) and (1.5), one finds
\[
\Omega(t) - 1 = \frac{k}{a^2(t) H^2(t)}. \tag{1.15}
\]
Using this equation at three epochs, $t_0$, $t_{eq}$ and $t$, where $t$ is now
some time before $t_{eq}$, one finds
\[
\Omega(t) - 1 = \left( \frac{a}{a_{eq}} \right)^2 \left( \frac{a_{eq}}{a_0} \right) (\Omega_0 - 1) \approx 10^4 (1 + z)^{-2} (\Omega_0 - 1), \tag{1.16}
\]
where $H^2 \propto a^{-3}$ and $H^2 \propto a^{-4}$ were used for the matter-
and radiation-dominated eras respectively. One sees that if the universe is not
flat ($k \neq 0$), then it was very close to flat at early times, because
$\Omega_0$ is of the order 1 today. To explain this one would need a process
which makes the universe very close to flat in the beginning. And if the
universe is exactly flat, one should be able to explain why. These questions are
known as the flatness problem.
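
Equation (1.16) can be evaluated directly to see how finely tuned the early
universe must have been. In the sketch below the function name, the default
$1 + z_{eq} = 10^4$ (the factor appearing in 1.16) and the example redshift are
illustrative assumptions; the formula only applies before $t_{eq}$.

```python
def omega_minus_one(z, omega0_minus_one, one_plus_z_eq=1.0e4):
    """Omega(t) - 1 at redshift z from equation (1.16).

    Valid for epochs before matter-radiation equality (z > z_eq).
    """
    return one_plus_z_eq * (1.0 + z) ** -2 * omega0_minus_one


if __name__ == "__main__":
    # Example: a universe with Omega_0 - 1 = 0.1 today, seen at 1 + z = 10^10.
    deviation = omega_minus_one(1.0e10 - 1.0, 0.1)
    print(f"Omega - 1 at 1 + z = 1e10: {deviation:.1e}")
```

A present-day deviation of order 0.1 corresponds to a deviation of order
$10^{-17}$ at $1 + z = 10^{10}$, which is the fine tuning the text refers to.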
The monopole problem
In the earliest stages of the evolution of the universe, the universe was so
dense and hot that one needs to take quantum field theory into account to
describe the physics of that era. As the universe went from a very hot to a
colder phase, several symmetries in field theory should have been broken,
for instance as the electroweak interaction split up into the electromagnetic
and the weak interaction. Under such symmetry breakings, field theory
predicts the production of magnetic monopoles. Most theories, however,
predict a very high density of such monopoles, so high that they should
have dominated the universe. The question of why one does not see any of
these monopoles today is known as the monopole problem.
Finally I will discuss some problems connected with the formation of large
scale structure in the universe. There are two main problems.
Primordial seeds
In the Big Bang model one assumes a completely homogeneous and isotropic
universe. There is no process which can produce structure on the scales at
which one sees structure today. The problem is twofold. First, one needs
some process which is able to create inhomogeneities. Second, as will be
discussed in sect. 1.3, a density perturbation with length $\lambda_0 = \lambda(t_0)$ today
grows as $\lambda(t) = \lambda_0 \, (a(t)/a_0)$. If one assumes $a(t) \propto t^n$ where $n < 1$, then
$\lambda \propto t^n$, whereas the Hubble radius $H^{-1} \propto t$. This means that at early times
(small $t$), the density perturbation was larger than a causally connected region.
There is no known physical process which could create such a perturbation.
Density perturbations in the CMB
The density perturbations at the recombination epoch, which were discovered
by observations of the CMB, have a density contrast of the order $10^{-5}$ relative
to the background density. Assuming linear evolution, these could not have
grown by more than a factor $z_{\rm rec}$ from decoupling until today. This means
that they should not have a density contrast higher than about $10^{-2}$ relative to
the background density today. Observations show that the contrast is of order
unity on scales of about $8h^{-1}$ Mpc. There must have been other
effects contributing to the evolution.
1.2 Theories of the Early Universe
In this section I will briefly review the theory of inflation, as this theory seems
to be able to solve the problems mentioned in the last section in a very natural
manner. It also provides a mechanism for production of structure in the very
early universe. This is important because one can then, together with linear
perturbation theory, predict some properties of the density fluctuations at the
recombination epoch, which is what one observes when studying the temperature
fluctuations in the CMB. I will also briefly discuss cosmic strings, which are an
alternative theory of structure formation.
1.2.1 Inflation
The idea of inflation comes from fundamental physics. Most theories of
fundamental physics predict the existence of one or more scalar fields in the universe
(for instance the Higgs field giving rise to the Higgs particle, whose existence is
predicted by field theory but not yet confirmed). This scalar field could have
undergone a phase transition in the early universe as the temperature of the
universe was dropping. Such a phase transition is expected in several Grand
Unified Theories (GUTs), where a higher symmetry group is broken down to
$SU(3) \times SU(2) \times U(1)$. In this phase transition the minimum of the potential
energy for the scalar field is shifted and the scalar field rolls or tunnels from the
previous minimum to the new one.
Figure 1.2: The potential of a scalar field
In Figure (1.2), I have sketched one possible potential for a scalar field $\phi$.
One sees that for a temperature $T$ higher than the critical temperature $T_c$, the
potential had a minimum at $\phi = 0$. After the temperature dropped below $T_c$,
a deeper minimum arose at a nonzero value of $\phi$. From this point, the scalar
field was not at its real minimum and the universe is said to have been in a false
vacuum. Then the scalar field rolled or tunneled down to the real minimum (or
real vacuum). At the end, the scalar field oscillated around the new minimum,
interacting with matter fields and creating particles. This is called the reheating
period, since the universe became hot again after it had cooled significantly
during the expansion period.
It can be shown that in such a phase transition, the equation of state of the
universe is $p = -\rho$. Looking at equation 1.4, one sees that the universe in this
case has positive acceleration. Moreover, if the phase transition occurs in such
a way that the energy density of the universe is roughly constant for a short
period of time, then equation 1.4 says that $\ddot a/a$ is constant. This means that
$a(t) \propto e^{At}$. The expansion of the universe becomes exponential. A short period
of exponential expansion in the very early universe can solve most of the
problems in the standard Big Bang scenario.
If one assumes that the scale factor during a short period of inflation increased
by a factor of $10^{30}$, one can immediately see how the flatness problem is solved.
By looking at equation (1.15), one can see that $\Omega$ is driven to 1 by such a large
growth of the scale factor. So inflation actually predicts the universe to be very
nearly flat. For this reason, one can set $k = 0$ in the Friedmann equations
(even if $k \neq 0$, the curvature term is so small compared to the other terms that
$k$ is effectively 0).
The horizon problem is also naturally solved in this scenario. By evaluating
the integrals in equation (1.14) using $a(t) \propto e^{Ht}$ in the inflationary epoch, one sees
that a light ray could have traveled much farther (in physical coordinates)
from the beginning of the universe until recombination than from recombination
until today. In other words, the whole observable universe today was inside a
causally connected region before inflation. This could explain the uniformity of
the CMB temperature. The monopole problem is similarly solved. During
inflation, the number of monopoles per comoving volume element was constant,
but the physical size of this volume element increased, so the physical density of
monopoles decreased by a large factor. Estimates with inflation predict of the
order of one monopole inside the current observable universe.
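The orders of magnitude involved can be made concrete with a small Python sketch. It takes the growth factor $10^{30}$ quoted above and assumes, as a simplification, that $H$ is exactly constant during inflation, so that $|\Omega - 1| \propto 1/(a^2 H^2)$ from equation (1.15) scales as $a^{-2}$, and that the physical monopole number density scales as $a^{-3}$. The function name is hypothetical.

```python
import math


def inflation_summary(growth_factor=1.0e30):
    """Return (e-folds, flatness suppression, monopole dilution).

    Assumes H is constant during inflation (a simplification).
    """
    n_efolds = math.log(growth_factor)          # a_end / a_start = exp(N)
    flatness_suppression = growth_factor**-2    # |Omega - 1| ∝ a^-2 at constant H
    monopole_dilution = growth_factor**-3       # physical number density ∝ a^-3
    return n_efolds, flatness_suppression, monopole_dilution


if __name__ == "__main__":
    n, flat, mono = inflation_summary()
    print(f"e-folds: {n:.1f}")
    print(f"|Omega - 1| reduced by a factor {flat:.0e}")
    print(f"monopole density reduced by a factor {mono:.0e}")
```

A growth factor of $10^{30}$ thus corresponds to roughly 69 e-folds of expansion.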
Finally, inflation can solve the problem of the origin of structure in the
universe. Before and during inflation, quantum fluctuations in the scalar field gave
some regions of space a higher field value $\phi + \delta\phi$ than the background.
These regions had a size of the order of a Planck length before inflation. During
inflation, the size of these regions (in physical coordinates) expanded with the
scale factor. At the end of inflation, these perturbations could have
been big enough to act as seeds for the large scale structure that the universe
exhibits today. One important thing to notice here is that most theories predict
these quantum fluctuations to be Gaussian distributed. This means that inflation
predicts Gaussian density perturbations in the early universe. The evolution of
the perturbations from the end of inflation to the recombination era is linear, so
the density fluctuations in that epoch should also be Gaussian. For this reason,
checking that the temperature fluctuations in the CMB are indeed Gaussian will
be an important test of inflation.
Inflation also makes a prediction for the scale dependence of the density
fluctuations. They should be almost scale invariant, meaning that fluctuations
should have roughly the same amplitude on all physical scales. To
understand this one needs to study how the fluctuations in the energy density
can arise during inflation. Because of the presence of a horizon, there is
confinement in space, which due to the uncertainty principle leads to fluctuations in the
energy/momentum of the fields present. The size of these fluctuations
depends on the horizon size, which is almost constant during inflation. This
means that the size of the fluctuations arising during inflation is the same
during the whole inflationary period. Inflation then expands these fluctuations
out of the causal horizon and they are frozen in. For this reason, the fluctuations
on all scales will have a constant amplitude when they reenter the horizon at a
later stage. This will be discussed further in section (1.3).
1.2.2 Cosmic Strings
As mentioned before, magnetic monopoles should have been created during phase
transitions in the early universe. Magnetic monopoles are just one
type of so-called topological defects. In this section I will briefly mention
another type, called cosmic strings. The theory of cosmic strings competed
with inflation as the main source of structure in the universe. In recent
years, this theory has been more or less ruled out, leaving inflation as
the main explanation for the origin of large scale structure. However, cosmic
strings could still have played a role, even if they did not give the main
contribution to structure formation. After all, theories of fundamental physics predict
that the phase transitions which give rise to cosmic strings have taken place,
and for this reason cosmic strings should also have been formed.
To explain what a cosmic string is, I will use as an example a complex scalar
field $\phi$ which, for a temperature $T$ higher than a critical temperature $T_c$, has its
minimum potential energy at $\phi = 0$. At this temperature the Lagrangian and
the vacuum expectation value $\langle\phi\rangle = 0$ are invariant under local U(1)
transformations $\phi \to \phi \, e^{i\theta(x)}$. At a temperature $T < T_c$, the potential energy has a new
minimum at a nonzero value of $|\phi|$. In this case, the U(1) symmetry is broken, as the vacuum
expectation value is no longer invariant under the transformation $\phi \to \phi \, e^{i\theta(x)}$.
The new vacuum expectation value $\langle\phi\rangle = |\langle\phi\rangle| \, e^{i\theta(x)}$ can have a different phase
at different locations (and will necessarily do so, as points in space which are
not causally connected cannot know what phase the other points have). But if
one starts at one point, goes around a loop in a two-dimensional plane and ends
up at the same point, the phase must change around this loop by an integer
multiple of $2\pi$. Say that one has such a loop with
$\Delta\theta = 2\pi$. If one shrinks this loop to a point, $\Delta\theta$ cannot change continuously
from $\Delta\theta = 2\pi$ to $\Delta\theta = 0$. For this reason, there must be a point in the middle
which does not have a phase, namely the old minimum $\langle\phi\rangle = 0$. So one ends
up with a string in space having a false vacuum; that is, the potential energy of
the scalar field along this string is higher than that for the true vacuum (where
the vacuum expectation value is at the minimum of the potential).
Cosmic strings can act as gravitational attractors and seed structure
formation. Cosmic strings lead to non-Gaussian density fluctuations and can in this
way be distinguished from the fluctuations created by inflation. Strings that formed
before inflation probably got so diluted during inflation that there are hardly any
left (this is what solved the monopole problem). If they formed after inflation,
they could still be around and to some extent have contributed to structure
formation in the early universe. If they are there, they should be visible in the CMB
as non-Gaussian features.
1.3 The Growth of Perturbations in the Early
Universe
In this section I will review how density fluctuations which arose in the very early
universe evolve. I will first discuss the linear regime (sect. 1.3.1), where the density
can be represented as $\rho = \rho_0 + \rho_1$, where $\rho_0$ is the background density
and $\rho_1$ is a perturbation with $\rho_1 \ll \rho_0$. I will discuss how the fluctuations evolved
during the radiation dominated era, in the matter dominated era and beyond the
recombination epoch. In section 1.3.2 I will briefly discuss the evolution when the
fluctuations became so large that linear perturbation theory becomes invalid.
Finally, the matter power spectrum, which is used to quantify the fluctuations, is
defined in section 1.3.3.
1.3.1 Linear Evolution
In the early universe, the density fluctuations were so small compared to the
background density that the use of linear perturbation theory is justified. Two
cases need to be distinguished. The perturbations which are significantly smaller
than the Hubble radius can be treated with Newtonian dynamics. The
superhorizon-sized perturbations, however, need a fully relativistic treatment.
I will start with the superhorizon-sized perturbations. When the superhorizon
perturbations are treated with the relativistic formalism, there is an additional
complication due to a gauge freedom. When doing perturbation theory, one works
with a uniform background spacetime and a physical spacetime which is
similar to the background spacetime but with small perturbations added to it. The
gauge freedom which arises is the freedom to choose the correspondence
between coordinates of points in the background spacetime and in the perturbed
physical spacetime. A gauge transformation is a change in this correspondence,
keeping the coordinates of the background spacetime fixed (Bardeen 1980). The
treatment of superhorizon-sized perturbations is traditionally done in the
synchronous gauge, but as this gauge has problems with coordinate singularities, the
Newtonian conformal gauge has been used in newer treatments. Some studies of
perturbation growth have used gauge invariant quantities. For a review
of these different approaches, see (Ma and Bertschinger 1995). To avoid these
problems, I will use some simple arguments instead of the complete relativistic
formalism to estimate the growth of superhorizon-sized perturbations. This gives
the same result as the relativistic approach.
Consider a flat ($k = 0$) universe with a uniform density $\rho_0$ and Hubble constant
$H$. The Friedmann equations give for this universe
$$H^2 = \frac{8\pi G}{3} \rho_0. \qquad (1.17)$$
In this universe, one has a superhorizon-sized area with a slightly different density
$\rho_0 + \rho_1$. This part of the universe can be treated as a part of the universe which
is no longer flat (because of the difference in density) and therefore has $k \neq 0$.
Inside this area one has
$$H^2 + \frac{k}{a^2} = \frac{8\pi G}{3} (\rho_0 + \rho_1). \qquad (1.18)$$
These two equations together give for the size of the density fluctuations described
by $\delta = \rho_1/\rho_0$
$$\delta \equiv \frac{\rho_1}{\rho_0} \propto (\rho_0 a^2)^{-1}. \qquad (1.19)$$
This gives for matter and radiation dominated universes $\delta \propto a$ and $\delta \propto a^2$
respectively.
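The step from equations (1.17) and (1.18) to (1.19) can be spelled out as follows. This is a sketch under the assumption, implicit above, that the two regions are compared at the same value of $H$; the background scalings $\rho_0 \propto a^{-3}$ (matter) and $\rho_0 \propto a^{-4}$ (radiation) are the standard ones.

```latex
\begin{align}
  \frac{k}{a^2} &= \frac{8\pi G}{3}\,\rho_1
    && \text{(subtracting (1.17) from (1.18))} \\
  \delta = \frac{\rho_1}{\rho_0} &= \frac{3k}{8\pi G}\,\frac{1}{\rho_0 a^2}
    \;\propto\; (\rho_0 a^2)^{-1} \\
  \rho_0 \propto a^{-3} &\;\Rightarrow\; \delta \propto a
    && \text{(matter dominated)} \\
  \rho_0 \propto a^{-4} &\;\Rightarrow\; \delta \propto a^{2}
    && \text{(radiation dominated)}
\end{align}
```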
As the fluctuations grow, their sizes will at some point become smaller than
the horizon at a given epoch. They enter the horizon at a time $t_{\rm enter}$. When the
fluctuations have entered the horizon, one can use Newtonian physics to describe
their further evolution by using the equations of motion for an expanding perfect
fluid. The Eulerian equations for a perfect fluid are given as
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0, \qquad (1.20)$$
$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} + \frac{1}{\rho}\nabla p + \nabla\Phi = 0, \qquad (1.21)$$
$$\nabla^2\Phi = 4\pi G\rho, \qquad (1.22)$$
where the expansion of the universe is not taken into account. This will be
included later. Here $\rho = \rho(\mathbf{r}, t)$, $p = p(\mathbf{r}, t)$ and $\Phi = \Phi(\mathbf{r})$ are the density, pressure
and gravitational potential respectively at position $\mathbf{r}$ at time $t$. The fluid velocity
is $\mathbf{v} = \mathbf{v}(\mathbf{r}, t)$. Now consider small perturbations of these quantities around
the background value. One can denote the background by subscript 0 and the
perturbation by subscript 1. One can then use
$$X = X_0 + X_1 \qquad (1.23)$$
in the Eulerian equations. Here $X$ can be $\rho$, $p$, $\mathbf{v}$ and $\Phi$. The resulting equations
for the perturbations are
$$\frac{\partial\rho_1}{\partial t} + \rho_0 \nabla\cdot\mathbf{v}_1 = 0, \qquad (1.24)$$
$$\frac{\partial\mathbf{v}_1}{\partial t} + \frac{v_s^2}{\rho_0}\nabla\rho_1 + \nabla\Phi_1 = 0, \qquad (1.25)$$
$$\nabla^2\Phi_1 = 4\pi G\rho_1. \qquad (1.26)$$
Here $v_s$ is the speed of sound,
$$v_s^2 \equiv \left(\frac{\partial p}{\partial\rho}\right)_{\rm adiabatic} = \frac{p_1}{\rho_1}, \qquad (1.27)$$
where the last equality comes about because of the assumption that there are no
spatial variations in the equation of state. Combining these equations one gets
the differential equation
$$\frac{\partial^2\rho_1}{\partial t^2} - v_s^2 \nabla^2\rho_1 = 4\pi G\rho_0\rho_1. \qquad (1.28)$$
Solving for the density perturbation $\rho_1$, one gets
$$\rho_1(\mathbf{r}, t) = A \rho_0 \, e^{-i\mathbf{k}\cdot\mathbf{r} + i\omega t}. \qquad (1.29)$$
Inserting this solution into the differential equation (equation 1.28) one finds the
dispersion relation
$$\omega^2 = v_s^2 k^2 - 4\pi G\rho_0. \qquad (1.30)$$
Here $k = |\mathbf{k}|$. From the dispersion relation (equation 1.30), one easily sees that
for
$$k < k_J, \qquad k_J = \sqrt{\frac{4\pi G\rho_0}{v_s^2}}, \qquad (1.31)$$
one gets imaginary values for $\omega$. The wavenumber $k_J$ is called the Jeans
wavenumber and determines whether $\omega$ is real or imaginary. In the case of an
imaginary $\omega$, one sees that the solution (equation 1.29) is exponentially growing with
time. In the other case, the perturbation is oscillating. So the Jeans wavelength
$\lambda_J = 2\pi/k_J$ is the limiting wavelength. Perturbations bigger than this will be
exponentially growing and perturbations which are smaller will be oscillating.
Physically one can understand this as the battle between gravitation and
pressure. Gravitation is trying to make the perturbation bigger, whereas pressure is
trying to make it smaller by resisting contraction. When the size of the
perturbation is bigger than the Jeans wavelength, the force of gravity is big enough to
win over pressure.
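The Jeans criterion for the static case can be illustrated with a few lines of Python. This is a minimal sketch of the dispersion relation $\omega^2 = v_s^2 k^2 - 4\pi G\rho_0$ and of $k_J = \sqrt{4\pi G\rho_0}/v_s$; the function names and the example values of density and sound speed are hypothetical and chosen only to exercise the formulas (SI units are assumed).

```python
import math

G_NEWTON = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def jeans_wavenumber(rho0, v_s):
    """Jeans wavenumber k_J = sqrt(4 pi G rho0) / v_s for a static medium."""
    return math.sqrt(4.0 * math.pi * G_NEWTON * rho0) / v_s


def omega_squared(k, rho0, v_s):
    """Dispersion relation omega^2 = v_s^2 k^2 - 4 pi G rho0."""
    return v_s**2 * k**2 - 4.0 * math.pi * G_NEWTON * rho0


def mode_behaviour(k, rho0, v_s):
    """'growing' if omega is imaginary (k < k_J), otherwise 'oscillating'."""
    return "growing" if omega_squared(k, rho0, v_s) < 0.0 else "oscillating"


if __name__ == "__main__":
    rho0, v_s = 1.0e-18, 200.0            # example values only (kg/m^3, m/s)
    k_j = jeans_wavenumber(rho0, v_s)
    print("Jeans wavelength [m]:", 2.0 * math.pi / k_j)
    print("k = 0.5 k_J:", mode_behaviour(0.5 * k_j, rho0, v_s))
    print("k = 2.0 k_J:", mode_behaviour(2.0 * k_j, rho0, v_s))
```

Modes with wavelengths longer than the Jeans wavelength are classified as growing, shorter ones as oscillating, in line with the discussion above.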
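To make this analysis cosmologically relevant, the expansion of the universe
has to be taken into account. For an expanding universe, the zeroth order solution
to the fluid equations is
$$\rho_0 = \rho_0(t_0) a^{-3}, \qquad (1.32)$$
$$\mathbf{v}_0 = H\mathbf{r}, \qquad (1.33)$$
$$\nabla\Phi_0 = \frac{4\pi G\rho_0}{3}\mathbf{r}. \qquad (1.34)$$
Perturbing around this solution,
$$\rho = \rho_0(t_0) a^{-3} + \rho_1, \qquad (1.35)$$
$$\mathbf{v} = H\mathbf{r} + \mathbf{v}_1, \qquad (1.36)$$
$$\nabla\Phi = \frac{4\pi G\rho_0}{3}\mathbf{r} + \nabla\Phi_1, \qquad (1.37)$$
and solving the Eulerian equations after Fourier transforming, one finds the
differential equation
$$\ddot\delta_k + 2H\dot\delta_k + \left(\frac{v_s^2 k^2}{a^2} - 4\pi G\rho_0\right)\delta_k = 0. \qquad (1.38)$$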
Here $\delta_k$ is the Fourier transform of the perturbation $\delta = \rho_1/\rho_0$,
$$\delta(\mathbf{r}, t) = (2\pi)^{-3} V \int \delta_k(t) \, e^{-i\mathbf{k}\cdot\mathbf{r}/a(t)} \, d^3k, \qquad (1.39)$$
where $V$ is the volume of integration. As before, the sign of the coefficient of the
last term in equation (1.38) determines whether one gets growing or oscillating
solutions. The Jeans wavenumber in the expanding universe is given by
$$k_J = \sqrt{\frac{4\pi G\rho_0 a^2}{v_s^2}}. \qquad (1.40)$$
If $k \ll k_J$ the perturbation is growing. In this case one can write equation (1.38)
as
$$\ddot\delta + 2H\dot\delta - 4\pi G\rho_0\delta = 0, \qquad (1.41)$$
where the $k$ subscript is dropped, as this is valid for both $\delta$ and $\delta_k$. The exact
solution depends on the parameters of the universe. In a flat $\Omega = 1$
universe in the matter dominated era, $H = \frac{2}{3} t^{-1}$ and $\rho_0 = \frac{1}{6\pi G} t^{-2}$. Trying the
solution $\delta = t^n$ in equation (1.41), one finds that $\delta \propto t^{2/3}$ is a
growing mode solution. Apparently the expansion of the universe prevents the fast
exponential growth of perturbations found in the non-expanding model described above.
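The power-law solutions can be checked directly. Inserting $\delta = t^n$ with $H = h/t$ and $4\pi G\rho_0 = s/t^2$ into equation (1.41) gives the quadratic $n^2 + (2h - 1)n - s = 0$; for the flat matter dominated case, $h = 2/3$ and $s = 4\pi G \cdot (6\pi G)^{-1} = 2/3$. The Python sketch below solves this quadratic; the function name and the parameters `h` and `s` are notation introduced here for illustration.

```python
import math


def power_law_exponents(h=2.0 / 3.0, s=2.0 / 3.0):
    """Exponents n of delta ∝ t^n solving n^2 + (2h - 1) n - s = 0.

    h: coefficient in H = h / t; s: coefficient in 4 pi G rho0 = s / t^2.
    Returns (growing, decaying).
    """
    b = 2.0 * h - 1.0
    root = math.sqrt(b * b + 4.0 * s)
    return (-b + root) / 2.0, (-b - root) / 2.0


if __name__ == "__main__":
    growing, decaying = power_law_exponents()
    print("growing mode exponent:", growing)
    print("decaying mode exponent:", decaying)
```

Besides the growing mode $t^{2/3}$, the quadratic also has a decaying solution $\delta \propto t^{-1}$.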
To study the evolution in the radiation dominated era, one has to take into
consideration that there are different types of particles present. In this case,
the total mass density $\rho_0$ is a sum over the densities $\rho_j$ of each species $j$ present.
For a species $i$ which is non-relativistic (this is what was assumed in the above
derivation), equation (1.41) can then be written
$$\ddot\delta_i + 2H\dot\delta_i - 4\pi G\rho_0 \sum_j \epsilon_j\delta_j = 0, \qquad (1.42)$$
where the sum over $j$ is a sum over the non-relativistic species and $\epsilon_j = \rho_j/\rho_0$.
Assuming that there are only two species, photons and the non-relativistic species
$i$, the equation can be written as ($\epsilon_j \ll 1$, as photons dominate in the
radiation dominated era)
$$\ddot\delta + \frac{1}{t}\dot\delta = 0, \qquad (1.43)$$
with the solution $\delta \propto \ln t$. This shows that in the radiation dominated era, the
perturbations can hardly grow.
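The logarithmic solution follows in two lines. The sketch below assumes the standard radiation dominated background, $a \propto t^{1/2}$, which is not derived in this section; $A$ and $B$ are integration constants introduced here.

```latex
\begin{align}
  a \propto t^{1/2} \;&\Rightarrow\; H = \frac{\dot a}{a} = \frac{1}{2t},
     \qquad 2H = \frac{1}{t}, \\
  \ddot\delta + \frac{1}{t}\,\dot\delta
     = \frac{1}{t}\,\frac{d}{dt}\!\left(t\,\dot\delta\right) = 0
  \;&\Rightarrow\; t\,\dot\delta = B
  \;\Rightarrow\; \delta(t) = A + B \ln t .
\end{align}
```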
From the above analysis, it seems that a significant growth of perturbations
will be postponed until after $t = t_{\rm eq}$. After matter-radiation equality, the
perturbations could in principle have started growing, but only if there were perturbations
present which were big enough to start growing. If one assumes a spherical
density perturbation of size $\lambda$, the mass of this perturbation is (assuming uniform
density)
$$M = \frac{4\pi}{3}\rho\left(\frac{\lambda}{2}\right)^3. \qquad (1.44)$$
For this perturbation to start growing, its size must be bigger than the Jeans
length $\lambda_J$, or stated differently, the mass of the perturbation $M$ must be bigger
than the Jeans mass $M_J$, corresponding to the mass of a spherical perturbation
with length $\lambda_J$. Assuming a model with two species, baryons and photons, it
can be shown that before recombination, when the photons are coupled to the
baryons, the Jeans mass is bigger than the total mass inside the horizon. Hence,
the perturbations cannot start growing, as there are no perturbations within the
horizon big enough to have a stable growing mode. After recombination,
when the pressure is dominated by the hydrogen atoms, the Jeans mass becomes
considerably smaller and perturbations can start growing.
From observations of the CMB, the density fluctuations in the recombination
epoch were of the order $10^{-5}$ and, as mentioned in section (1.1.2), this is not
big enough for the perturbations to be able to grow to their current size today.
From the analysis in this section, one can see a solution to this problem. If there
was a particle species present which does not couple to the photons, density
perturbations in this species could have started growing just after $t = t_{\rm eq}$ and
before recombination. The non-baryonic dark matter, whose existence has been
indicated by several observations (e.g. of rotation curves of galaxies), could be
such a species. As the species is not coupled to the photons, the evolution of
its density perturbations was independent of the photon pressure, and they could
therefore have started growing before recombination. The perturbations in this species
would have grown between $t = t_{\rm eq}$ and $t = t_{\rm rec}$. After recombination, when the
baryonic matter perturbations started growing, the gravitational force from the
density perturbations in the non-baryonic component would increase the speed
of the growth of baryonic density perturbations.
1.3.2 Nonlinear Evolution
The linear perturbation theory for density fluctuations in the early universe can
be used when the density perturbations are small, $\delta \ll 1$. At some point, as
the perturbations were growing, they reached a density which was
considerably higher than the background density of the universe, and linear perturbation
theory breaks down. In the nonlinear regime, the evolution of the density
perturbations was very complex. Here I will just briefly consider the properties of
this nonlinear evolution.
The model used in the beginning of sect. 1.3.1, describing a flat universe
($k = 0$) containing a region with slightly different density, can be used here. In
that model, the flat universe has a scale factor $a(t)$ and the part of the universe
containing an overdense region has a scale factor $a'(t)$. As I now describe a region
which has a higher density than the surrounding flat universe, this region will
expand and eventually collapse, as its density is higher than the critical one. The
evolution of the scale factor $a'$ in this closed universe with Hubble constant $H'$
and density parameter $\Omega'$ can be represented by a parameterized set of equations,
$$\frac{a'}{a_i} = (1 - \cos\theta)\frac{\Omega_i}{2(\Omega_i - 1)}, \qquad (1.45)$$
$$H_i t = (\theta - \sin\theta)\frac{\Omega_i}{2(\Omega_i - 1)^{3/2}}. \qquad (1.46)$$
Here the index $i$ describes the parameters at some fixed time $t_i$. Knowing the
parameters $\Omega_i$, $a_i$ and $H_i$, one can use the equations to find the expansion
parameter $a'$ at any given time. The density fluctuation will have reached its maximum
expansion when $\theta = \pi$, giving
$$\frac{a_{\max}}{a_i} = \frac{\Omega_i}{\Omega_i - 1}, \qquad (1.47)$$
$$H_i t_{\max} = \frac{\pi}{2}\frac{\Omega_i}{(\Omega_i - 1)^{3/2}} \approx \frac{\pi}{2}\left(\frac{a_{\max}}{a_i}\right)^{3/2}, \qquad (1.48)$$
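with a density
$$\rho'_{\max} = \rho'_i\left(\frac{a_i}{a_{\max}}\right)^3 = \frac{3\pi}{32 G t_{\max}^2}\left(\frac{\rho'}{\rho_c}\right)_i. \qquad (1.49)$$
Taking the time $t_i$ so early that the density was the same as the background
density, $\rho'_i = \rho_i = (\rho_c)_i$, the density at the maximum expansion of
the perturbation is
$$\rho'_{\max} = \frac{3\pi}{32 G t_{\max}^2}, \qquad (1.50)$$
giving, with the background density $\rho_{\max} = (6\pi G t_{\max}^2)^{-1}$ of the flat matter
dominated universe at $t_{\max}$,
$$\delta_{\max} = \frac{\rho'_{\max}}{\rho_{\max}} - 1 = \frac{9\pi^2}{16} - 1 \approx 4.55. \qquad (1.51)$$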
At the maximum expansion, the kinetic energy of the overdense region is
$E_k = 0$, whereas the gravitational energy is $E_g = -C/r_{\max}$, where $C$ is a constant.
After the maximum expansion, the region starts collapsing and collapses until
$E_k = -E_g/2$. This is called virialisation, and when the overdense region has
virialised it has a gravitational energy of $E_g = -C/r_{\rm vir}$. Equating the total energy
$E_t = E_k + E_g$ at maximum expansion and after virialisation, one finds that $r_{\rm vir} = r_{\max}/2$,
meaning that the density has increased by a factor of 8 from maximum expansion
to virialisation.
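The numbers in this subsection can be reproduced with a short Python sketch. It evaluates the parametric solution in the form reconstructed here, with a factor 2 in the denominator of equation (1.45) so that $\theta = \pi$ reproduces equation (1.47), and computes the overdensity at maximum expansion from equation (1.50) together with the flat matter dominated background density $1/(6\pi G t^2)$. Function names are hypothetical, and both densities are expressed in units of $1/(G t_{\max}^2)$.

```python
import math


def scale_factor_ratio(theta, omega_i):
    """a'/a_i = (1 - cos theta) Omega_i / (2 (Omega_i - 1)), cf. eq. (1.45)."""
    return (1.0 - math.cos(theta)) * omega_i / (2.0 * (omega_i - 1.0))


def hubble_time(theta, omega_i):
    """H_i t = (theta - sin theta) Omega_i / (2 (Omega_i - 1)^(3/2)), cf. eq. (1.46)."""
    return (theta - math.sin(theta)) * omega_i / (2.0 * (omega_i - 1.0) ** 1.5)


def turnaround_overdensity():
    """delta_max = rho'_max / rho_max - 1 at maximum expansion."""
    rho_region = 3.0 * math.pi / 32.0        # eq. (1.50), units of 1/(G t_max^2)
    rho_background = 1.0 / (6.0 * math.pi)   # flat matter dominated background
    return rho_region / rho_background - 1.0


def virial_density_factor():
    """Density increase from r_max to r_vir = r_max / 2 at fixed mass."""
    return (1.0 / 0.5) ** 3


if __name__ == "__main__":
    omega_i = 1.01   # example initial density parameter of the overdense region
    print("a_max / a_i:", scale_factor_ratio(math.pi, omega_i))
    print("H_i t_max:", hubble_time(math.pi, omega_i))
    print("delta_max:", turnaround_overdensity())
    print("density increase at virialisation:", virial_density_factor())
```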
1.3.3 The Matter Power Spectrum
In the previous sections I have described how the structures in the universe might
have arisen in the very early universe and how they should have evolved to the
present era. The next step is to find some way of characterising this structure
so that the theoretical predictions can be compared to observations. The density
fluctuation $\delta(\mathbf{x})$ at a position $\mathbf{x}$ in the universe at some early time $t$ is a random
variable. I will in the following assume that this is a Gaussian distributed
random variable with ensemble averaged mean value $\langle\delta\rangle = 0$. This is the
prediction of inflation. As will be discussed, parts of the formalism are also valid if
these assumptions do not hold.
The first thing I will define is the autocorrelation function describing the
ensemble average correlation between $\delta(\mathbf{x})$ and the fluctuation at some other
point, $\delta(\mathbf{x} + \mathbf{r})$, displaced by a vector $\mathbf{r}$. This is simply defined as
$$\xi(\mathbf{r}) = \langle\delta(\mathbf{x})\delta(\mathbf{x} + \mathbf{r})\rangle. \qquad (1.52)$$
This can easily be Fourier transformed to give
$$\xi(\mathbf{r}) = (2\pi)^{-3} V^{-1} \int |\delta_k|^2 e^{-i\mathbf{k}\cdot\mathbf{r}} \, d^3k, \qquad (1.53)$$
$$|\delta_k|^2 = V \int \xi(\mathbf{r}) \, e^{i\mathbf{k}\cdot\mathbf{r}} \, d^3r, \qquad (1.54)$$
where $\delta_k$ is the Fourier transformed density fluctuation given by equation (1.39)
and $V$ is the integration volume. The power spectrum of matter density
fluctuations is defined in this way as $P(k) = |\delta_k|^2$. Since the density fluctuations evolve
in time due to gravitational interactions, so does the power spectrum. The initial
power spectrum is often assumed to be a power law of the form $P(k) = A V k^n$,
where $n$ is the spectral index and $A$ is the amplitude.
Since $\langle\delta\rangle$ is zero, one can calculate the variance $\sigma^2 = \langle\delta^2\rangle$ of the
fluctuations. Some algebra gives
$$\sigma^2 = \frac{1}{2\pi^2} \int_0^\infty P(k) k^2 \, dk. \qquad (1.55)$$
This is the integral of the power spectrum over $k$-space. Unfortunately the
quantity $\sigma^2$ does not say anything about the relative contribution from different scales
$k$. For this reason, it is convenient to define the mass variance for a given length
scale $R$. Defining a sphere of radius $R$, the mean mass $\langle M\rangle$ found inside such
a sphere is
$$\langle M\rangle = \langle\rho\rangle V = \frac{4\pi}{3}\langle\rho\rangle R^3. \qquad (1.56)$$
This leads to the definition of the mass variance for a certain mass scale $M$,
$$\sigma_M^2 = \frac{(\delta M)^2}{\langle M\rangle^2} = \frac{\langle (M - \langle M\rangle)^2\rangle}{\langle M\rangle^2}. \qquad (1.57)$$
For the power law initial power spectrum, the initial mass variance for a mass
scale $M$ can be shown to scale as
$$\sigma_M \propto k^{3/2}\delta_k \propto k^{3\alpha} \propto M^{-\alpha}, \qquad (1.58)$$
where $\alpha = (3 + n)/6$ for the power law power spectrum.
As discussed in the section on inflation (sect. 1.2.1), the perturbations
produced in inflation are supposed to have the same amplitude at all scales. They
are supposed to be scale invariant. In this case, this refers to the perturbations
produced in the gravitational potential $\Phi$. These perturbations can be estimated
using the shape of the Newtonian gravitational potential,
$$\delta\Phi \simeq \frac{G \, \delta M}{\lambda_M} \propto \frac{\delta M}{M}\frac{M}{R} \propto \sigma_M M^{2/3}. \qquad (1.59)$$
Looking at equation (1.58), one sees that for the initial power spectrum of
perturbations to be scale invariant, $\alpha$ must be 2/3 and therefore the spectral index
$n$ must be one.
The evolution of the power spectrum with time can be found by studying
the evolution of $\delta$ with time. In linear perturbation theory, one can describe the
evolution of $\delta_k(t)$ as
$$\delta_k(t) = T_k(t, t_i) \, \delta_k(t_i). \qquad (1.60)$$
Here $T_k(t, t_i)$ is the transfer function, which describes the evolution of the $k$ mode
of density perturbations from some initial time $t_i$ to time $t$. This gives for the
power spectrum
$$P(k, t) = |T_k(t, t_i)|^2 P(k, t_i). \qquad (1.61)$$
In the linear regime, one can find the transfer functions using the linear theory
of the evolution of density perturbations outlined above. In the nonlinear regime,
finding the transfer function is nontrivial, as it is not quite known which physical
processes were dominant during the epoch of nonlinear growth of perturbations.
Finally, when using the linear theory one finds that the power spectrum of
fluctuations at the time when the fluctuation with wavenumber $k$ entered the
horizon is given by
$$P(k, t_{\rm enter}) \propto k^{n-4}, \qquad (1.62)$$
so that, using $\sigma_M \propto k^{3/2}\delta_k$ from equation (1.58),
$$\sigma_M(t_{\rm enter}) \propto k^{(n-1)/2}, \qquad (1.63)$$
which for the Harrison-Zeldovich spectrum is independent of $k$. This means that
when $n = 1$, all fluctuations entered the horizon with the same amplitude.
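The link between scale invariance and the spectral index can be checked with exact arithmetic. The Python sketch below uses $\alpha = (3 + n)/6$ from equation (1.58) and the scaling $\delta\Phi \propto \sigma_M M^{2/3} \propto M^{2/3 - \alpha}$ from equation (1.59); the function names are introduced here for illustration only.

```python
from fractions import Fraction


def mass_variance_index(n):
    """alpha in sigma_M ∝ M^(-alpha), with alpha = (3 + n) / 6."""
    return (3 + Fraction(n)) / 6


def potential_mass_exponent(n):
    """Exponent of M in delta Phi ∝ sigma_M * M^(2/3) ∝ M^(2/3 - alpha)."""
    return Fraction(2, 3) - mass_variance_index(n)


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(f"n = {n}: alpha = {mass_variance_index(n)}, "
              f"delta Phi ∝ M^({potential_mass_exponent(n)})")
```

Only $n = 1$ gives a vanishing exponent, that is, potential fluctuations that are independent of the mass scale.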
1.4 The Recombination Era and the CMB
In this section I will discuss how the primary anisotropies in the CMB arose
(sect. 1.4.1). I will describe how the linearized Einstein equations for the density
perturbations can be used together with the Boltzmann equation to predict
the anisotropies at the last scattering surface. Then in section 1.4.2 I will define
the CMB power spectrum and describe how it varies with some of the cosmological
parameters. The small scale secondary anisotropies which arose when the CMB
photons passed through the reionized part of the universe will be briefly described
in section 1.4.3.
1.4.1 The Origin of the CMB Anisotropies
To predict the shape of the temperature fluctuations $\Delta T/T$ observed in the CMB,
one has to apply the Boltzmann equation describing the evolution of the photon
distribution function in a photon-baryon gas with Compton scattering. The linear
density perturbations will now be described using the gauge invariant potentials
$\Psi$ and $\Phi$ described in (Bardeen 1980). These are related to the total density
perturbations $\delta$ and the anisotropic stress $\Pi$ by
$$k^2\Phi = 4\pi G\rho_0\left(\frac{a}{a_0}\right)^2\delta_k, \qquad (1.64)$$
$$\Phi + \Psi = -\frac{8\pi G p \, \Pi}{k^2}\left(\frac{a}{a_0}\right)^2. \qquad (1.65)$$
Here (and in the rest of the section), $\delta_k$, $\Phi$, $\Psi$ and $\Pi$ denote the Fourier
transforms of the quantities $\delta$, $\Phi$, $\Psi$ and $\Pi$ respectively. The scale factor is as before $a$,
the pressure is $p$. In a completely matter dominated universe, where the pressure
$p$ is unimportant, one sees that the two potentials are related by $\Psi = -\Phi$.
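The temperature anisotropy at position $\mathbf{x}$ in space at conformal time
$\eta = \int dt \, (a_0/a)$ in the direction $\mathbf{n}$ will now be denoted $\Theta(\mathbf{x}, \mathbf{n}, \eta)$, and its Fourier
transform $\Theta_k(\eta, \mu)$, where $k = |\mathbf{k}|$ and $\mu = \mathbf{k}\cdot\mathbf{n}/k$. The Legendre expansion is
given by
$$\Theta_k(\eta, \mu) = \sum_\ell (-i)^\ell \, \Theta_{k\ell}(\eta) P_\ell(\mu), \qquad (1.66)$$
where $P_\ell(\mu)$ is the Legendre polynomial of order $\ell$. Using these definitions, the
Boltzmann equation for the temperature anisotropies becomes
$$\dot\Theta_k(\eta, \mu) + ik\mu\left(\Theta_k(\eta, \mu) + \Psi\right) = -\dot\Phi + \dot\tau(\eta)\left[\Theta_{k0}(\eta) - \Theta_k(\eta, \mu) - \frac{1}{10}\Theta_{k2}(\eta) P_2(\mu) - i\mu V_k(\eta)\right]. \qquad (1.67)$$
Here a dot represents the derivative with respect to conformal time, $V_k$ is the
Fourier transform of the baryon velocity and $\dot\tau = x_e n_e \sigma_T a/a_0$ is the differential
optical depth to Thomson scattering. The ionization fraction is given by $x_e$, the
electron density by $n_e$ and the Thomson cross section by $\sigma_T$.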
The evolution of the linear density perturbations in baryons, $\delta_b$, and photons,
$\delta_\gamma$, is given again by the continuity and Euler equations
$$(\dot\delta_b)_k = -k(V_k - \Theta_{k1}) + \frac{3}{4}(\dot\delta_\gamma)_k, \qquad (1.68)$$
$$\dot V_k = -\frac{\dot a}{a} V_k + k\Psi + \frac{\dot\tau \, (\Theta_{k1} - V_k)}{R}. \qquad (1.69)$$
Here $R = 3\rho_b/(4\rho_\gamma)$. The universe is now supposed to consist of three species: the
photons, the baryons and the dark matter. The density fluctuations in the
coupled baryon-photon fluid are described by the Boltzmann equation (1.67) and the
continuity and Euler equations (1.68, 1.69). The dark matter is not coupled to any of these
species and only shows up through its influence on the gravitational potentials
$\Phi$ and $\Psi$.
Before recombination the photons were continuously scattered on the
electrons. In such a case the photon distribution is isotropic in the electron rest
frame, so the dipole $\Theta_{k1}$ is only caused by the movement of the baryons,
$\Theta_{k1} = V_k$.
Inserting this into equation (1.68) one gets $(\dot\delta_b)_k = \frac{3}{4}(\dot\delta_\gamma)_k$, which is the
equation for adiabatic evolution of the density fluctuations. This also gives that
$\Theta_{k\ell} = 0$ for $\ell \geq 2$. This is called the tight coupling limit.
Close to the tight coupling limit one can use the tight coupling
approximation. In this approximation equations (1.67), (1.68) and (1.69) can be expanded
to first order in the Compton scattering time $\dot\tau^{-1}$. This gives the resulting
differential equation
$$\ddot\Theta_{k0} + \frac{\dot a}{a}\frac{R}{1 + R}\dot\Theta_{k0} + k^2 c_s^2 \Theta_{k0} = F(\eta), \qquad (1.70)$$
where $F(\eta)$ is a term containing the gravitational potentials and $c_s$ is the sound
speed
$$c_s = \sqrt{\frac{1}{3}\,\frac{1}{1 + R}}. \qquad (1.71)$$
This is the equation of a forced damped oscillator. The gravitational attraction of
the overdense regions pulls the matter into these regions, but the photon pressure
resists the contraction, giving acoustic oscillations in the photon-baryon fluid.
The solution to equation (1.70) can be written as
$$\hat\Theta_{k0}(\eta) = a\cos kr_s(\eta) + b\sin kr_s(\eta) + c(F(\eta)), \qquad (1.72)$$
where the hat denotes that this is the solution in the tight coupling
approximation. The dipole goes as $k\hat\Theta_{k1} = -3(\dot{\hat\Theta}_{k0} + \dot\Phi)$, which means that it oscillates
with an angle $\pi/2$ out of phase with the monopole. Here $r_s$ is the sound horizon,
given as
$$r_s(\eta) = \int_0^\eta c_s \, d\eta'. \qquad (1.73)$$
Clearly, the amplitude of the dipole will be damped by a factor $\dot r_s \propto (1 + R)^{-1/2}$
with respect to the amplitude of the monopole.
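To make the sound horizon integral concrete, the following Python sketch evaluates the photon-baryon sound speed in its standard form, $c_s = 1/\sqrt{3(1 + R)}$ in units of the speed of light, and integrates it with the trapezoidal rule. The function names are hypothetical, and the baryon loading $R(\eta)$ must be supplied by the caller; the linear form used in the example is purely illustrative and not a fit to any cosmological model.

```python
import math


def sound_speed(R):
    """Photon-baryon sound speed c_s = 1 / sqrt(3 (1 + R)), units of c."""
    return 1.0 / math.sqrt(3.0 * (1.0 + R))


def sound_horizon(eta, R_of_eta, steps=10_000):
    """Trapezoidal estimate of r_s(eta) = integral of c_s from 0 to eta.

    R_of_eta: caller-supplied function giving R = 3 rho_b / (4 rho_gamma)
    as a function of conformal time.
    """
    h = eta / steps
    total = 0.5 * (sound_speed(R_of_eta(0.0)) + sound_speed(R_of_eta(eta)))
    for i in range(1, steps):
        total += sound_speed(R_of_eta(i * h))
    return total * h


if __name__ == "__main__":
    eta_rec = 1.0  # conformal time at recombination in arbitrary units
    print("no baryons:   r_s =", sound_horizon(eta_rec, lambda eta: 0.0))
    print("with baryons: r_s =", sound_horizon(eta_rec, lambda eta: 0.6 * eta))
```

Adding baryons lowers the sound speed and therefore shrinks the sound horizon, which shifts the acoustic oscillations $\cos kr_s$ to larger $k$.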
In equation (1.72), the cos-term comes from adiabatic perturbations and the
sin-term from isocurvature perturbations. Adiabatic fluctuations are
perturbations for which the entropy is constant, which fixes the relation between the baryon
and photon perturbations to $\delta_b = \frac{3}{4}\delta_\gamma$. This kind of fluctuation is predicted
by inflation. The isocurvature perturbations are perturbations in the entropy.
For these perturbations, which are expected in cosmic string models, $\delta\rho_b = -\delta\rho_\gamma$,
so that the total energy density is constant. Recent observations indicate that
adiabatic perturbations are dominant, so that $b \approx 0$ in equation (1.72). For this
reason I will mainly discuss adiabatic perturbations in this thesis.
One sees that different k modes oscillate with different frequencies. At
recombination, $\eta = \eta_{rec}$, the monopole for some scales k is at a maximum or
minimum of its oscillation, and these scales show up as peaks in the CMB
temperature anisotropy power spectrum. For some other scales the monopole is
zero, and the CMB temperature power spectrum has a trough, which is not zero
because of the dipole (which has its maximum/minimum here due to the $\pi/2$ phase
shift). The peaks and troughs in the temperature power spectrum resulting from
these oscillations will be described in more detail in section (1.4.2).
When recombination started, the mean free path of the photons became longer
as there were fewer and fewer electrons present on which they could scatter. The
photons were able to diffuse through the baryons, and in this way the
temperature anisotropies at small scales were reduced. This is called Silk
damping (Silk 1968). Another result of the increasing diffusion length is that
the CMB photons that one can observe did not last scatter at the same time. The
last scattering surface has a finite width, and the photons come from different
depths in the surface where the oscillations in the monopole $\Theta_{k0}$ had
different phases. This has the similar effect of smearing out the temperature
anisotropies on small scales.
When scattering became less frequent, the tight coupling limit broke down,
and equations (1.67), (1.68) and (1.69) have to be expanded to second order
in the Compton scattering time $\dot\tau^{-1}$. The solution to this set of
equations can be written as
$$(\Theta_{k0} + \Psi) = (\hat\Theta_{k0} + \Psi)\, e^{-(k/k_D(\eta))^2}, \tag{1.74}$$
where $k_D(\eta)$ depends on the diffusion length of the photons at time $\eta$. This
exponential damping of small scales is caused by the above mentioned effects of
photon diffusion and finite width of the last scattering surface.
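
To get a feeling for how sharply equation (1.74) cuts off small scales, one can
tabulate the damping factor for a few values of k/k_D. This is a minimal
sketch; the function name is hypothetical and k_D is treated as a given number
rather than computed from a recombination history.

```python
import math

def damping_factor(k, k_D):
    """Suppression exp(-(k/k_D)^2) multiplying the tight-coupling solution in (1.74)."""
    return math.exp(-(k / k_D) ** 2)

for ratio in (0.1, 0.5, 1.0, 2.0):
    print("k/k_D =", ratio, " damping =", damping_factor(ratio, 1.0))
```

Modes with k well below k_D are essentially untouched, while modes with k of a
few times k_D are suppressed by orders of magnitude.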
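To find the CMB temperature anisotropies today, the multipoles $\ell \geq 2$ of
$\Theta_{k\ell}$ have to be found. One can get these by solving the Boltzmann
equation (1.67), ignoring the quadrupole, which disappears in the tight coupling
limit, and making a multipole expansion of the solution. The result is
$$\Theta_{k\ell}(\eta_0) \approx (\Theta_{k0} + \Psi)(\eta_{rec})\,(2\ell+1)\,j_\ell(k\Delta\eta_{rec})$$
$$\qquad + \Theta_{k1}(\eta_{rec})\,[\ell j_{\ell-1}(k\Delta\eta_{rec}) - (\ell+1) j_{\ell+1}(k\Delta\eta_{rec})]$$
$$\qquad + (2\ell+1)\int_{\eta_{rec}}^{\eta_0} (\dot\Psi - \dot\Phi)\, j_\ell(k\Delta\eta)\, d\eta, \tag{1.75}$$
where $j_\ell(x)$ is the spherical Bessel function, $\Delta\eta = \eta_0 - \eta$ and
$\Delta\eta_{rec} = \eta_0 - \eta_{rec}$. The Silk damping and the damping due to the
finite width of the last scattering surface are included by
$$(\Theta_{k0} + \Psi)(\eta_{rec}) = (\hat\Theta_{k0} + \Psi)(\eta_{rec})\, D(k),$$
$$\Theta_{k1}(\eta_{rec}) = \hat\Theta_{k1}(\eta_{rec})\, D(k), \tag{1.76}$$
where
$$D(k) = \int_0^{\eta_0} \dot\tau(\eta)\, e^{-\tau(\eta,\eta_0)}\, e^{-(k/k_D(\eta))^2}\, d\eta. \tag{1.77}$$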
One can see from equation (1.75) that the anisotropies in the CMB have
three main contributions. The first term in the equation is called the ordinary
Sachs-Wolfe effect (Sachs and Wolfe 1967). This is the sum of the monopole
temperature differences on the last scattering surface and the gravitational
potential $\Psi$. The gravitational potential accounts for the redshift of the
photons as they climbed out of the potential wells of the density
inhomogeneities at the last scattering surface. The second term in equation
(1.75) is due to a Doppler shift. Because of the acoustic oscillations before
and at recombination, the baryons were moving. When the photons were last
scattered by these moving electrons, they experienced a Doppler shift. Finally,
the last main contribution to the CMB anisotropy is the integrated Sachs-Wolfe
effect. This effect arose after the photons last scattered, and is due to a
possible time variation of the gravitational potential after the recombination
epoch. This term becomes important if matter-radiation equality occurred close
to recombination, or if the universe has become $\Lambda$-dominated after
recombination.
Clearly, these three terms depend on the cosmological parameters. If the
amounts of dark matter and baryons are changed, the size of the density
perturbations and the gravitational potentials are also changed. This results in
changes of the CMB anisotropies. The expansion rate of the universe, measured
by the Hubble constant, is of course also important in these calculations, as is
the power spectrum of the initial perturbations produced by (possibly) inflation.
In the next section, I will define the CMB temperature power spectrum of
fluctuations and outline how it depends on some of these parameters. Equation
(1.75) shows how important the CMB power spectrum can be for measuring the
cosmological parameters.
1.4.2 The CMB Power Spectrum
I start by defining the $a_{\ell m}$, which are the coefficients of a spherical
harmonic expansion of the temperature fluctuations $\Delta T/T(\theta, \phi)$ on the sky,
$$a_{\ell m} = \int_S d\hat n\, T(\hat n)\, Y^*_{\ell m}(\hat n), \tag{1.78}$$
where $T(\hat n)$ is the fluctuation $\Delta T/T$ at the position $\hat n$ and
$Y_{\ell m}(\hat n)$ are the spherical harmonic functions. The integration is
performed over the full sky, denoted by S. If the initial density perturbations
in the universe are Gaussian, as predicted by most inflationary theories, so are
the $a_{\ell m}$. The angular power spectrum of the CMB is defined as
$$C_\ell = \frac{\sum_{m=-\ell}^{\ell} a_{\ell m} a^*_{\ell m}}{2\ell + 1}, \tag{1.79}$$
where $\ell$ is the multipole. Small $\ell$-values represent large angular scales and
high $\ell$-values represent small angular scales. The relation between an
$\ell$-value and an angular scale on the sky can be written as
$\theta \approx 180^\circ/\ell$.
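
The rule of thumb relating multipole and angular scale is easy to turn into a
pair of helper functions. This is a minimal sketch; the function names are
hypothetical, and the relation is only the approximate one quoted in the text,
not an exact correspondence.

```python
def multipole_to_degrees(ell):
    """Approximate angular scale in degrees for multipole ell, theta ~ 180/ell."""
    return 180.0 / ell

def degrees_to_multipole(theta_deg):
    """Approximate multipole probing an angular scale of theta_deg degrees."""
    return 180.0 / theta_deg

print("ell = 200 corresponds to about", multipole_to_degrees(200), "degrees")
print("7 degrees corresponds to about ell =", degrees_to_multipole(7.0))
```

With this rule a multipole of 200 corresponds to scales just below one degree,
and a resolution of 7 degrees corresponds to multipoles of a few tens.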
If the $a_{\ell m}$ are Gaussian distributed, the power spectrum $C_\ell$ contains
all the information contained in the CMB. Assuming isotropy of the universe, one
has $\langle T(\hat n)T(\hat n')\rangle = C(\theta)$ where
$\cos\theta = \hat n \cdot \hat n'$. This gives that the $a_{\ell m}$ are uncorrelated,
$\langle a_{\ell m} a^*_{\ell' m'}\rangle = \delta_{\ell\ell'}\delta_{mm'}\langle C_\ell\rangle$.
One sees from equation (1.79) that for a given coefficient $C_\ell$, there are
$2\ell + 1$ independent statistical samples of the average value
$\langle C_\ell\rangle$. This means that for the higher multipoles the statistics
are better, and $C_\ell$ for a given sky will be closer to the ensemble average
$\langle C_\ell\rangle$. The smaller multipoles have poorer statistics and
therefore suffer from the problem of cosmic variance.
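
The effect of having only 2l+1 samples per multipole can be illustrated with a
small Monte Carlo experiment. The sketch below is a toy model under stated
assumptions: each multipole is represented by 2l+1 independent real Gaussian
numbers with variance C_l (the reality condition on the a_lm of a real sky is
ignored, only the number of degrees of freedom matters here), and all function
names are hypothetical. For Gaussian fields the relative scatter of the
estimate is expected to be sqrt(2/(2l+1)), a standard result that is not
derived in the text.

```python
import math
import random

def estimate_cl(alm):
    """Equation (1.79): average of |a_lm|^2 over the 2l+1 values of m."""
    return sum(abs(a) ** 2 for a in alm) / len(alm)

def draw_alm(ell, cl, rng):
    """Toy sky: 2l+1 independent real Gaussian coefficients with variance cl."""
    sigma = math.sqrt(cl)
    return [rng.gauss(0.0, sigma) for _ in range(2 * ell + 1)]

def relative_scatter(ell, n_skies=2000, seed=1):
    """Standard deviation of the estimated C_l over many toy skies, divided by the mean."""
    rng = random.Random(seed)
    estimates = [estimate_cl(draw_alm(ell, 1.0, rng)) for _ in range(n_skies)]
    mean = sum(estimates) / n_skies
    var = sum((c - mean) ** 2 for c in estimates) / n_skies
    return math.sqrt(var) / mean

for ell in (2, 10, 50):
    print(ell, relative_scatter(ell), math.sqrt(2.0 / (2 * ell + 1)))
```

The simulated scatter falls with increasing multipole, which is the statement
that the low multipoles are the ones limited by cosmic variance.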
The angular power spectrum can be determined from the power spectrum in
k-space derived in the last section, by
$$C_\ell = \frac{2V}{\pi(2\ell + 1)^2}\int k^2\, |\Theta_{k\ell}(\eta_0)|^2\, dk, \tag{1.80}$$
where V is an arbitrary normalisation constant.
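The Sachs-Wolfe effect is the dominant effect on large scales. I will now
estimate how the power spectrum $C_\ell$ depends on $\ell$ from the Sachs-Wolfe
effect. At recombination, one can approximately set
$$\Theta_{k0}(\eta_{rec}) \approx -\frac{2}{3}\Psi(\eta_{rec}) \approx -\frac{2}{3}\Psi(\eta_0). \tag{1.81}$$
Here it is assumed that the pressure is zero at recombination and that the
potential has not changed between the last scattering surface and today. So,
looking only at the Sachs-Wolfe effect, one gets from equation (1.75)
$$\Theta_{k\ell}(\eta_0) \propto \Psi(\eta_{rec})\,(2\ell + 1)\,j_\ell(kx), \tag{1.82}$$
where $x = \Delta\eta_{rec}$. Using equation (1.80), one finds
$$C_\ell \propto \int k^2\, |\Psi(\eta_0)|^2\, j_\ell^2(kx)\, dk. \tag{1.83}$$
Now using equation (1.64), one sees that $\Psi(\eta_0) \propto \delta_k k^{-2}$, so
that for a power spectrum $|\delta_k|^2 \propto k^n$
$$C_\ell \propto \int k^{n-2}\, j_\ell^2(kx)\, dk \propto \frac{\Gamma((2\ell + n - 1)/2)}{\Gamma((2\ell + 5 - n)/2)}. \tag{1.84}$$
In the limit of a Harrison-Zeldovich spectrum, $n = 1$, one has
$$C_\ell \propto \frac{1}{\ell(\ell + 1)}. \tag{1.85}$$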
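
The step from the Gamma-function ratio in (1.84) to the scale-invariant result
(1.85) can be checked numerically. The sketch below assumes the ratio has the
arguments (2l+n-1)/2 in the numerator and (2l+5-n)/2 in the denominator, which
is the form that reduces to 1/(l(l+1)) for n = 1; the function name is
hypothetical and the l-independent prefactor is dropped.

```python
import math

def sachs_wolfe_shape(ell, n):
    """Gamma-function ratio of equation (1.84), up to an ell-independent constant."""
    log_ratio = math.lgamma((2 * ell + n - 1) / 2) - math.lgamma((2 * ell + 5 - n) / 2)
    return math.exp(log_ratio)

# For n = 1 the product with ell*(ell+1) should be constant (equal to one here).
for ell in (2, 10, 100):
    print(ell, sachs_wolfe_shape(ell, 1.0) * ell * (ell + 1))

# For a slightly tilted spectrum the product is no longer flat in ell.
for ell in (2, 10, 100):
    print(ell, sachs_wolfe_shape(ell, 0.975) * ell * (ell + 1))
```

For n = 1 the ratio is Gamma(l)/Gamma(l+2), which equals 1/(l(l+1)) exactly,
so the first loop prints values equal to one up to rounding.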
Because of the strong $\ell$-dependence from the Sachs-Wolfe effect, it is normal
to plot $C_\ell\,\ell(\ell + 1)$ instead of $C_\ell$. In figure (1.3) I have plotted
the power spectrum of a model with total density $\Omega_t = 1$, cosmological
constant $\Omega_\Lambda = 0.7$, Hubble constant $H_0 = 100\,h$ km/s/Mpc with
$h = 0.82$, baryon density $\Omega_b h^2 = 0.03$ and power law index of the scalar
power spectrum $n = 0.975$. These were the parameters found from the joint
Maxima-Boomerang-COBE analysis (Jaffe et al. 2001).
Figure 1.3: The angular power spectrum with $\Omega_t = 1$, $\Omega_\Lambda = 0.7$,
$\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$.
Studying figure (1.3), the first apparent features are the peaks and troughs.
As mentioned in the previous section, these are the result of the acoustic
oscillations which were taking place in the photon-baryon plasma before
recombination. The peaks result from the k modes of the monopole
$\Theta_{k0}(\eta_{rec})$ which were at their minimum/maximum at recombination. The
troughs occur where the monopole was zero. These are filled in by the dipole,
which was oscillating $\pi/2$ out of phase and for this reason had its
minima/maxima in the troughs. The dipole, however, had a lower amplitude.
Assuming adiabatic fluctuations, equation (1.72) tells us that one can expect
the first peak at $k = \pi/r_s$, which is the scale of the sound horizon at last
scattering. In a flat universe, the sound horizon at the last scattering surface
is expected to have an angular size of about one degree on the sky. This means
that for flat universes with adiabatic fluctuations one should expect to see the
first peak at a position of $\ell \approx 200$. This fits very well with the
current data. When the universe has less matter, the universe is open and the
geometry of the universe is hyperbolic. In this case the same physical scales at
recombination correspond to a smaller angle on the CMB sky than in the flat
universe. So for an open universe, the first peak was expected to be found at
higher $\ell$ values. Also in the case of isocurvature perturbations, the peak
should have been at higher $\ell$ since in this case the wave number for the
first peak should be $k = \pi/(2r_s)$ corresponding to larger scales. One already
sees here how the CMB power spectrum can be used to measure values and
properties of the universe.
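
The wave numbers at which the monopole in equation (1.72) is extremal follow
directly from the cos and sin terms: the cos term is extremal at
k r_s = m pi and the sin term at k r_s = (m - 1/2) pi. The sketch below lists
the first few of these; the function name and the choice r_s = 1 are
assumptions for illustration, and no attempt is made to convert wave numbers
into multipoles, which would require the geometry of the universe.

```python
import math

def peak_wavenumbers(r_s, n_peaks=3, mode="adiabatic"):
    """Wave numbers where cos(k r_s) (adiabatic) or sin(k r_s) (isocurvature) is extremal."""
    if mode == "adiabatic":
        return [m * math.pi / r_s for m in range(1, n_peaks + 1)]
    if mode == "isocurvature":
        return [(m - 0.5) * math.pi / r_s for m in range(1, n_peaks + 1)]
    raise ValueError("mode must be 'adiabatic' or 'isocurvature'")

print("adiabatic   :", peak_wavenumbers(1.0, mode="adiabatic"))
print("isocurvature:", peak_wavenumbers(1.0, mode="isocurvature"))
```

The adiabatic series has the ratios 1:2:3 and the isocurvature series 1:3:5, so
the spacing of the peaks carries information about the type of perturbation in
addition to the position of the first peak.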
Figure 1.4: The solid line shows the angular power spectrum with $\Omega_t = 1.0$,
$\Omega_\Lambda = 0.7$, $\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$. The other
curves are the power spectra for lower total density $\Omega = \Omega_t$. The
ratios between $\Omega_b$, $\Omega_c$ and $\Omega_\Lambda$ are kept constant.
In figure (1.4) one can see how the peaks move to smaller angular scales (higher
$\ell$) as the total density decreases. I have chosen the same model as in figure
(1.3), but varied the total density from 0.3 to 1.0 while keeping the ratios of
baryon density, dark matter density and vacuum energy density constant.
Another effect is the change in the heights of the peaks as the baryon fraction
is changed. As described above, the peaks of the power spectrum come from the
maxima and minima of the acoustic oscillations. The oscillations are driven
by the gravitational force contracting the density perturbations and the photon
pressure resisting this contraction. Every second peak, starting with the first
one, comes from the contraction phase. The stronger the contraction, the higher
these peaks should be. The other peaks come from the rarefaction phase.
Figure 1.5: The solid line shows the angular power spectrum with $\Omega_t = 1.0$,
$\Omega_\Lambda = 0.7$, $\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$. The other
curves are the power spectra for different baryon densities $\Omega_b$. The sum of
baryons and dark matter is kept constant, $\Omega_b + \Omega_c = 0.3$.
When the baryon content of the universe is higher, the mean free path of
the photons and the photon pressure decrease. This means that the contractions
will have a higher amplitude and the rarefactions a smaller amplitude. In other
words, raising the baryon fraction should increase the height of the
odd-numbered peaks in the angular power spectrum. Similarly, the even-numbered
peaks should get lower. This is clearly seen in figure (1.5). In the figure I
have again used the model in figure (1.3), but this time varied the ratio of
baryon density to dark matter density, keeping the total density
$\Omega_b + \Omega_c = 0.3$ and $\Omega_\Lambda = 0.7$.
From this discussion it should be apparent that the angular power spectrum
of the CMB is an important source of information about the cosmological
parameters. By estimating the power spectrum from measurements of the CMB one
should be able to measure the cosmological parameters with very high precision.
Some other cosmological parameters that one can measure from the power spectrum
include the Hubble parameter $h$, the dark matter content $\Omega_c$, the power
law indices of the power spectra of scalar and tensor perturbations $n_s$ and
$n_t$ (tensor perturbations will be discussed in the next section), the redshift
of reionization $z_{reion}$, the number of neutrino degrees of freedom at
decoupling $N_\nu$, the normalisation of the matter power spectrum $Q$ and the
ratio of tensor to scalar perturbations $Q_T^2/Q_S^2$ (Jungman, Kamionkowski,
Kosowsky, and Spergel 1996).
1.4.3 Reionization and Secondary Anisotropies
After recombination, the universe was reionized at some redshift $z_{reion}$,
which is still unknown. For this reason CMB photons might have been scattered by
electrons after recombination. This scattering might have influenced the CMB
power spectrum. If reionization happened early ($z_{reion} > 30$), this
rescattering might have erased the peaks in the CMB power spectrum and thereby
also important information about the universe. As discussed above, the present
observations have revealed a peak structure in the power spectrum, indicating
that reionization happened so late that the scattering after recombination did
not have a significant influence on the CMB on the scales where most of the
cosmological information is stored. Such effects could however be the
dominant contribution to the power spectrum on smaller scales ($\ell > 1000$).
Here I list some effects that reionization might have on the power spectrum:
1. Vishniac effect: During the epoch of galaxy formation, there were huge
moving gas masses by which the CMB photons were scattered. The Doppler
shift from the scattering by these moving electrons should have influenced
the angular power spectrum on smaller scales (Ostriker and Vishniac 1986;
Vishniac 1987; Persi, Spergel, Cen, and Ostriker 1995). The strength of
this effect and the scales at which it is important depend on the
reionization history of the universe.
2. Thermal Sunyaev-Zeldovich effect: When CMB photons passed through hot
electron gas in clusters of galaxies and superclusters, their frequencies were
changed due to Compton scattering (Zeldovich and Sunyaev 1969). The hot
electrons in the clusters transferred energy to the photons, resulting in an
upscattering of the photon frequency. For this reason the clusters produce
a temperature decrement at lower frequencies and an increment at higher
frequencies. This effect has been observed and has been used to measure
the Hubble constant (see e.g. (Myers, Baker, Readhead, and Leitch 1997)).
The SZ effect is only expected to give significant contributions to the CMB
power spectrum on smaller scales (Persi, Spergel, Cen, and Ostriker 1995).
3. Kinetic Sunyaev-Zeldovich effect: Because of the peculiar velocities of the
clusters of galaxies, there will be an additional Doppler shift of the CMB
photons scattered by the electron gas in clusters. This is called the kinetic
(or kinematic) SZ effect and is about an order of magnitude smaller than the
thermal effect. The two effects can be distinguished because of their different
spectral signatures. The kinetic effect can in principle be used to measure the
peculiar velocities of clusters (Sunyaev and Zeldovich 1980).
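
The change from a decrement to an increment in the thermal SZ effect can be
located numerically. The sketch below assumes the standard non-relativistic
spectral dependence of the thermal SZ temperature change, x coth(x/2) - 4 with
x = h nu / (k T), and a present-day CMB temperature of 2.725 K; neither the
formula nor the temperature value is given in the text, and the function names
are hypothetical. The zero is found by simple bisection.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant in J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K
T_CMB = 2.725                # assumed present-day CMB temperature in K

def tsz_spectral_shape(x):
    """Assumed thermal SZ spectral function x*coth(x/2) - 4, with x = h nu / (k T_CMB)."""
    return x * (math.exp(x) + 1.0) / (math.exp(x) - 1.0) - 4.0

def tsz_null_frequency_ghz(lo=1.0, hi=10.0, tol=1e-12):
    """Bisection for the zero of the spectral function, returned as a frequency in GHz."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tsz_spectral_shape(mid) < 0.0:
            lo = mid   # still on the decrement side
        else:
            hi = mid   # already on the increment side
    x0 = 0.5 * (lo + hi)
    return x0 * K_BOLTZMANN * T_CMB / H_PLANCK / 1e9

print("thermal SZ null at about", tsz_null_frequency_ghz(), "GHz")
```

Under these assumptions the null lies close to 217 GHz: below it the spectral
function is negative (decrement) and above it positive (increment). An
observation at the null frequency is insensitive to the thermal effect, which
is what makes a separation of the kinetic effect possible in principle.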
There are also other small scale effects caused by gravitation. One of them
is associated with the integrated Sachs-Wolfe effect for the time varying
gravitational potentials from nonlinear evolution of density perturbations. The
movement of the gas in the density fluctuations in the epoch of nonlinear growth
made the gravitational potential time varying, thereby causing a gravitational
frequency shift of the CMB photons passing through the perturbation (see
equation (1.75)). This is called the Rees-Sciama effect (Rees and Sciama 1968).
It is of the same order of magnitude as the primordial perturbations and
contributes to the power spectrum on very small scales (typically $\ell$-values
of a few thousand).
Finally, gravitational lensing of the CMB by foreground galaxies and galaxy
clusters might also have influenced the power spectrum. This effect tends to
smooth out the peak structure in the power spectrum (Seljak 1996).
1.4.4 Polarisation of the CMB and Tensor Perturbations
Some of the cosmological parameters have similar effects on the power spectrum.
For this reason several combinations of cosmological parameters may give the
same power spectrum. This degeneracy of parameters may be broken by looking
at the polarisation of the CMB. Since Thomson scattering produces linear photon
polarisation, the CMB photons are expected to be linearly polarised.
Polarisation of photons is usually characterized by the Stokes parameters I,
Q, U and V. Here I is the total intensity, Q and U describe linear polarisation
and V circular polarisation. Since Thomson scattering is unable to produce
circular polarisation, V is expected to be zero in the CMB and will for this
reason not be treated further here.
I will now define the Stokes parameters following (Chandrasekhar 1960). The
Q and U Stokes parameters can be defined by the components of the electric
vector of electromagnetic radiation decomposed in two directions x and y which
are mutually perpendicular and also perpendicular to the direction of
propagation of the radiation. For a plane electromagnetic wave, the components
can be written as
$$E_x = E_{x0}\cos(\omega t - \phi_x), \tag{1.86}$$
$$E_y = E_{y0}\cos(\omega t - \phi_y), \tag{1.87}$$
where $E_{x0}$ and $E_{y0}$ are the wave amplitudes and $\phi_x$ and $\phi_y$ are
phase factors. One can then define the intensity and the Stokes parameters as
$$I = E_{x0}^2 + E_{y0}^2, \tag{1.88}$$
$$Q = E_{x0}^2 - E_{y0}^2, \tag{1.89}$$
$$U = 2E_{x0}E_{y0}\cos(\phi_x - \phi_y). \tag{1.90}$$
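
The definitions (1.88) to (1.90) can be written as a short function. This is a
sketch under stated assumptions: U is taken with the conventional factor of 2,
so that a fully polarised wave satisfies I^2 = Q^2 + U^2 + V^2, and the
expression used for V (which the text does not define, and whose sign
convention varies between authors) is added only so that this identity can be
checked. The function name is hypothetical.

```python
import math

def stokes_parameters(ex0, ey0, phi_x, phi_y):
    """Stokes parameters of a plane wave with amplitudes ex0, ey0 and phases phi_x, phi_y."""
    i = ex0 ** 2 + ey0 ** 2
    q = ex0 ** 2 - ey0 ** 2
    u = 2.0 * ex0 * ey0 * math.cos(phi_x - phi_y)
    v = 2.0 * ex0 * ey0 * math.sin(phi_x - phi_y)  # assumed convention, not from the text
    return i, q, u, v

# Equal amplitudes and equal phases: linear polarisation at 45 degrees to the x axis.
print(stokes_parameters(1.0, 1.0, 0.0, 0.0))
# Wave oscillating along x only: all polarisation is in Q.
print(stokes_parameters(1.0, 0.0, 0.0, 0.0))
```

A wave along x gives Q = I and U = 0, while a wave at 45 degrees gives Q = 0 and
U = I, which shows that Q and U describe the same linear polarisation measured
in coordinate systems rotated by 45 degrees relative to each other.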
As with the CMB temperature, the CMB polarisation will also have fluctuations
which arose from the acoustic oscillations at the last scattering surface. It is
practical to make a spherical harmonic expansion of the CMB polarisation in the
same way as for the CMB temperature and describe the fluctuations in terms of
polarisation power spectra. There are two approaches to this (Zaldarriaga and
Seljak 1997; Kamionkowski, Kosowsky, and Stebbins 1997). In this thesis I will
follow (Zaldarriaga and Seljak 1997).
As for the temperature anisotropies, the fluctuations in Q and U can be
expanded in spherical harmonic coefficients $a_{\ell m}$. For polarisation one has
to use tensor harmonics ${}_2Y_{\ell m}(\hat n)$ and ${}_{-2}Y_{\ell m}(\hat n)$
instead of normal scalar harmonics $Y_{\ell m}(\hat n)$. These tensor harmonics
are described in appendix B. The coefficients can be defined as (from now on
$Q(\hat n)$ and $U(\hat n)$ are taken to be $\Delta Q/Q$ and $\Delta U/U$ to
simplify notation)
$$(Q + iU)(\hat n) = \sum_{\ell m} a_{2,\ell m}\, {}_2Y_{\ell m}(\hat n), \tag{1.91}$$
$$(Q - iU)(\hat n) = \sum_{\ell m} a_{-2,\ell m}\, {}_{-2}Y_{\ell m}(\hat n). \tag{1.92}$$
The inverse relations are
$$a_{2,\ell m} = \int d\hat n\, {}_2Y^*_{\ell m}(\hat n)\,(Q + iU)(\hat n), \tag{1.93}$$
$$a_{-2,\ell m} = \int d\hat n\, {}_{-2}Y^*_{\ell m}(\hat n)\,(Q - iU)(\hat n). \tag{1.94}$$
These spherical harmonic coefficients can be redefined in terms of their linear
combinations
$$a_{E,\ell m} = -(a_{2,\ell m} + a_{-2,\ell m})/2, \tag{1.95}$$
$$a_{B,\ell m} = i(a_{2,\ell m} - a_{-2,\ell m})/2. \tag{1.96}$$
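The indices E and B only reflect the fact that these two coefficients have some
properties similar to the electric and magnetic fields. Using these
coefficients, one can finally define the three power spectra of polarisation
$$C_{E\ell} = \frac{1}{2\ell + 1}\sum_m \langle a^*_{E,\ell m} a_{E,\ell m}\rangle, \tag{1.97}$$
$$C_{B\ell} = \frac{1}{2\ell + 1}\sum_m \langle a^*_{B,\ell m} a_{B,\ell m}\rangle, \tag{1.98}$$
$$C_{C\ell} = \frac{1}{2\ell + 1}\sum_m \langle a^*_{T,\ell m} a_{E,\ell m}\rangle, \tag{1.99}$$
where the last definition is the temperature-polarisation cross power spectrum.
From statistical isotropy, one has in analogy to the temperature case
$$\langle a^*_{E,\ell' m'} a_{E,\ell m}\rangle = C_{E\ell}\,\delta_{\ell\ell'}\delta_{mm'}, \tag{1.100}$$
$$\langle a^*_{B,\ell' m'} a_{B,\ell m}\rangle = C_{B\ell}\,\delta_{\ell\ell'}\delta_{mm'}, \tag{1.101}$$
$$\langle a^*_{T,\ell' m'} a_{E,\ell m}\rangle = C_{C\ell}\,\delta_{\ell\ell'}\delta_{mm'}, \tag{1.102}$$
$$\langle a^*_{B,\ell' m'} a_{E,\ell m}\rangle = \langle a^*_{B,\ell' m'} a_{T,\ell m}\rangle = 0. \tag{1.103}$$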
Figure 1.6: The E power spectrum with $\Omega_t = 1$, $\Omega_\Lambda = 0.7$,
$\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$.
As shown in (Zaldarriaga and Seljak 1997), the polarisation power spectra for
a given model of the universe can be found along the same lines as described
in sections (1.4.1) and (1.4.2). In addition to the Boltzmann equation for the
photon intensity (1.67), one also has to use the Boltzmann equation describing
the evolution of photon polarisation in the presence of Thomson scattering.
These power spectra again depend on the cosmological parameters and give an
independent measure of them. The polarisation power spectra depend on the
cosmological parameters in a different way than the temperature power spectrum.
In this way, the polarisation of the CMB can be used to break degeneracies
between parameters. In figures (1.6, 1.7) I have plotted the E and C power
spectra for the same model as in figure (1.3).
Figure 1.7: The C power spectrum with $\Omega_t = 1$, $\Omega_\Lambda = 0.7$,
$\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$.
The discussion so far has been concerned with scalar perturbations of the
metric. There can also be contributions from tensor perturbations. I will now
briefly sketch the difference between the two and what the effect of the tensor
perturbations can be on the CMB power spectra. In this I will follow (Ma and
Bertschinger 1995). For a metric tensor $g_{\mu\nu}$ where $\mu, \nu = 0, 1, 2, 3$,
the components $g_{00}$ and $g_{0i}$ where $i = 1, 2, 3$ are by definition
unperturbed in the synchronous gauge. In this case, the perturbed line element
can be written
$$ds^2 = a^2(\eta)\left(-d\eta^2 + (\delta_{ij} + h_{ij})\,dx^i dx^j\right), \tag{1.104}$$
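where $\eta$ is conformal time. There is an implicit sum over all equal indices.
The metric perturbation $h_{ij}$ can be written as
$$h_{ij} = h\delta_{ij}/3 + h^{\parallel}_{ij} + h^{\perp}_{ij} + h^T_{ij}. \tag{1.105}$$
Here the first term is the trace part. The three last terms, which together
make up the traceless part, have the following properties
$$\epsilon_{ijk}\,\partial_j\partial_l h^{\parallel}_{lk} = 0, \tag{1.106}$$
$$\partial_i\partial_j h^{\perp}_{ij} = 0, \tag{1.107}$$
$$\partial_i h^T_{ij} = 0, \tag{1.108}$$
where $\epsilon_{ijk}$ is the totally antisymmetric Levi-Civita symbol. Obviously,
$h^{\parallel}_{ij}$ can be written in terms of a scalar field $\mu$, and
$h^{\perp}_{ij}$ can be written in terms of a divergenceless vector field A,
$$h^{\parallel}_{ij} = \left(\partial_i\partial_j - \tfrac{1}{3}\delta_{ij}\nabla^2\right)\mu, \tag{1.109}$$
$$h^{\perp}_{ij} = \partial_i A_j + \partial_j A_i, \qquad \partial_i A_i = 0. \tag{1.110}$$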
The $h$ and $\mu$ parts make up the scalar metric perturbation. The
$h^{\perp}_{ij}$ is the vector perturbation and $h^T_{ij}$ is the tensor
perturbation. The scalar perturbations have already been treated above, and the
vector perturbations are not expected to make any significant contribution to
the CMB. The tensor perturbations (gravity waves), however, are predicted by
most inflationary theories.
Following (Turner, White, and Lidsey 1993), $h^T_{ij}$ can be Fourier expanded as
$$h^T_{ij}(\mathbf{x}, \eta) = (2\pi)^{-3}\int d^3k\, h^s(\mathbf{k}, \eta)\, p^s_{ij}\, e^{i\mathbf{k}\cdot\mathbf{x}}, \tag{1.111}$$
where $p^s_{ij}$ are polarisation tensors ($s = 1, 2$) which satisfy
$p^s_{ij}k^j = 0$ and $p^s_{ij}\delta^{ij} = 0$. The evolution of these gravity
wave perturbations can be described by the massless Klein-Gordon equation
$$\ddot h^s(\mathbf{k}, \eta) + 2\frac{\dot a(\eta)}{a(\eta)}\dot h^s(\mathbf{k}, \eta) + k^2 h^s(\mathbf{k}, \eta) = 0. \tag{1.112}$$
The solutions of this equation for modes that are well outside the horizon say
that $h^s(\mathbf{k}, \eta)$ is constant. For modes which are well inside the
horizon, one gets $h^s(\mathbf{k}, \eta) \propto \cos(k\eta)/a$.
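Figure 1.8: The CMB power spectra with $\Omega_t = 1$, $\Omega_\Lambda = 0.7$,
$\Omega_b h^2 = 0.03$, $h = 0.82$, $n = 0.975$, $n_T = 1 - n$, $Q_T/Q_S = 7n_T$.
The solid line is $C_\ell$, the dashed line is $C_{E\ell}$, the dotted line is
$C_{B\ell}$ and the dashed-dotted line is $|C_{C\ell}|$.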
If the initial power spectrum of scalar fluctuations goes as $P(k) = Q_S k^n$ and
that of tensor fluctuations as $P(k) = Q_T k^{n_T}$, then most inflationary
theories predict $n_T \approx 1 - n$ and $Q_T/Q_S \approx 7n_T$. Using this, I have
plotted the different CMB power spectra including B polarisation in figure
(1.8). The model is the same as in the previous figures, but now contributions
from tensor perturbations are also included. Clearly, for a completely scale
invariant $n = 1$ model, there will be no contributions from tensor modes. Most
theories of inflation, however, do predict a small deviation from a scale
invariant power spectrum. This suggests that $n_T$ and thereby $Q_T$ are also
very small. For this reason the $C_{B\ell}$ power spectrum, which only has
contributions from the tensor perturbations, is orders of magnitude smaller
than the other spectra.
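
For the model used in the figures, the two relations quoted above fix the
tensor parameters once the scalar index is given. The sketch below simply
evaluates them; it assumes the relations in the form n_T = 1 - n and
Q_T/Q_S = 7 n_T, and the function name is hypothetical.

```python
def tensor_parameters(n_scalar):
    """Tensor index and tensor-to-scalar amplitude ratio from the assumed relations."""
    n_tensor = 1.0 - n_scalar          # n_T = 1 - n
    amplitude_ratio = 7.0 * n_tensor   # Q_T / Q_S = 7 n_T
    return n_tensor, amplitude_ratio

n_t, ratio = tensor_parameters(0.975)
print("n_T =", n_t, " Q_T/Q_S =", ratio)
print("scale invariant case:", tensor_parameters(1.0))
```

For n = 0.975 this gives a tensor amplitude of less than a fifth of the scalar
one, and for n = 1 the tensor contribution vanishes, in line with the remark
about the scale invariant model.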
Chapter 2
The Analysis of CMB Data
The previous chapter discussed the physical theories behind the fluctuations
in the CMB and how the observation of the CMB power spectrum can test these
theories as well as estimate their parameters. This chapter is a review of the
previous, current and future experiments intended to measure the CMB
fluctuations and of how information can be extracted from these observations.
Section (2.1) is an overview of some of the biggest experiments. In sections
(2.1.1), (2.1.2), (2.1.3) and (2.1.4) I will discuss the COBE satellite, the
BOOMERANG balloon-borne experiment, the MAP satellite and the Planck satellite
respectively. I will briefly mention the scan strategy and some of the
parameters of the instruments. Other experiments will briefly be mentioned in
section (2.1.5).
The problem of making CMB sky maps from a time stream of data will be
discussed in section (2.2.1). The CMB maps from observations contain, in
addition to the CMB, other sources of radiation which radiate at the same
wavelengths as the CMB. These foregrounds and how they can be removed from CMB
maps are discussed in section (2.2.2). Different methods of extracting the CMB
power spectra from CMB maps with foregrounds removed will be reviewed in section
(2.3).
2.1 CMB Experiments, the Past, Present and
Future
This section is a short review of some of the CMB experiments. I will start with
the experiment which first measured the CMB temperature fluctuations. Then I
will discuss a balloon-borne experiment which has already been conducted but
will fly again in an improved version. Then two satellite experiments are
described; one is already collecting data as this is written and the other will
be launched in a few years.
2.1.1 COBE
COBE was the first experiment to measure the fluctuations in the CMB. The
COBE satellite was launched in November 1989 into a circular orbit around the
earth at an altitude of 900 km with a $99^\circ$ inclination. The spacecraft was
always pointing away from the earth and in a direction almost perpendicular to
the direction of the sun. The orbital period was 103 minutes, giving
about 14 full revolutions around the earth a day. COBE scanned the entire
sky once every 6 months, and it operated for a total of 4 years.
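
The numbers in this paragraph can be checked with a few lines of arithmetic.
The sketch below uses only the values quoted above (a 103 minute orbital
period, one full sky scan every 6 months, 4 years of operation); the variable
and function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60

def revolutions_per_day(period_minutes):
    """Number of orbits completed in one day for a given orbital period."""
    return MINUTES_PER_DAY / period_minutes

orbits = revolutions_per_day(103.0)
full_sky_scans = 4 * 12 / 6   # 4 years of operation, one full sky scan per 6 months

print("orbits per day:", orbits)
print("full sky scans in 4 years:", full_sky_scans)
```

The orbital period gives just under 14 revolutions per day, and the four years
of operation correspond to 8 complete scans of the sky.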
COBE was equipped with three instruments: FIRAS, DMR and DIRBE. FIRAS
measured the absolute temperature of the CMB, whereas DMR measured the
fluctuations. The DIRBE instrument measured the cosmic infrared background
radiation. The FIRAS instrument was pointing along the spin axis of the
spacecraft. The two other instruments were both offset by $30^\circ$ from the spin
axis, around which the spacecraft was spinning at a rate of 0.8 revolutions per
minute.
The DMR observed at the frequencies 31.5, 53 and 90 GHz with a $7^\circ$ FWHM
beam. After 4 years of observation, maps with 6144 pixels were made for all 3
channels (Bennett et al. 1996). By cutting out the contaminated galactic plane
(a $40^\circ$ belt around the equator of the map in galactic coordinates), the
power spectrum of the CMB up to $\ell = 30$ could be estimated. This was done
for the 4 year data in (Górski et al. 1996). The power spectrum was shown to
be consistent with an $n_s = 1$ Harrison-Zeldovich spectrum.
2.1.2 BOOMERANG
The BOOMERANG experiment is a cooperation between the universities in
Rome and institutes in America and Great Britain. This is a balloon-borne
experiment which had its test flight in Texas in August 1997, lasting for 6
hours. Data were collected from a 200 square degree patch of the sky at two
frequencies, 90 GHz and 150 GHz. From the data, the power spectrum was
estimated, the first peak was found and $\Omega_t$ was estimated (Mauskopf et al.
2000; Melchiorri et al. 2000).
The main flight of BOOMERANG was conducted in Antarctica at the end
of 1998 (Bernardis et al. 2000). The balloon was launched from the McMurdo
Station on December 29th, and returned after 259 hours at an altitude of about
38 km. The sky temperature was measured by comparing the temperature of
the incoming photons with an onboard thermal reference load. The gondola was
equipped with a 1.2 m mirror and a cryogenic mm-wave bolometric receiver. Data
were collected from an area with minimal contamination from galactic dust (see
section (2.2.2)). This time, the frequencies 240 GHz and 400 GHz were also
observed. A rectangular area of about $100^\circ$ by $30^\circ$ was scanned and
maps with $14'$ pixels have been made.
The different frequency channels were used to estimate the foregrounds. The
main foreground contamination in the BOOMERANG maps comes from thermal
emission from interstellar dust grains. The 410 GHz channel, which is dust
dominated, was used to measure the amount of dust contaminating the data, which
was found to be negligible (Netterfield et al. 2000). Three quasars were however
cut away from the map before the map was analysed to get the power spectrum.
In (Netterfield et al. 2000), the results of the power spectrum and cosmological
parameter estimation are shown. The first two peaks of the angular power
spectrum are clearly visible and there are indications of a third peak. Using
this power spectrum, they estimated several cosmological parameters. Their
estimate of the total $\Omega$ is close to one, confirming the prediction of
inflation. Also the index of scalar fluctuations $n_s$ is found to be very close
to one, which fits very well with the prediction of inflation. The baryon
fraction $\Omega_b$ is in very good agreement with what has previously been
derived from nucleosynthesis. BOOMERANG is expected to fly again at the end of
2001. This time it will also attempt to measure the polarisation of the CMB.
2.1.3 MAP
MAP (Microwave Anisotropy Probe) is a NASA project. The satellite was
launched on June 30th 2001 and has, at the time of writing, reached its
destination, which is the Lagrange L2 point. MAP is orbiting the L2 point, which
is situated 1.5 million kilometers away from the earth on the side opposite to
the sun, on the line through the sun and the earth (see figure (2.1)). The
advantage of observing from this point is that the telescope can always point
away from the sun, earth and moon. The information in this section is taken from
the MAP homepage http://map.gsfc.nasa.gov/.
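[Figure: schematic of the Sun, the Earth and the L2 point on one line;
Sun-Earth distance 150 million km, Earth-L2 distance 1.5 million km.]
Figure 2.1: The position of the Lagrange L2 point. Note that the figure does not
have a linear length scale.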
MAP will be spinning about its axis with a period of 2.2 minutes in its orbit
around the sun. The axis will also be precessing at a $22.5^\circ$ angle about the
Sun-MAP line with a period of 1 hour. In this way MAP is scanning the sky on
circles with diameter $141^\circ$ and observing 30% of the sky every day. It takes
6 months for the spacecraft to observe the full sky. MAP is designed to operate
for two years and produce 4 full sky observations.
MAP is a differential experiment, meaning that it measures the temperature
at one position in the sky and compares it with the temperature at another. In
this way it measures the temperature difference between points in the sky. This
is different from Planck and BOOMERANG, which both compare the sky temperature
to a source with known temperature. For MAP the two points on the sky are about
$141^\circ$ apart (depending on the channel). This large separation is
necessary to maintain sensitivity to signals at large angular separation.
MAP is observing in 5 different frequency bands in order to separate galactic
foreground signals from the CMB. The first channel is at 22 GHz, which is just
at the atmospheric water line. Frequencies below this can be observed from the
ground. This channel is dominated by galactic synchrotron and free-free
radiation (see section (2.2.2)). The other channels are at the frequencies 30,
40, 60 and 90 GHz, where the CMB dominates. Above 100 GHz, thermal dust
radiation starts to dominate.
MAP is equipped with two Gregorian telescopes with 1.4 m $\times$ 1.6 m
(elliptical) primary mirrors. The angular resolution varies with the frequency
channel. The lowest frequency channel has a beam with a FWHM of about 56
arcminutes. The highest frequency has 14 arcminutes FWHM. The sensitivity
per $0.3^\circ \times 0.3^\circ$ pixel is expected to be about 35 $\mu$K for each
channel, which means that the power spectrum can be measured with reasonable
accuracy to a multipole of about $\ell = 1000$.
2.1.4 Planck
Planck is a project of the European Space Agency ESA. The Planck satellite is
scheduled for launch in 2007 together with the FIRST satellite. Like MAP,
Planck is going to an orbit about the L2 Lagrange point. Planck will
also be spinning about its own spin axis about once every minute and thereby
scan the sky in rings. The telescope will be pointing $85^\circ$ away from the
spin axis, giving circles with a diameter of $170^\circ$. All the information
about Planck in this section is from the Planck homepage
http://astro.estec.esa.nl/SA-general/Projects/Planck/.
Like MAP, Planck will observe the full sky once every 6 months. It will
use a 1.5m × 1.3m (off-axis paraboloid) telescope. It will observe in 10
frequency channels located between 30 and 1000GHz. Two different instruments
will observe at different frequencies. The high frequency instrument (HFI) is
a bolometer array operating at 0.1K and covering the frequencies above 100GHz.
The low frequency instrument (LFI) is an array of tuned radio receivers based on
HEMT amplifiers and covers the frequency range below 100GHz.
Figure 2.2: The sky map of a realisation of the power spectrum in figure (1.3).
Here a 10 arcminute FWHM beam has been used. This is typically what one
could expect from one of the Planck channels.
The HFI will consist of an array of 48 bolometers split into 6 frequency channels
at 100, 143, 217, 353, 545 and 857GHz. The choice of frequencies has
been made in order to remove foregrounds (thermal radiation from dust is the
most important source at these frequencies), observe the CMB and study the
Sunyaev-Zeldovich effect (see section (1.4.3)). For instance, the channel at 217GHz
is at the zero point of the thermal SZ effect, allowing for studies of the kinematic
effect. The angular resolution in the different channels varies from 5 arcminutes
for the highest frequencies to 11 arcminutes (FWHM) for 100GHz. The expected
sensitivity goes from 6670μK per pixel for 857GHz to 1.7μK per pixel for the
100GHz channel.
Figure 2.3: This is the same map as in figure (2.2). The only difference is that
the beam has a 7° FWHM, the same as COBE.
The LFI will have 56 detectors covering the frequencies 30, 44, 70 and 100GHz.
These frequencies are mostly dominated by the CMB. The angular resolution for these
channels will be 33, 23, 14 and 10 arcminutes respectively. The sensitivity will
go from 1.6μK to 4.3μK for the lowest and highest frequency respectively.
In figures (2.2) and (2.3), the difference in angular resolution between Planck and
COBE is illustrated. Planck is expected to get a reasonable estimate of the
power spectrum up to a multipole of $\ell = 2000$. It will also attempt to measure
the polarisation power spectra.
In figure (2.4), I have plotted the power spectrum in figure (1.3) together
with the expected noise power spectrum for 2 Planck channels (143GHz HFI
and 100GHz LFI) and for a combination of 3 MAP channels. The colored areas
show the optimal error bars from estimations of single multipoles from the Planck
143GHz HFI channel and the three combined MAP channels. These error bars
are of course smaller by a factor $\sqrt{n}$ if $n$ multipoles $C_\ell$ are binned together.
Figure 2.4: The CMB power spectrum in figure (1.3) and the noise power spectrum
of the Planck 143GHz HFI channel (solid line), the Planck 100GHz LFI
channel (dashed) and of the combination of the 3 highest frequency channels on
MAP (dotted line). The light red colored area shows the expected error bars
from the 143GHz channel on Planck and the dark red area shows the error bars
from the three combined MAP channels. These are the optimal error bars if each
single multipole is estimated.
2.1.5 Other Experiments
The MAXIMA-1 balloon borne experiment was conducted on August 2, 1998.
Data were collected for a few hours by a 1.3 meter telescope. The detectors, which
were bolometers similar to those planned for Planck HFI, were centered at the
frequencies 150, 240 and 410GHz. The sky was mapped with a 10 arcminute
FWHM beam at all frequencies. The map and power spectrum estimated from
the MAXIMA-1 data are presented in (Hanany et al. 2000). The data show the
first two peaks in the power spectrum. In (Jaffe et al. 2001), the results of a
joint analysis of the MAXIMA-1 and BOOMERANG data sets are presented.
The Tophat experiment is another balloon borne experiment. It was launched
at the South Pole on January 4th 2001 and observed until January 31st, mapping 6%
of the sky. Tophat differs from other experiments in that it has the payload on the
top of the balloon. It observed in the frequency range 150 to 630GHz and
is expected to give an estimate of the power spectrum up to multipole $\ell = 700$.
More information can be found at http://topweb.gsfc.nasa.gov/.
Archeops is a balloon borne experiment which had its first scientific flight
in Kiruna in Sweden on January 29th 2001. Archeops aims at large sky coverage to
reduce the sample variance in the power spectrum. The first flight, which lasted
seven and a half hours, covered more than 20% of the sky. Archeops uses a 1.5 meter
telescope similar to that which will be used for Planck. The telescope spins
during flight, producing a data set consisting of rings on the sky. The detectors
used are bolometers similar to those used for the Planck HFI. The frequencies
observed are 143, 217, 353 and 545GHz and the angular resolution will allow power
spectrum estimation up to $\ell = 1000$. More information can be found in (Benoit
et al. 2001) or at http://archeops01.free.fr/main archeops/index english.html
Finally, I will mention a ground based experiment called the Very Small
Array (VSA). The VSA is a fourteen-element interferometer situated on Tenerife.
It has been observing since September 2000 at frequencies from 26 to 36GHz. It will
observe fluctuations in the CMB in the multipole range $\ell = 150$ to 1800. More
information can be found in (Taylor 2001).
2.2 The Analysis of CMB Data Sets
In this section I will discuss how one makes sky maps from the time stream of data
from a CMB experiment. For this part I used (Tegmark 1997a) as the main source,
but other sources have also been used where this is indicated. Then the process
of removing galactic foreground contamination from the maps is discussed. The
resulting map is supposed to contain only the CMB signal and noise from the
detectors. The map is then ready for CMB power spectrum analysis, which will
be the topic of section (2.3).
2.2.1 Map Making
The data stream from a CMB experiment, called the time ordered data (TOD),
is a vector $d$ of temperature measurements from different points on the sky. The
telescope has a certain beam profile, so that the measurement in element
$d_i$ of the TOD is a convolution of the real underlying sky with this beam profile.
The beam profile is often assumed to be symmetric, and usually Gaussian, since
this simplifies further calculations. This is not always a good assumption as there
are also small contributions from sidelobes.
The TOD can be written in terms of a sky map $m$, a pointing matrix $P$ and a noise
vector $n$ as

$$ d = Pm + n. \qquad (2.1) $$

Here the TOD $d$ is supposed to have $N_d$ elements and the map $N_p$ pixels. The
pointing matrix is an $N_d \times N_p$ matrix containing the information about the sky
position (the pointing of the telescope) for each element of the TOD. In the general
case, $m$ is the real underlying map and $P$ is a complicated matrix containing
a complicated beam profile at each pointing. However, if one assumes the beam
profile to be simple and symmetric, then $m$ can represent the beam smeared map
and $P$ is a matrix with one nonzero element per row, marking the pixel observed
by each element $d_i$ in the TOD.
The problem of map making is the problem of finding the best estimate of
the map $m$. The linear solutions for the map can be written

$$ \hat{m} = Wd. \qquad (2.2) $$

One now needs to find a matrix $W$ which constructs a map $\hat{m}$ containing as
much as possible (and preferably all) of the information contained in $d$. The
map making method used for COBE had

$$ W = (P^T N^{-1} P)^{-1} P^T N^{-1}, \qquad (2.3) $$

where $N = \langle n n^T \rangle$ is the noise correlation matrix.
This choice of $W$ has the advantage that the map making is lossless. That
is, the map has all the information that is contained in the TOD when assuming
that the data are Gaussian. It is shown in (Tegmark, Taylor, and Heavens 1997)
that the inverse of the Fisher information matrix $F$ is a good approximation to
the covariance matrix for the best possible estimates of parameters from a data
set. For Gaussian data with zero mean and covariance matrix $C$, the
Fisher matrix is

$$ F_{ij} = \frac{1}{2} \mathrm{Tr}\left( C^{-1} C_{,i}\, C^{-1} C_{,j} \right), \qquad (2.4) $$

where the $,i$ subscript means the derivative with respect to parameter $i$. It is shown
in (Tegmark 1997a) that the Fisher information matrix for the map and for the
TOD is the same when the map making procedure is given by equation (2.3).
In principle, making the lossless map should be easy. Given equations (2.2)
and (2.3), the estimate of the map is

$$ \hat{m} = (P^T N^{-1} P)^{-1} P^T N^{-1} d. \qquad (2.5) $$

Knowing the TOD $d$, the noise properties of the detectors $N$ and the pointing
matrix $P$, the map could easily be computed by this formula. There are however
problems. The first one is the matrix inverses in equation (2.5). The matrices
to invert scale as $N_p \times N_p$. For the forthcoming CMB experiments $N_p$ will be of
the order of millions. Inversion of an $N \times N$ matrix takes of the order $N^3$ operations
and for $N \sim 10^6$ this becomes unfeasible with existing computers. The second
problem is that the noise properties are normally not known beforehand. They
must be estimated from the TOD together with the map.
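
For the special case of white, uncorrelated noise ($N$ proportional to the identity) and a pointing matrix with a single nonzero element per row, equation (2.5) reduces to $(P^T P)^{-1} P^T d$, i.e. to averaging all samples that fall in the same pixel. The following Python sketch illustrates only this simplified case; the function name `bin_map` and the toy data are hypothetical and not taken from any experiment pipeline.

```python
def bin_map(tod, pixels, n_pix):
    """Return (P^T P)^{-1} P^T d for a one-hit-per-sample pointing matrix.

    tod    : list of time-ordered samples d_i
    pixels : list with the pixel index observed by each sample
    n_pix  : number of map pixels N_p
    """
    sums = [0.0] * n_pix   # P^T d: samples summed into their pixels
    hits = [0] * n_pix     # diagonal of P^T P: number of hits per pixel
    for value, pix in zip(tod, pixels):
        sums[pix] += value
        hits[pix] += 1
    # Unobserved pixels are returned as None instead of dividing by zero.
    return [s / h if h > 0 else None for s, h in zip(sums, hits)]


# Toy example (made-up numbers): 3 pixels observed by 6 samples.
pixels = [0, 1, 2, 0, 1, 0]
tod = [1.0, 2.0, 3.0, 1.2, 1.8, 0.8]
sky_map = bin_map(tod, pixels, 3)
```

With correlated (1/f) noise this simple average is no longer the lossless estimate, which is why the full $N^{-1}$ weighting in equation (2.5) matters.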
Assuming that the noise correlation matrix $N$ is known, (Wright 1996) presented
a solution to the map making problem. His approach was to solve equation
(2.5) iteratively, using the fact that given a vector $b$ and a matrix $M$ one can
find the vector $a = M^{-1} b$ iteratively knowing $M$, without having to compute
$M^{-1}$. This makes map-making using equation (2.5) feasible, as the scaling
is of the order $N_d N_{\rm iter}$, where $N_{\rm iter}$ is the number of iterations
that need to be performed (of the order 10-20).
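
One iterative scheme of this kind is the conjugate gradient method, which is also the method named later in this section for solving equation (2.5). The sketch below solves $Ma = b$ for a small symmetric positive definite toy matrix using only products $Mv$, so that $M^{-1}$ is never formed. It is an illustration under these assumptions, not the implementation of (Wright 1996); all names are hypothetical.

```python
def conjugate_gradient(apply_m, b, n_iter=50, tol=1e-12):
    """Solve M a = b for a symmetric positive definite M.

    apply_m : function returning the product M v for a vector v,
              so the inverse of M is never computed.
    """
    a = [0.0] * len(b)
    r = list(b)                      # residual b - M a (a = 0 initially)
    p = list(r)                      # current search direction
    rr = sum(x * x for x in r)
    for _ in range(n_iter):
        if rr < tol:
            break
        mp = apply_m(p)
        step = rr / sum(x * y for x, y in zip(p, mp))
        a = [ai + step * pi for ai, pi in zip(a, p)]
        r = [ri - step * mi for ri, mi in zip(r, mp)]
        rr_new = sum(x * x for x in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return a


# Toy system (made-up numbers): M = [[4, 1], [1, 3]], b = [1, 2].
def apply_m(v):
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


solution = conjugate_gradient(apply_m, [1.0, 2.0])
```

In a map-making setting `apply_m` would stand for the action of $P^T N^{-1} P$ on a map, which can be evaluated without storing the matrix.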
The next problem to be dealt with is the problem of estimating the noise
$N$ from the TOD together with the map. I will first briefly review the typical
properties of noise in detectors used for CMB experiments. Detector noise is
usually a sum of white noise, which has the same amplitude on all
scales along the time stream, and 1/f noise, which has increasing amplitude
towards lower frequencies. If one represents the noise at position $j$ in the time stream
as $n_j$, then the Fourier transform of the noise along a time stream can be written
as

$$ n_k = \sum_j n_j\, e^{-2\pi i k j / N_d}. \qquad (2.6) $$

The power spectrum of the noise is then defined as

$$ P(k) = \langle n_k n_k^* \rangle = \sum_{jj'} \langle n_j n_{j'} \rangle\, e^{-2\pi i k (j - j') / N_d}. \qquad (2.7) $$

The typical power spectrum, represented as a sum of white noise and 1/f noise,
can be written (in terms of frequency $f$) as

$$ P(f) = A \left[ 1 + \left( \frac{f_{\rm knee}}{f} \right)^{\alpha} \right], \qquad (2.8) $$

where $A$, $f_{\rm knee}$ and $\alpha$ are parameters which differ between experiments. The frequency
$f_{\rm knee}$ is the frequency where the white noise begins to dominate over the 1/f noise.
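
The role of the knee frequency in equation (2.8) can be checked numerically. The short Python sketch below evaluates the spectrum at, well above and well below $f_{\rm knee}$; the parameter values are purely illustrative and do not describe any real detector.

```python
def noise_power(f, amplitude, f_knee, alpha):
    """White plus 1/f noise power spectrum of equation (2.8)."""
    return amplitude * (1.0 + (f_knee / f) ** alpha)


# Illustrative parameter values only.
A, F_KNEE, ALPHA = 1.0, 0.1, 1.0

at_knee = noise_power(F_KNEE, A, F_KNEE, ALPHA)            # both terms equal
far_above = noise_power(100.0 * F_KNEE, A, F_KNEE, ALPHA)  # nearly white
far_below = noise_power(0.01 * F_KNEE, A, F_KNEE, ALPHA)   # 1/f dominated
```

At $f = f_{\rm knee}$ the white and 1/f terms contribute equally, so the power is twice the white noise level $A$.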
As is described in (Natoli et al. 2001), the noise correlation matrix $N$ can
easily be found when knowing the power spectrum and vice versa. If one assumes
that the noise is stationary, i.e. that it has the same properties along the whole
time stream, one can write

$$ N_{ij} = \langle n_i n_j \rangle = \xi(|i - j|). \qquad (2.9) $$

Clearly $N$ is a Toeplitz matrix. As discussed in (Natoli, de Gasperis, Gheller, and
Vittorio 2001), $N$ can be well approximated by a circulant matrix if $\xi$ vanishes
for $|i - j| > N_\xi$ with $N_\xi < N_d$. The Fourier transform of a circulant matrix
is diagonal, so one can write

$$ N^{-1} \approx Q^\dagger \Lambda^{-1} Q, \qquad (2.10) $$

where $\Lambda = \mathrm{Diag}\{P(f_1), \ldots, P(f_{N_d/2})\}$. The dagger denotes the complex
conjugate transpose and $Q$ is given by

$$ Q_{jk} = \frac{e^{i j f_k}}{2 N_d}. \qquad (2.11) $$

(See (Natoli et al. 2001).) This establishes the connection between the power
spectrum of the noise and the noise correlation matrix $N$.
I will now discuss the available methods for estimating the map and the noise
power spectrum (and thereby $N$) from the TOD of an experiment. In (Ferreira
and Jaffe 2000) an iterative approach was presented in which each step
finds a solution to equation (2.5) and to the noise power spectrum.
These two solutions are then improved at each step until convergence is reached.
As discussed in the paper, the method is not feasible for experiments with a very
large number of pixels. In (Prunet, Netterfield, Hivon, and Crill 2000; Doré et al.)
a similar iterative method is presented which is faster and therefore feasible
also for big experiments.
Finally, another simple and fast method was presented in (Natoli, de Gasperis,
Gheller, and Vittorio 2001). I will briefly outline this method. Their approach
is basically the approach of (Wright 1996). They first estimate $N$ from the time
stream and then, once $N$ is known, equation (2.5) can be solved iteratively using
a conjugate gradient method. The noise correlation matrix $N$ is found by the
use of a Maximum-Likelihood (ML) technique, which will be described in more
detail in section (2.3.1). The method says that given the pure noise data vector
$n$, the best estimate of the noise power spectrum $P(f)$ is given by the $P(f)$ which
maximizes the likelihood

$$ L = \frac{ e^{-\frac{1}{2} n^T N^{-1}(P(f))\, n} }{ \sqrt{2\pi \det N} }. \qquad (2.12) $$

Having found a good parameterization of $P(f)$, one can easily maximize this function
using the approximation in equation (2.10). The only problem now is that one
doesn't have the pure noise data stream $n$, only the signal+noise data stream $d$.
It is shown in (Natoli et al. 2001) that using an expression for the noise given
by

$$ \hat{n} = d - P (P^T P)^{-1} P^T d, \qquad (2.13) $$

one gets results in good agreement with the iterative methods. The last term
in this equation can be understood as the signal: $P^T d$ makes a map from the
TOD, $(P^T P)^{-1}$ divides each pixel by the number of values added in the pixel
and $P$ makes a time stream from the map. Using this as the noise vector one
can solve the maximum-likelihood problem to find the noise power spectrum and
thereby $N$. When $N$ is found, the map can be found iteratively as described above.
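
The three steps contained in equation (2.13) (bin the TOD into a map, normalise by the hit count, scan the map back into a time stream and subtract) can be sketched in a few lines of Python. This assumes a pointing matrix with one nonzero element per row; the function name `estimate_noise` and the toy numbers are hypothetical.

```python
def estimate_noise(tod, pixels, n_pix):
    """Noise estimate d - P (P^T P)^{-1} P^T d of equation (2.13)."""
    sums = [0.0] * n_pix
    hits = [0] * n_pix
    for value, pix in zip(tod, pixels):   # P^T d and the diagonal of P^T P
        sums[pix] += value
        hits[pix] += 1
    binned = [s / h if h > 0 else 0.0 for s, h in zip(sums, hits)]
    # P applied to the binned map gives the signal estimate in the time
    # domain; subtracting it from the TOD leaves the noise estimate.
    return [value - binned[pix] for value, pix in zip(tod, pixels)]


# Toy example (made-up numbers): 3 pixels observed by 6 samples.
pixels = [0, 1, 2, 0, 1, 0]
tod = [1.0, 2.0, 3.0, 1.2, 1.8, 0.8]
noise_estimate = estimate_noise(tod, pixels, 3)
```

By construction the residuals in each pixel sum to zero, and a pixel observed only once gets a zero noise estimate, so this estimator relies on many repeated observations per pixel.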
Figure 2.5: The figure illustrates high and low frequency noise (lower and upper
sine curve respectively) and 2 scan rings, each scanned 5 times. The purpose of
the figure is to show the origin of striping. This is explained in the text.
Another issue strongly related to map-making is the problem of striping. Because
of 1/f noise the scan path will be visible as stripes in the map. Why this
is so is easy to see from figure (2.5). I take the Planck scanning strategy as
an example. Planck will be scanning on circles and each circle will be scanned
several times before the next circle is scanned. In the figure, the upper
curve shows low frequency noise and the lower curve shows high frequency noise. The
line at the bottom shows 2 rings with 5 scans in each ring. By averaging all the
scans in a ring, the high frequency noise will be averaged out. However, the low
frequency noise will not. In the figure, the first ring will clearly have a
higher noise level than the second. Each ring gets a different average noise level
because of the low frequency noise. This is visible in the map as stripes along
the scanning rings. These offsets can be partially removed (called destriping) by
comparing the observed temperatures at the points where the rings cross.
The more crossing points there are between the rings, the more efficiently
the stripes can be removed. In (Bersanelli et al. 1996; Delabrouille 1998; Maino
et al. 1999) destriping algorithms for the Planck Surveyor are discussed.
2.2.2 Foregrounds
There are several other sources in the universe apart from the CMB which
emit radiation at the same wavelengths as the CMB. In order to estimate
the CMB power spectrum, one must either observe in areas of the sky without any
of these foregrounds, or one must somehow remove them from the data. There
are 5 main types of foregrounds which must be dealt with in order to study the
CMB in contaminated areas of the sky. Galactic dust emission, free-free and
synchrotron radiation are contaminants from our own galaxy. In addition there
is the radiation from extragalactic point sources and the SZ effect from clusters
of galaxies (both the thermal and the kinetic effect, see section (1.4.3)). Thanks to the different
spectral behavior of these foregrounds and the CMB, one can in principle separate
the CMB from the foreground contaminants. I will now discuss in detail
some of these foregrounds. When no other reference is given, the information
comes from (Bersanelli et al. 1996; Kogut et al. 1996; Bouchet and Gispert 1999;
Hobson, Jones, Lasenby, and Bouchet 1998).
In interstellar space in our own galaxy there are dust grains (consisting of
graphite and silicates) heated by the surrounding stars and thereby emitting
electromagnetic radiation. The spectrum of the dust emission from the galaxy was
measured by the FIRAS instrument on the COBE satellite (see section (2.1.1))
with a 7° FWHM beam. Other, balloon borne experiments have measured the
galactic dust spectrum at higher resolution, up to 30 arcminutes. At high galactic
latitudes (away from the galactic plane) it has been found that the dust can be
described well with a single dust component at 18K with an emissivity which
goes as $\nu^2$, where $\nu$ is the frequency of the radiation. In the direction of the
galactic plane, there seems to be another component with a temperature of 21K and
spectral dependency $\nu^{1.4}$. Maps of galactic dust have been made by DIRBE
(another COBE instrument, see section (2.1.1)) at a resolution of 42 arcminutes
and by the Infrared Astronomical Satellite IRAS at 4 arcminutes. The high
resolution images from IRAS have revealed an angular power spectrum of the dust of
$C_\ell \propto \ell^{-3}$. At high galactic latitudes (outside of the galactic plane, > 30°), dust is
the dominating foreground component at frequencies above 100GHz. Recently
there have been detections of emission from what appears to be spinning dust
grains. This emission seems to be dominant at frequencies below 30GHz.
Another galactic foreground component, important at lower frequencies, is the
free-free emission from ionized hydrogen (H II). The spectrum of the free-free
emission is well known to be $\nu^{-0.16}$, but good maps of interstellar H II are lacking.
Attempts to map the free-free emission directly were made using data from the
COBE DMR instrument (Bennett et al. 1992; Bennett et al. 1994). Unfortunately
the data were so noisy that only the quadrupole could be measured.
Other attempts to map H II have used observations of H$\alpha$ emission (Reynolds 1984;
Reynolds 1992), but these experiments suffered from undersampling and selection
biases. For this reason, a correlation which has been detected between free-free emission
and dust emission is used. Dust is correlated with neutral hydrogen (H I), which
is in turn correlated with ionized hydrogen (H II). The dust-free-free correlation has
been detected in correlations of data between the DMR and DIRBE instruments
and between data from the Saskatoon (Tegmark et al. 1997) experiment and
DIRBE at smaller angular scales (Oliveira-Costa et al. 1997). For this reason,
the same maps used to map dust (IRAS/DIRBE) are also used for free-free
emission. The angular power spectrum of free-free emission is assumed to be the
same as for dust. Outside of the galactic plane, free-free emission is the dominant
galactic foreground contaminant in the frequency range 5GHz to 100GHz.
The third galactic source of microwave radiation is synchrotron emission
resulting from the acceleration of cosmic ray electrons in the galactic magnetic field.
Due to the varying magnetic field strength in the galaxy, the spectral index of
synchrotron emission varies. Observations at 408MHz show that the spectrum
of synchrotron emission goes as $\nu^{-\beta}$ where $\beta$ is between 2.7 and 3.1.
Observations at higher frequencies indicate an index of $\beta = 0.9$ for the frequency range
of interest for CMB experiments. For synchrotron emission, the maps used are
the 408MHz map by (Haslam et al. 1981) and the 1420MHz map by (Reich and
Reich 1988). For CMB experiments, the data on synchrotron emission in these
maps are extrapolated to the higher frequencies. These maps have an angular
resolution of 0.85° and 0.6° FWHM respectively. There are no data available at
higher resolution. As these maps seem to indicate that the power spectrum falls
off as $C_\ell \propto \ell^{-3}$, this is the assumption usually adopted. This is the same power
spectrum as observed for dust at small scales. Apparently synchrotron emission is
the galactic contaminant for which one has the least information. Synchrotron
radiation is, together with free-free emission, important at frequencies below 100GHz and
is dominant below 5GHz.
The CMB maps will also be contaminated by extragalactic point sources.
These can be AGNs (active galactic nuclei), radio galaxies, quasars or BL Lac
objects in the radio domain of the spectrum. In the Far-IR (Far-Infrared) domain,
the emission from dust dominated infrared galaxies is present. The
spectrum of these objects is dependent on redshift. Unfortunately the spectrum
and the evolution of these objects are not well known. In (Toffolatti et al. 1998),
a recent update on the observations and evolutionary models is found. They
conclude that at Planck resolution, the radio sources are the dominating point source
contaminant at frequencies < 100GHz, the Far-IR sources dominate at
> 200GHz and in the intermediate range both types are comparable. They do
however point out that even in the most pessimistic models, the amplitude of
fluctuations due to point sources is well below the amplitude of the CMB at the
frequencies 100-200GHz. In (Hobson et al. 1999), it is shown how these models
can be used to remove point sources in CMB data.
Another extragalactic contaminant is the SZ effect from clusters
of galaxies. The change of the CMB spectrum due to the thermal motion of
electrons in the clusters (thermal SZ effect) and the bulk motion of the cluster
(kinematic SZ effect) also contaminates the underlying CMB. Fortunately, the
SZ effect has a certain spectral signature that makes it easy to identify. According
to (Zeldovich and Sunyaev 1969), the relative temperature change of the CMB
due to the SZ effect is given by

$$ \frac{\Delta T}{T} = y \left( x \coth\frac{x}{2} - 4 \right), \qquad (2.14) $$

where $x = h\nu/kT$, $h$ is Planck's constant, $k$ is Boltzmann's constant and $y$ is the
Compton $y$ parameter, measuring the line of sight density of electrons, given as

$$ y = \int_0^{s} \frac{k T_e}{m c^2}\, ds, \qquad (2.15) $$

where $T_e$ is the electron temperature, $m$ is the electron mass and $s$ is the Thomson
optical depth, $s = \int \sigma_T n_e\, dl$. Here $\sigma_T$ is the Thomson cross section and $n_e$ is
the electron density. Figure (2.6) shows the spectral signature.
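
The frequency at which the spectral factor in equation (2.14) changes sign can be located numerically. The Python sketch below finds the root of $x\coth(x/2) - 4$ by bisection and converts it to a frequency using $x = h\nu/kT$. The CMB temperature of 2.725K used for the conversion is an assumption made here; it is not quoted in this section, and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck's constant in J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant in J / K
T_CMB = 2.725                # assumed CMB temperature in K


def sz_thermal_shape(x):
    """Spectral factor x*coth(x/2) - 4 of equation (2.14)."""
    return x / math.tanh(x / 2.0) - 4.0


def sz_null_frequency(lo=1.0, hi=10.0, n_steps=60):
    """Bisect for the x where the thermal SZ factor vanishes; return Hz."""
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        if sz_thermal_shape(mid) < 0.0:
            lo = mid     # still on the decrement side
        else:
            hi = mid     # already on the increment side
    x_null = 0.5 * (lo + hi)
    return x_null * K_BOLTZMANN * T_CMB / H_PLANCK


null_ghz = sz_null_frequency() / 1e9
```

Under these assumptions the null lands close to the 217GHz HFI channel mentioned in section (2.1.4); below it the thermal effect is a temperature decrement and above it an increment.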
For the kinetic SZ effect, the spectrum is constant (Sunyaev and Zeldovich
1980),

$$ \frac{\Delta T}{T} = -\frac{v_r}{c}\, s, \qquad (2.16) $$

where $v_r$ is the radial velocity of the cluster. The kinematic effect is typically an
order of magnitude smaller than the thermal effect.
Figure 2.6: The spectral change in the CMB due to the thermal SZ effect.
Finally I will review some of the standard methods of separating the different
components in CMB observations. One can define a function $f(\hat{n}, \nu)$ which is the intensity
in the direction $\hat{n}$ at frequency $\nu$. This intensity consists of contributions from
different components like the CMB, galactic emission or the SZ effect. Assuming
that the different components can be factorised into a spatial part $x(\hat{n})$ and a
spectral part $s(\nu)$, one can write the intensity as

$$ f(\hat{n}, \nu) = \sum_{j=1}^{n_p} s_j(\nu)\, x_j(\hat{n}), \qquad (2.17) $$

where the sum goes over the $n_p$ different components. For an experiment with $n_c$
frequency channels one can assign the observation in the $i$th channel to the $i$th
element of a vector $y$ given as

$$ y(\hat{n}) = P x(\hat{n}) + n(\hat{n}). \qquad (2.18) $$

Here $x(\hat{n})$ is an $n_p$ element vector, the elements being the $x_j(\hat{n})$ above for each
component, $n(\hat{n})$ is an $n_c$ element vector containing the noise in each channel and $P$
is an $n_c \times n_p$ matrix assigning contributions from different components to different
channels. The elements of $P$ are given by

$$ P_{ij} = \int_0^\infty t_i(\nu)\, s_j(\nu)\, d\nu, \qquad (2.19) $$

where $t_i(\nu)$ is the spectral transmission of channel $i$.
The problem of component separation now becomes the problem of finding
the vector $x$ given the observations $y$ and general information about the different
components. In other words, one is looking for an $n_p \times n_c$ matrix $W$ satisfying

$$ \hat{x}(\hat{n}) = W y(\hat{n}). \qquad (2.20) $$

There have been different approaches to finding the matrix $W$. One of the standard
ones has been Wiener filtering. In this method one tries to minimize the quantity

$$ \epsilon^2 = \langle |Wy - x|^2 \rangle. \qquad (2.21) $$

As this technique is too time consuming in pixel space, one normally transforms
all components to spherical harmonic space for the Wiener filtering. The method
is described in more detail in e.g. (Bersanelli et al. 1996; Tegmark and Efstathiou
1996; Prunet et al. 2001). Another method is the Maximum Entropy
Method (MEM) (Hobson, Jones, Lasenby, and Bouchet 1998; Stolyarov, Hobson,
Ashdown, and Lasenby 2001). This method uses Bayes' theorem (described
in more detail in section (2.3.1)) to maximize the posterior probability
(the probability of the theory given the data) expressed in terms of the likelihood
(the probability of the data given a theory) and an entropic prior. In this case
the Gaussian likelihood can be written using equation (2.18) as

$$ L \propto e^{-\frac{1}{2} (y - Px)^\dagger N^{-1} (y - Px)}, \qquad (2.22) $$

where $N = \langle n n^T \rangle$ is the noise correlation matrix. Other methods include
wavelet techniques (Tenorio 1999), neural networks (Baccigalupi et al. 2000) and
Fast Independent Component Analysis (FastICA) (Maino et al. 2001). All these
techniques rely on good spectral and spatial information about the foregrounds
which, as described above, is not always available.
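
As a minimal illustration of the linear model in equations (2.18) to (2.20), and not of Wiener filtering or MEM themselves, consider a noiseless toy case with two channels and two components. The assumptions made here are all hypothetical: delta-function transmissions, so that $P_{ij} = s_j(\nu_i)$, a flat spectrum for the CMB and a dust-like $\nu^2$ spectrum normalised at 100GHz. With $n_c = n_p$ and no noise, $W$ is simply $P^{-1}$.

```python
def mixing_matrix(freqs_ghz):
    """Rows are channels; columns are (CMB, dust-like) spectral weights."""
    return [[1.0, (nu / 100.0) ** 2] for nu in freqs_ghz]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mat_vec(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


P = mixing_matrix([100.0, 200.0])   # two hypothetical channels
true_x = [50.0, 3.0]                # made-up component amplitudes in one pixel
y = mat_vec(P, true_x)              # noiseless observations, equation (2.18)
W = invert_2x2(P)                   # separation matrix of equation (2.20)
recovered = mat_vec(W, y)
```

With noise and more channels than components, a plain inverse no longer exists and is not optimal, which is where the methods discussed above come in.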
2.3 Power Spectrum Estimation
In this section I will describe how the angular power spectrum of the CMB can
be extracted from the maps where the foregrounds are removed. I will discuss the
standard approach, which is the maximum-likelihood method. I will describe why
this method is not feasible for large data sets and why new and faster methods
are needed. I will describe some attempts to find fast power spectrum estimation
methods. The rest of this thesis will discuss new methods which should make the
power spectrum analysis of huge data sets feasible.
2.3.1 Likelihood Estimation
The standard method for analysing CMB data is the maximum-likelihood method.
This is the method which gives the smallest error bars. I will now outline the
method in general and for CMB analysis. For more information, see e.g. (Parratt
1961).
The maximum-likelihood method (MLM) is based on Bayes' theorem. If
one considers two events $A$ and $B$ which have the probabilities $P(A)$ and $P(B)$
respectively of occurring, then Bayes' theorem connects the probability for $A$ to
occur given that $B$ has occurred, $P(A|B)$, and the probability for $B$ to occur given
that $A$ has occurred, $P(B|A)$. Bayes' theorem can in this case be written as

$$ P(A|B) = P(A) \frac{P(B|A)}{P(B)}. \qquad (2.23) $$

Bayes' theorem is simply another way of writing the product theorem for
probabilities,

$$ P(A, B) = P(A) P(B|A), \qquad (2.24) $$

which says that the probability for both $A$ and $B$ to occur is just the product of
the probability for the occurrence of $A$ times the probability for the occurrence
of $B$ given that $A$ has already happened. In data analysis one is interested in
the probability for a hypothesis $H$ to be true given the events (data) $d$ from an
experiment, $P(H|d)$. With these parameters Bayes' theorem can be written

$$ P(H|d) = P(d|H) \frac{P(H)}{P(d)}. \qquad (2.25) $$

The function $P(H|d)$ is called the posterior probability since it expresses the
belief in a hypothesis after the data from an experiment have been considered. The
function $P(H)$ is called the prior since it expresses the prior belief in the
hypothesis before the experiment was done. The function $P(d|H)$ is called the likelihood
and is the probability of getting the data $d$ given that the hypothesis $H$ is true.
Finally, the factor $P(d)$ is independent of the hypothesis and is just a
normalisation factor.
When an experiment has been done which has given the data set $d$, one would
like to find which hypothesis $H$ of several hypotheses gives the best fit to the data.
Expressed differently, one would like to find the hypothesis $H$ for which the
probability of the hypothesis given the data, $P(H|d)$, is highest. Looking at equation
(2.25), this means that one has to calculate the product of the likelihood and the
prior for the different hypotheses available and find the hypothesis for which this
product is highest. This is the method of Maximum likelihood (ML).
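
The selection rule described above can be written out for a discrete set of hypotheses. The following Python sketch applies equation (2.25) to two hypothetical hypotheses with a flat prior; the likelihood values are made-up numbers chosen only to show the normalisation by $P(d)$.

```python
def posterior(priors, likelihoods):
    """Posterior P(H|d) for each hypothesis, following equation (2.25)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)          # P(d), the normalisation factor
    return [j / evidence for j in joint]


# Two hypothetical hypotheses with a flat prior.
priors = [0.5, 0.5]
likelihoods = [0.2, 0.6]           # P(d|H1) and P(d|H2), made-up values
post = posterior(priors, likelihoods)
best = post.index(max(post))       # index of the preferred hypothesis
```

With a flat prior the ranking of the hypotheses is set by the likelihood alone, which is the case used in the rest of this chapter.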
As an example I will use an experiment where one has measured a set of
numbers given as a vector $x$. The hypothesis is that this set of numbers has
a multivariate Gaussian distribution. The mean $\langle x \rangle$ and correlation matrix
$C = \langle x x^T \rangle$ are given by the hypothesis as a function of some parameters $\theta$
which one wants to estimate from the data. I will use a flat (constant) prior (as
in the rest of this thesis), meaning that there are no a priori values of $\theta$ which
are known to be better or more probable than others. In this case one has to
maximize the likelihood given as a multivariate Gaussian,

$$ L(\theta) = \frac{ \exp\left[ -\frac{1}{2} \left(x - \langle x \rangle(\theta)\right)^T C^{-1}(\theta) \left(x - \langle x \rangle(\theta)\right) \right] }{ \sqrt{2\pi \det C} }. \qquad (2.26) $$

It often simplifies matters to take the logarithm of the likelihood and (dropping
an additive constant) minimize the quantity

$$ \mathcal{L}(\theta) = -2 \log L(\theta) = \left(x - \langle x \rangle(\theta)\right)^T C^{-1}(\theta) \left(x - \langle x \rangle(\theta)\right) + \log \det C. \qquad (2.27) $$

This quantity can be minimized using different available computer algorithms.
Which algorithm to use depends on whether one has knowledge about the
derivatives of the likelihood. If one knows the first (and possibly also the second)
derivative, the minimization procedure will be quicker.
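
A one-parameter example may make the minimization concrete. The Python sketch below assumes zero-mean, uncorrelated Gaussian data with covariance $C = \sigma^2 I$, so that equation (2.27) becomes $\sum_i x_i^2/\sigma^2 + n \log \sigma^2$, and minimizes it over the variance with a derivative-free golden-section search. The simulated data, the search interval and the function names are assumptions made for illustration only.

```python
import math
import random


def neg2_log_like(variance, data):
    """Equation (2.27) for zero-mean data with C = variance * identity."""
    n = len(data)
    return sum(x * x for x in data) / variance + n * math.log(variance)


def golden_section_min(func, lo, hi, n_steps=100):
    """Derivative-free minimisation of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_steps):
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if func(left) < func(right):
            hi = right   # minimum lies in [lo, right]
        else:
            lo = left    # minimum lies in [left, hi]
    return 0.5 * (lo + hi)


random.seed(0)
data = [random.gauss(0.0, 2.0) for _ in range(1000)]   # simulated data
ml_variance = golden_section_min(lambda v: neg2_log_like(v, data), 0.1, 20.0)
analytic = sum(x * x for x in data) / len(data)        # known ML solution
```

In this simple case the minimum can also be found analytically as the mean of $x_i^2$, which gives a check on the numerical search.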
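Some words have to be said about the estimate $\hat{\theta}$ of $\theta$ that one gets from
the MLM. As discussed in (Tegmark, Taylor, and Heavens 1997), $\hat{\theta}$ is the best
unbiased estimator of $\theta$. It is unbiased in the sense that $\langle \hat{\theta} \rangle$ goes asymptotically
to the true value of $\theta$ when many likelihood estimations of $\theta$ from different
experiments are made. It is the best in the sense that no other method gives
smaller error bars. The Cramér-Rao inequality says that

$$ \Delta\theta_i \geq \sqrt{(F^{-1})_{ii}}. \qquad (2.28) $$

Here $\Delta\theta_i = \sqrt{\langle (\theta_i - \langle \theta_i \rangle)^2 \rangle}$ and $F$ is the Fisher information matrix

$$ F_{ij} = \frac{1}{2} \left\langle \frac{\partial^2 \mathcal{L}}{\partial\theta_i\, \partial\theta_j} \right\rangle, \qquad (2.29) $$

where $\mathcal{L} = -2 \log L$ is the quantity defined in equation (2.27).
For the ML estimator, the Cramér-Rao inequality is an equality and for this
reason no other method can have smaller error bars.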
For CMB analysis, likelihood analysis is very attractive for the reasons given
above. It gives an unbiased estimate of the parameters one is estimating and
it gives the smallest error bars. Also, the objects that one gets from the
experiments, the sky temperature in a given pixel or the spherical harmonic transform
coefficients $a_{\ell m}$, are (with Gaussian initial conditions assumed) Gaussian distributed, so a
simple likelihood of the form given in equation (2.26) can be used. The parameters
that one estimates could be the power spectrum coefficients $C_\ell$ or directly
the cosmological parameters $\Omega_0$, $\Omega_b$, $h$ etc.
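I will now assume that I have a sky map from a full sky experiment with
foregrounds removed. The aim is to find the best estimate of the power spectrum
$C_\ell$. The data vector for the likelihood will now be the pixels of the CMB map
arranged into a vector $d = s + n$, where $s$ is the CMB signal and $n$ is
the noise. The monopole (average) has been subtracted from the map so that
$\langle d \rangle = 0$. Before one can formulate the likelihood (equation 2.26), one also
needs the correlation matrix $C = \langle d d^T \rangle = S + N$ (where $S = \langle s s^T \rangle$ and
$N = \langle n n^T \rangle$) expressed through the parameters $C_\ell$ for which one wants to find
an estimate. This can be done using the relation

$$ T(\hat{n}_i) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\hat{n}_i), \qquad (2.30) $$

where $\hat{n}_i$ is the unit vector in the direction of pixel $i$. One then gets for $S_{ij}$

$$ \begin{aligned}
S_{ij} = \langle s_i s_j \rangle &= \langle T(\hat{n}_i) T(\hat{n}_j) \rangle, && (2.31) \\
&= \sum_{\ell m} \sum_{\ell' m'} \langle a_{\ell m} a^*_{\ell' m'} \rangle\, Y_{\ell m}(\hat{n}_i) Y^*_{\ell' m'}(\hat{n}_j), && (2.32) \\
&= \sum_{\ell} C_\ell \sum_{m} Y_{\ell m}(\hat{n}_i) Y^*_{\ell m}(\hat{n}_j), && (2.33) \\
&= \sum_{\ell} \frac{2\ell + 1}{4\pi}\, C_\ell\, P_\ell(\cos\theta_{ij}), && (2.34)
\end{aligned} $$

where $\theta_{ij}$ is the angular distance between pixels $i$ and $j$ and $P_\ell$ is the Legendre
polynomial of order $\ell$. The following two relations were used:

$$ \langle a_{\ell m} a^*_{\ell' m'} \rangle = C_\ell\, \delta_{\ell\ell'} \delta_{mm'}, \qquad (2.35) $$

as discussed in section (1.4.2), and

$$ \sum_{m} Y_{\ell m}(\hat{n}_i) Y^*_{\ell m}(\hat{n}_j) = \frac{2\ell + 1}{4\pi} P_\ell(\cos\theta_{ij}). \qquad (2.36) $$

How the noise correlation matrix $N$ can be found was discussed in section (2.2.1).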
For a given power spectrum C
one can now calculate the log-likelihood

L(C
) = d
T
C
1
d + log det C. (2.37)
The MLM method now says that one shall minimize this likelihood with respect to
C
. The set of parameters

C
which minimizes equation (2.37) is the ML estimate.
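As an illustration of equations (2.34) and (2.37), the following is a minimal sketch that builds the signal correlation matrix from a power spectrum and evaluates the log-likelihood for a toy map. The pixel directions, the toy spectrum, the white noise level and the function names are all assumptions made for the example; they are not taken from any experiment discussed here.

```python
import numpy as np
from numpy.polynomial import legendre

def signal_covariance(cl, unit_vectors):
    """S_ij = sum_l (2l+1)/(4 pi) C_l P_l(cos theta_ij), equation (2.34)."""
    cos_theta = np.clip(unit_vectors @ unit_vectors.T, -1.0, 1.0)
    ells = np.arange(len(cl))
    coeffs = (2 * ells + 1) / (4 * np.pi) * cl  # Legendre-series coefficients
    return legendre.legval(cos_theta, coeffs)

def log_likelihood(d, C):
    """L = d^T C^{-1} d + log det C, equation (2.37)."""
    _, logdet = np.linalg.slogdet(C)
    return d @ np.linalg.solve(C, d) + logdet

rng = np.random.default_rng(0)
n_pix = 50                                   # toy map size (assumption)
v = rng.normal(size=(n_pix, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)  # random pixel directions

ells = np.arange(11)
cl = np.zeros(11)
cl[2:] = 1.0 / (ells[2:] * (ells[2:] + 1.0))   # toy spectrum, no monopole/dipole

S = signal_covariance(cl, v)
N = 0.01 * np.eye(n_pix)                     # uncorrelated toy noise
C = S + N
d = rng.multivariate_normal(np.zeros(n_pix), C)
L_value = log_likelihood(d, C)
```

The direct solve and determinant used here are exactly the $N^3$ operations that become prohibitive for realistic map sizes.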

This was the ideal case. The first problem with the MLM is the size of the data set. The MAP and Planck satellites will produce sky maps with $10^6$ to $10^7$ pixels. This means that the correlation matrix $\mathbf{C}$ has, say, $10^7 \times 10^7$ elements. Finding the inverse and determinant of an $N \times N$ matrix takes of the order $N^3$ operations. On current computers, these operations take of the order of a few seconds for $N = 1000$. To make a very optimistic estimate, let us say that the operations take 1 s for $N = 1000$. Then for $N = 10^7$ this will take $10^{12}$ s $\approx$ 32000 years.
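The arithmetic behind this estimate can be reproduced in a few lines. This is only a sketch of the $N^3$ extrapolation under the optimistic assumption stated above (1 s for $N = 1000$); the variable names are illustrative.

```python
seconds_for_reference = 1.0        # assumed cost for N = 1000
n_ref = 1_000
n_pix = 10_000_000                 # 10^7 pixels

# N^3 scaling of inversion and determinant
seconds = seconds_for_reference * (n_pix / n_ref) ** 3
years = seconds / (365.25 * 24 * 3600)
```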
The problem of finding the inverse and determinant of the correlation matrix could be solved if the matrix had some symmetries making it faster to invert. One possibility to explore would be to go to spherical harmonic space and use the coefficients $a_{\ell m}$ in the data vector instead of the pixels in the map. In theories with Gaussian initial conditions, these $a_{\ell m}$ are also Gaussian distributed with zero mean. With such a data vector, the signal correlation matrix would have the form $S_{\ell m, \ell' m'} = \langle a_{\ell m} a^*_{\ell' m'} \rangle = C_\ell \delta_{\ell\ell'} \delta_{mm'}$. In this case the correlation matrix would be diagonal and the inversion would take $N$ operations. Unfortunately the noise correlation matrix does not simplify in spherical harmonic space, and so the problem of doing a complete maximum likelihood analysis remains unsolved.
Another problem with the likelihood formalism arises when only part of the sky is covered. From the discussion about foregrounds it should be clear that it can be difficult to remove all the foreground contaminants from the map. In this case, parts of the sky have to be cut out. When the integration is no longer done over the whole sky, the orthogonality of the spherical harmonics is destroyed and the $a_{\ell m}$ on the cut sky are no longer orthogonal, i.e. the correlation matrix in spherical harmonic space is no longer diagonal. In (Górski 1994) a method to orthogonalize the spherical harmonics on the cut sky is presented. I will now briefly outline the method.
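Following (Górski 1994), one can define a scalar product of two functions $f_i$ and $g_i$ on a pixelised sphere as

$$\langle f g \rangle_{(\mathrm{full\,sky})} = \sum_{i \in \mathrm{pix}} f_i g_i. \tag{2.38}$$

Defining a vector $\mathbf{y}$ whose elements are the spherical harmonic functions $Y_{\ell m}$ for each pair $(\ell, m)$, one gets

$$\langle \mathbf{y} \mathbf{y}^T \rangle_{(\mathrm{full\,sky})} \approx \mathbf{I} \tag{2.39}$$

by the orthogonality of spherical harmonics (this is not an exact identity here since a pixelised sphere is considered). On the cut sky however, one gets

$$\langle \mathbf{y} \mathbf{y}^T \rangle_{(\mathrm{cut\,sky})} = \mathbf{W}. \tag{2.40}$$

One can then define a new set of harmonics as

$$\boldsymbol{\psi} = \boldsymbol{\Gamma} \mathbf{y}, \tag{2.41}$$

where $\boldsymbol{\Gamma}$ is the inverse $\mathbf{L}^{-1}$ of the factor of the Cholesky decomposition of the coupling matrix, $\mathbf{W} = \mathbf{L} \mathbf{L}^T$. With these new harmonics one can now define a new set of harmonic coefficients. Letting $\mathbf{a}$ be a vector of $a_{\ell m}$ components, one can define the vector $\mathbf{c}$ of new coefficients as

$$\mathbf{c} = \mathbf{L}^T \mathbf{a}, \tag{2.42}$$

whose correlation matrix $\langle \mathbf{c} \mathbf{c}^T \rangle$ is again diagonal.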
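The orthogonalization step of equations (2.40) and (2.41) can be sketched numerically. In the example below, the columns of `Y` are random orthonormal vectors standing in for pixelised spherical harmonics, and the mask is an arbitrary cut; both are assumptions made only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_modes = 400, 6

# Columns of Y: stand-ins for harmonics, orthonormal under the full-sky sum (2.38).
Y, _ = np.linalg.qr(rng.normal(size=(n_pix, n_modes)))

mask = np.ones(n_pix, dtype=bool)
mask[150:250] = False                  # hypothetical sky cut
Y_cut = Y[mask]

W = Y_cut.T @ Y_cut                    # coupling matrix <y y^T> on the cut sky
L = np.linalg.cholesky(W)              # W = L L^T with L lower triangular

# New harmonics psi = L^{-1} y, evaluated in every remaining pixel (one row each).
Psi = np.linalg.solve(L, Y_cut.T).T

gram_cut = Psi.T @ Psi                 # <psi psi^T> on the cut sky
```

On the cut sky `W` differs from the identity, while `gram_cut` is the identity to numerical precision, which is the property the new harmonics are constructed to have.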
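Estimating the power spectrum using the full MLM is clearly not feasible, so other approximate methods have to be sought. One method was developed for the MAP experiment by (Oh, Spergel, and Hinshaw 1999). By assuming uncorrelated axisymmetric noise, they estimate the maximum of the likelihood using conjugate gradient techniques. This takes of the order $N_{\mathrm{pix}}^2$ operations instead of the $N_{\mathrm{pix}}^3$ required for the exact likelihood solution. Their starting point is the second order Taylor expansion of the log-likelihood around the minimum at the estimate $\hat{C}_\ell$,

$$L \approx L[\hat{C}_\ell] + \sum_{\ell} \frac{\partial L}{\partial C_\ell}[\hat{C}_\ell] (C_\ell - \hat{C}_\ell) + \frac{1}{2} \sum_{\ell\ell'} \frac{\partial^2 L}{\partial C_\ell \partial C_{\ell'}}[\hat{C}_\ell] (C_\ell - \hat{C}_\ell)(C_{\ell'} - \hat{C}_{\ell'}), \tag{2.43}$$

where $[\hat{C}_\ell]$ means that the function is to be taken at the minimum where $C_\ell = \hat{C}_\ell$ for all $\ell$. The derivatives of the log-likelihood are

\begin{align}
\frac{\partial L}{\partial C_\ell} &= -\mathbf{d}^T \mathbf{C}^{-1} \mathbf{P}_\ell \mathbf{C}^{-1} \mathbf{d} + \mathrm{Tr}(\mathbf{C}^{-1} \mathbf{P}_\ell), \tag{2.44} \\
\frac{\partial^2 L}{\partial C_\ell \partial C_{\ell'}} &= 2 \mathbf{d}^T \mathbf{C}^{-1} \mathbf{P}_\ell \mathbf{C}^{-1} \mathbf{P}_{\ell'} \mathbf{C}^{-1} \mathbf{d} - \mathrm{Tr}(\mathbf{C}^{-1} \mathbf{P}_\ell \mathbf{C}^{-1} \mathbf{P}_{\ell'}), \tag{2.45} \\
\mathbf{P}_\ell &= \frac{\partial \mathbf{C}}{\partial C_\ell}. \tag{2.46}
\end{align}

The Fisher matrix (equation (2.29)) is half of the expectation value of this second derivative,

$$F_{\ell\ell'} = \frac{1}{2} \mathrm{Tr}(\mathbf{C}^{-1} \mathbf{P}_\ell \mathbf{C}^{-1} \mathbf{P}_{\ell'}). \tag{2.47}$$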
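The expressions (2.44) and (2.47) can be checked on a small toy problem. The sketch below compares the analytic gradient with a finite-difference derivative of equation (2.37) and evaluates the Fisher matrix. The pixel layout, the multipole range, the noise level and all function names are assumptions for illustration, and the explicit matrix inverse is only acceptable because the toy map is tiny.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(2)
n_pix, ells = 30, np.arange(2, 6)
v = rng.normal(size=(n_pix, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
cos_theta = np.clip(v @ v.T, -1.0, 1.0)

def P_matrix(ell):
    # P_l = dC/dC_l = (2l+1)/(4 pi) P_l(cos theta_ij), from equation (2.34)
    c = np.zeros(ell + 1)
    c[ell] = (2 * ell + 1) / (4 * np.pi)
    return legendre.legval(cos_theta, c)

P = np.array([P_matrix(l) for l in ells])
noise = 0.05 * np.eye(n_pix)               # toy white noise

def covariance(cl):
    return np.tensordot(cl, P, axes=1) + noise

def loglike(cl, d):
    C = covariance(cl)
    return d @ np.linalg.solve(C, d) + np.linalg.slogdet(C)[1]

def gradient(cl, d):
    # equation (2.44)
    Cinv = np.linalg.inv(covariance(cl))
    z = Cinv @ d
    return np.array([-z @ Pl @ z + np.trace(Cinv @ Pl) for Pl in P])

def fisher(cl):
    # equation (2.47)
    Cinv = np.linalg.inv(covariance(cl))
    A = np.array([Cinv @ Pl for Pl in P])
    return 0.5 * np.einsum('aij,bji->ab', A, A)

cl = np.array([1.0, 0.6, 0.4, 0.3])
d = rng.multivariate_normal(np.zeros(n_pix), covariance(cl))
eps = 1e-6
numeric = np.array([(loglike(cl + eps * e, d) - loglike(cl - eps * e, d)) / (2 * eps)
                    for e in np.eye(len(cl))])
analytic = gradient(cl, d)
F = fisher(cl)
```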
The approach in (Oh, Spergel, and Hinshaw 1999) is to minimize the log-likelihood by iteratively finding the zero point of the first derivative, using the second derivative approximated by its expectation value given by the Fisher matrix. Calculating the first and second derivatives of the log-likelihood involves inverting the correlation matrix $\mathbf{C}$, which needs to be avoided as discussed above. They solve this problem by means of the following techniques:

- They find the term $\mathbf{C}^{-1} \mathbf{d}$ iteratively using conjugate gradient techniques which do not require the inversion of $\mathbf{C}$. However a good guess for $\mathbf{C}^{-1}$ (a preconditioner) is needed for the iteration to converge. They find this by inverting a simplified form of the correlation matrix.
- The trace $\mathrm{Tr}(\mathbf{C}^{-1} \mathbf{P}_\ell)$ is found using the fact that
  $$\langle \mathbf{d}^T \mathbf{C}^{-1} \mathbf{P}_\ell \mathbf{C}^{-1} \mathbf{d} \rangle = \mathrm{Tr}(\mathbf{C}^{-1} \mathbf{P}_\ell). \tag{2.48}$$
  In this way, the trace term can be calculated using Monte Carlo simulations.
- The Fisher matrix (which is not needed exactly in order to converge to the minimum) is approximated by using the preconditioner for $\mathbf{C}^{-1}$ from the first step.
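The first of these techniques can be illustrated with a generic preconditioned conjugate gradient solver. This is a textbook sketch, not the implementation of (Oh, Spergel, and Hinshaw 1999): the toy matrix, the diagonal preconditioner and the function name `conjugate_gradient` are assumptions chosen for the example.

```python
import numpy as np

def conjugate_gradient(apply_C, d, apply_M_inv, tol=1e-10, max_iter=500):
    """Solve C z = d using only products C @ x and an approximate inverse M^{-1}."""
    z = np.zeros_like(d)
    r = d - apply_C(z)                 # residual
    y = apply_M_inv(r)                 # preconditioned residual
    p = y.copy()                       # search direction
    ry = r @ y
    for _ in range(max_iter):
        Cp = apply_C(p)
        alpha = ry / (p @ Cp)
        z += alpha * p
        r -= alpha * Cp
        if np.linalg.norm(r) <= tol * np.linalg.norm(d):
            break
        y = apply_M_inv(r)
        ry_new = r @ y
        p = y + (ry_new / ry) * p
        ry = ry_new
    return z

rng = np.random.default_rng(3)
n = 200
A = rng.normal(size=(n, 20))
# Toy "signal plus noise" matrix: low-rank part plus a positive diagonal.
C = A @ A.T / 20 + np.diag(rng.uniform(0.5, 2.0, size=n))
d = rng.normal(size=n)
diag_C = np.diag(C).copy()             # simplified form of C used as preconditioner

z = conjugate_gradient(lambda x: C @ x, d, lambda x: x / diag_C)
```

Each iteration costs one matrix-vector product, so the solve scales as $N^2$ per iteration rather than $N^3$.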
Another way to increase the speed of the likelihood estimation is by means of approximate likelihood expressions (Bond, Jaffe, and Knox 2000; Bartlett, Douspis, Blanchard, and Dour 2000). In (Bond, Jaffe, and Knox 2000) this is motivated by the fact that the likelihood is not a Gaussian function of the power spectrum $C_\ell$. For the very simplified scenario with full sky coverage and uniform white noise the likelihood can be written in terms of the coefficients $\mathcal{C}_\ell = \ell(\ell + 1) C_\ell / (2\pi)$ as

$$L = \sum_{\ell} (2\ell + 1) \left[ \ln(\mathcal{C}_\ell B_\ell^2 + \mathcal{N}_\ell) + \frac{\hat{\mathcal{C}}_\ell}{\mathcal{C}_\ell B_\ell^2 + \mathcal{N}_\ell} \right], \tag{2.49}$$

where $B_\ell$ accounts for the beam, $\mathcal{N}_\ell$ is the noise power spectrum defined in the same way as $\mathcal{C}_\ell$, and $\hat{\mathcal{C}}_\ell$ is the measured power spectrum. As expected this is not Gaussian in $\mathcal{C}_\ell$, which is clearly seen by the fact that the second derivative is not a constant in this quantity. In (Bond, Jaffe, and Knox 2000) they construct a new quantity $Z_\ell = \ln(\mathcal{C}_\ell + x_\ell)$ with $x_\ell = \mathcal{N}_\ell / B_\ell^2$, which is approximated to be normally distributed. The approximate likelihood then becomes

$$L = \sum_{\ell\ell'} (Z_\ell - \hat{Z}_\ell) M^{(Z)}_{\ell\ell'} (Z_{\ell'} - \hat{Z}_{\ell'}), \tag{2.50}$$

where $\hat{Z}_\ell$ is $Z_\ell$ evaluated for the measured power spectrum, $M^{(Z)}_{\ell\ell'} = (\mathcal{C}_\ell + x_\ell) M^{(\mathcal{C})}_{\ell\ell'} (\mathcal{C}_{\ell'} + x_{\ell'})$ and $M^{(\mathcal{C})}$ is the inverse covariance matrix. The idea is now to keep this simplified form of the likelihood for a more complicated case with partial sky coverage and non-uniform noise. In this case the parameter $x_\ell$ will have to be changed. Unfortunately the calculation of $x_\ell$ will in the general case scale as $N_{\mathrm{pix}}^3$.
A similar approximation was used by (Wandelt, Hivon, and Górski 2000). In that approach, the so-called pseudo-$C_\ell$ ($\tilde{C}_\ell$) were approximated to be Gaussian and uncorrelated and the corresponding likelihood ansatz was used. This method will be described in more detail and extended in chapter (4).
2.3.2 Quadratic Estimators
Some authors have avoided the ML estimation and instead attempted to use quadratic estimators. Quadratic estimators are estimators of the power spectrum $C_\ell$ which are quadratic in the data. In (Bond, Jaffe, and Knox 1998), the Taylor expansion of the likelihood was used to make a quadratic estimator which is calculated using iterations. Starting with an initial guess for the power spectrum, the correction at each step can be calculated using

$$\delta C_\ell = \frac{1}{2} \sum_{\ell'} F^{-1}_{\ell\ell'} \mathrm{Tr}\left[ (\mathbf{d} \mathbf{d}^T - \mathbf{C})(\mathbf{C}^{-1} \mathbf{P}_{\ell'} \mathbf{C}^{-1}) \right], \tag{2.51}$$

where $F_{\ell\ell'}$ is the Fisher matrix. As this is a Taylor expansion about the maximum-likelihood solution, the expression becomes more and more accurate the closer one is to the solution. For this reason the iteration converges to the maximum-likelihood solution. But clearly $N_{\mathrm{pix}}^3$ operations are needed to calculate this quantity. A similar quadratic estimator with optimal error bars was found by (Tegmark 1996; Tegmark 1997b).
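A single step of equation (2.51) can be sketched on a toy map. The setup below (random pixel directions, four multipoles, white noise, and the function name `quadratic_step`) is assumed purely for illustration. The first use shows a property that follows directly from equations (2.47) and (2.51): if $\mathbf{d}\mathbf{d}^T$ is replaced by its ensemble average, one step recovers the input spectrum exactly, whatever the initial guess.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(7)
n_pix, ells = 40, np.arange(2, 6)
v = rng.normal(size=(n_pix, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
cos_theta = np.clip(v @ v.T, -1.0, 1.0)

def P_matrix(ell):
    # P_l = dC/dC_l = (2l+1)/(4 pi) P_l(cos theta_ij)
    c = np.zeros(ell + 1)
    c[ell] = (2 * ell + 1) / (4 * np.pi)
    return legendre.legval(cos_theta, c)

P = np.array([P_matrix(l) for l in ells])
noise = 0.05 * np.eye(n_pix)               # toy white noise, assumed known

def covariance(cl):
    return np.tensordot(cl, P, axes=1) + noise

def quadratic_step(cl, data_outer):
    """Correction delta C_l of equation (2.51) for a given d d^T."""
    C = covariance(cl)
    Cinv = np.linalg.inv(C)                # O(N^3): fine only for a toy map
    A = np.array([Cinv @ Pl for Pl in P])  # C^{-1} P_l
    F = 0.5 * np.einsum('aij,bji->ab', A, A)
    rhs = np.array([np.trace((data_outer - C) @ Al @ Cinv) for Al in A])
    return 0.5 * np.linalg.solve(F, rhs)

cl_true = np.array([1.0, 0.6, 0.4, 0.3])
cl_guess = np.array([0.5, 0.5, 0.5, 0.5])

# Ensemble-averaged data matrix: the input spectrum is recovered in one step.
recovered = cl_guess + quadratic_step(cl_guess, covariance(cl_true))

# One step on a single simulated map (scatter around cl_true is expected).
d = rng.multivariate_normal(np.zeros(n_pix), covariance(cl_true))
cl_one_step = cl_guess + quadratic_step(cl_guess, np.outer(d, d))
```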
In (Dore, Knox, and Peel 2001) the quadratic method (equation (2.51)) was extended to yield a faster power spectrum estimation method. Their method involves splitting the map into several non-overlapping pieces having a size for which power spectrum estimation using the quadratic estimator (equation 2.51) is feasible. To obtain the power spectrum for large angular scales, the map is coarsened and the quadratic estimator is used on the coarsened map. In the end, all the different estimates are averaged in an optimal way.
Another quadratic estimator using correlation functions was suggested by (Szapudi et al. 2001) and further developed by (Szapudi, Prunet, and Colombi 2001). The idea is to calculate an unbiased quadratic estimator of the pixel-pixel correlation function in bins,

$$\xi(\cos\theta) = \sum_{ij} f_{ij} (d_i d_j - N_{ij}), \tag{2.52}$$

where $f_{ij}$ determines which pixel pairs belong to the bin (it can also be used for noise weighting). Then the power spectrum is obtained by integrating the correlation function. The error bars in the method are found using Monte Carlo simulations. They are found to be up to 10% larger than the optimal ML error bars. The method has been tested on uncorrelated noise, but it should in principle also work with noise correlations present.
Finally an unbiased quadratic estimator based on the pseudo-$C_\ell$ was presented in (Hivon et al. 2001). In this paper, the relation between $\tilde{C}_\ell$ and the full sky $C_\ell$ was derived and inverted, giving an equation for $C_\ell$ expressed by the $\tilde{C}_\ell$. These relations will be discussed further in chapter (4). In this method, the noise correlation matrix and the variance of the estimates were computed using Monte Carlo simulations. The method was tested on an asymmetric patch on the sky with correlated noise and the error bars turned out to be almost optimal.
2.3.3 Two New Methods for Power Spectrum Estimation
This thesis describes two new methods for fast power spectrum estimation. The main goal of a CMB experiment is usually the estimation of cosmological parameters, but the estimation of parameters from a CMB map is usually done with an intermediate step where the angular power spectrum is estimated. As discussed in (Bond, Jaffe, and Knox 1998; Bond, Efstathiou, and Tegmark 1997) there are several advantages to estimating the cosmological parameters from the power spectrum rather than directly from the map. A near-degeneracy of cosmological parameters makes it harder for search algorithms to find the minimum of the likelihood in cosmological parameter space. The power spectrum space is much simpler, as the different $C_\ell$ are only weakly dependent on each other. This makes it significantly simpler to search for the likelihood minimum in $C_\ell$ space. Once the power spectrum is found, it is easier to find the log-likelihood minimum of the cosmological parameters with the power spectrum as input data. Also, the joint analysis of data from several CMB experiments is easier using the power spectra from each experiment instead of the maps.
In chapter (3) a new method for extracting the power spectrum directly from the time stream of an experiment without doing the intermediate map-making step will be discussed. This method is developed for an ideal experiment (stationary noise, no foregrounds) which is scanning on rings. It is an exact likelihood maximization algorithm which takes advantage of the signal and noise symmetries in experiments like MAP and Planck which are scanning on rings. The scaling is $N^2$, where $N$ is the number of elements in the time stream (after averaging all the scans in each scan ring, which reduces the length of the time stream significantly). One of the advantages of the method is that it takes a general non-symmetric beam pattern and correlated noise into account at no extra cost.
In chapter (4) another method, which is an extension of the method in (Wandelt, Hivon, and Górski 2000), will be discussed. The pseudo-$C_\ell$ from a map which is multiplied with a general axisymmetric window are used as data in an ML estimation. The chapter describes the extension of Gabor Transforms to the sphere. Gabor Transforms or Windowed Fourier Transforms are just Fourier Transforms of data which have been multiplied with a window function to smooth the data at the edges or to increase S/N. The extension of the method to analyse several patches on the sky with different windows is discussed. The method is fast and will in principle allow analysis of huge data sets such as those coming from MAP and Planck. In chapter (5) the method is extended to polarisation.
Chapter 3
Fast Exact Power Spectrum
Analysis for a Special Type of
Scanning Strategies
In this chapter I will present work done under the supervision of Ben Wandelt. A new method to estimate the power spectrum from a CMB experiment using the maximum-likelihood method directly on the data time stream instead of the sky map will be presented. The method works for experiments scanning on rings (like MAP, Planck and Archeops) where the data time stream has symmetries which make exact likelihood analysis possible in $O(N^2)$ operations, $N$ being the number of elements in the time stream.

The method uses a technique presented in (Wandelt and Górski 2000) for fast convolution of two functions on the sphere. For this reason I will give a summary of this paper in section (3.1). Some of the techniques and notation in this paper are used in section (3.2) for the power spectrum estimation. The paper describes how one can simulate the time stream from an experiment with an asymmetric beam profile and far side lobes. The task of convolving a general beam profile with a given sky is sped up by doing the convolution in spherical harmonic space.

For experiments which are scanning on rings, the correlation between rings will only depend on the distance between the rings. This is true both for noise and signal, making the total correlation matrix for the time stream block circulant (Wandelt 2000). In Fourier space, such a matrix is block diagonal, allowing fast inversion of the matrix. In section (3.2) I will describe the method of extracting the angular power spectrum with the MLM exploiting this symmetry of the correlation matrix. This method naturally deals with general beam patterns and side lobes, which so far no other power spectrum estimation method can handle. An example will be presented. The method and its results are published in (Wandelt and Hansen 2000).
3.1 Fast Fourier Space Convolution
This section is a summary of the paper (Wandelt and Górski 2000). The results and definitions here will be used in the next section. In (Wandelt and Górski 2000) a fast method to make simulated time streams of a general CMB experiment is presented. To test the different steps of the data analysis pipeline of a CMB experiment one needs to simulate the experiment. Generally when the CMB telescope is looking in a certain direction, it does not only observe the CMB temperature in that given direction. There are also contributions from other directions. Usually one approximates this beam pattern to be a 2D Gaussian centered in the direction in which the telescope is pointing. In general however, this beam pattern will be different and often not axisymmetric, causing distortions of the CMB patterns. In addition, if the instrument allows diffraction of stray light into the detectors, the observed CMB temperature in one direction might have contributions from completely different directions. Because of high intensity radiation from the galactic plane (see section (2.2.2)) or reflected radiation from planets in the solar system, this can contaminate the CMB data and should be taken into account in the data analysis.
In the following, rotations and directions on the sky will be given by the Euler angles $(\Phi_2, \Theta, \Phi_1)$. These angles describe a right-handed rotation in a fixed coordinate system by the angles $\Phi_1$, $\Theta$ and $\Phi_2$ about the axes $z$, $y$ and $z$ respectively, in that order. As an example one can consider a point at the north pole. After a rotation $(0^\circ, 90^\circ, 90^\circ)$ the point will be in the direction of the $y$ axis.
Following (Wandelt and Górski 2000), one can consider two functions on the sphere, $b(\hat{n})$ and $s(\hat{n})$, which can be the beam profile (with all contributions including sidelobes) and the sky of a CMB experiment. The observed temperature $T(\Phi_2, \Theta, \Phi_1)$ in the direction that a beam centered at the north pole will have after the rotation $(\Phi_2, \Theta, \Phi_1)$ can be written

$$T(\Phi_2, \Theta, \Phi_1) = \int d\Omega_{\hat{n}} \left[ \hat{D}(\Phi_2, \Theta, \Phi_1) b \right]^*(\hat{n}) \, s(\hat{n}). \tag{3.1}$$

This is a convolution of the sky with the beam rotated into the correct position. The rotation operator $\hat{D}$ is described in appendix (A) and rotates the function by the given Euler angles. So given a sky and a beam profile centered at the north pole, this integral gives the observed temperature in a given direction. In a CMB experiment the beam will take a certain path on the sky as a function of time $t$. So the direction in which the beam is pointing is given as a function of time as $(\Phi_2(t), \Theta(t), \Phi_1(t))$. This is called a scan path.
Figure 3.1: The structure of a basic scan path described in the text. The scan rings are all at an angular distance $\theta_E$ from the center C and the angular radius of the rings is $\theta$. The angle $\phi_E$ is the azimuthal angle of the center of a given ring and $\phi$ gives the position in a specified ring.
To do the integral (equation (3.1)) is very time consuming. Say that the expansion of the sky and beam in spherical harmonics can be truncated at the multipoles $L_s$ and $L_b$ respectively. Then defining $L = \min(L_s, L_b)$, the scaling of the integral (equation (3.1)) is $O(L^2)$ operations for each position $(\Phi_2(t), \Theta(t), \Phi_1(t))$ of the beam. The temperature has to be evaluated at at least $O(L^3)$ such positions, giving a total scaling of the convolution of $O(L^5)$ operations.
The aim of (Wandelt and Górski 2000) is to increase the speed of this convolution by going to spherical harmonic space. Before doing this step the basic scan path should be introduced. This is the scan path of an experiment scanning on rings, such as the MAP and Planck satellites. This basic scan path is the basis of the fast convolution method and also of the power spectrum estimation technique presented in the next section.

The basic scan path consists of a ring of rings on the sphere. This is illustrated in figure (3.1). Given a certain centre axis (the centre axis intersects the sphere in the point C in the figure), the experiment scans on rings of angular radius $\theta$ with the centres situated at an angular distance $\theta_E$ from the point C. A given ring can be characterized by the azimuthal angle (about C) $\phi_E$ of its centre, and a certain point on the ring can be given by the angle $\phi$, which goes from 0 to $2\pi$ along the scanning ring. The angle $\phi$ is zero at the point on the ring which is furthest away from C and increases in the right-handed direction about the outward normal from the centre of the sphere. Finally, as a generalization of the basic scan path, an angle $\omega$ can describe a right-handed rotation of the beam itself about the normal from the centre of the sphere in the direction of the beam. For a certain experiment, $\theta_E$ and $\theta$ can be fixed so that the timestream is written as a function of the remaining angles as $T(\phi_E, \phi, \omega)$. As discussed in the next section this will not always be the case, but it will be assumed in the rest of the thesis.
With these definitions, the rotation in equation (3.1) can be factorised into two rotations,

$$\hat{D}(\Phi_2, \Theta, \Phi_1) = \hat{D}(\phi_E, \theta_E, 0) \, \hat{D}(\phi, \theta, \omega). \tag{3.2}$$

This can be thought of in the following way. With the beam pointing in the direction of the north pole, the beam is first rotated into its position in the scanning ring, such that the north pole is now the centre of the scanning ring. Then the centre of the scanning ring is moved into its correct position on the sphere.
A general function $s(\hat{n})$ on the sphere can be expanded in spherical harmonics as

$$s(\hat{n}) = \sum_{\ell=0}^{L_s} \sum_{m=-\ell}^{\ell} s_{\ell m} Y_{\ell m}(\hat{n}). \tag{3.3}$$
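With this expansion, equation (3.1) can be written

$$T(\phi_E, \phi, \omega) = \sum_{\ell m} \sum_{\ell' m'} s_{\ell m} \left[ \hat{D}(\phi_E, \theta_E, 0) \hat{D}(\phi, \theta, \omega) b \right]^*_{\ell' m'} \int d\Omega_{\hat{n}} \, Y_{\ell m}(\hat{n}) Y^*_{\ell' m'}(\hat{n}), \tag{3.4}$$

where the limits on the $\ell$ and $m$ sums are taken to be the ones in equation (3.3). By the orthogonality of spherical harmonics this can be written as

$$T(\phi_E, \phi, \omega) = \sum_{\ell m M M'} s_{\ell m} D^{\ell *}_{mM}(\phi_E, \theta_E, 0) D^{\ell *}_{MM'}(\phi, \theta, \omega) b^*_{\ell M'}, \tag{3.5}$$

where the rotation operators are written in spherical harmonic space as described in appendix (A). As given in the appendix, the spherical harmonic rotation coefficients can be written as

$$D^{\ell}_{mm'}(\Phi_2, \Theta, \Phi_1) = e^{-im\Phi_2} d^{\ell}_{mm'}(\Theta) e^{-im'\Phi_1}, \tag{3.6}$$

where the $d^{\ell}_{mm'}(\Theta)$ factors can be calculated by recursion. This gives

$$T(\phi_E, \phi, \omega) = \sum_{\ell m M M'} s_{\ell m} b^*_{\ell M'} d^{\ell}_{mM}(\theta_E) d^{\ell}_{MM'}(\theta) e^{i(m\phi_E + M\phi + M'\omega)}. \tag{3.7}$$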
Now defining the 3D Fourier transform of the time stream as

$$T_{mm'm''} = \frac{1}{(2\pi)^3} \int_0^{2\pi} d\phi_E \, d\phi \, d\omega \, T(\phi_E, \phi, \omega) e^{-i(m\phi_E + m'\phi + m''\omega)}, \tag{3.8}$$

one gets, after doing the summations (which give delta functions),

$$T_{mm'm''} = \sum_{\ell} s_{\ell m} d^{\ell}_{mm'}(\theta_E) d^{\ell}_{m'm''}(\theta) b^*_{\ell m''}. \tag{3.9}$$

The timestream is easily found from the inverse Fourier transformation, giving

$$T(\phi_E, \phi, \omega) = \sum_{m, m', m'' = -L}^{L} T_{mm'm''} e^{i(m\phi_E + m'\phi + m''\omega)}. \tag{3.10}$$

Using these formulae, the convolution of a sky with a general beam can be done in $O(L^4)$ operations (actually even less, as described in (Wandelt and Górski 2000)).
In the work in the next section, the basic scan path will be assumed. The basic scan path is the path with $\omega = 0$. In this case, the Fourier transform can be written in terms of $m$ and $m'$. The variable $m''$ has to be summed over,

\begin{align}
T_{mm'} &= \sum_{m''} T_{mm'm''} \tag{3.11} \\
&= \sum_{\ell} s_{\ell m} d^{\ell}_{mm'}(\theta_E) X_{\ell m'}(\theta), \tag{3.12}
\end{align}

where

$$X_{\ell m} = \sum_{M} d^{\ell}_{mM}(\theta) b^*_{\ell M}. \tag{3.13}$$

The reason for the sum over $m''$ is clear from equation (3.10),

$$T(\phi_E, \phi, 0) = \sum_{m, m', m'' = -L}^{L} T_{mm'm''} e^{i(m\phi_E + m'\phi + m'' \cdot 0)}. \tag{3.14}$$

This Fourier transformed time stream for a basic scan path will be used in the next section for fast power spectrum estimation from the time stream.
3.2 Power Spectrum Estimation Using Scanning
Rings
As discussed in section (2.3.1), the problem of calculating the full likelihood of a CMB sky map is the inversion of the pixel correlation matrix. Because the CMB pixel correlation matrix has the nice property that it is only dependent on the angular distance between two pixels (assuming statistical isotropy), the correlation matrix in spherical harmonic space is diagonal. But the noise has no symmetries on the sphere and for this reason the correlation matrix does not have any properties that make it easy to invert.

On the time stream however, the noise has a very nice symmetry. As discussed in section (2.2.1), the correlation of the noise between two elements $d_i$ and $d_j$ in the timestream is only dependent on the distance $|i - j|$ (along the timestream) between the two elements. In Fourier space this makes the correlation matrix diagonal. But the signal does not in general have any symmetries on the timestream.
Figure 3.2: The figure shows the structure of a vector $d_{pr}$ as a column vector and a correlation matrix $C_{pr,p'r'}$ for a data set with 4 rings and 4 pixels per ring. The purpose of the figure is to show the position of the elements in a vector and a matrix with a ring-pixel index. The ring number is shown on the right side of the vector and the pixel number in each ring is shown inside the vector. For the matrix, the ring numbers are shown on the outside (above and left) and the pixel numbers are only shown in the upper left block.
This fact changes for an experiment scanning on rings using a basic scan path as described in the previous section. When one assumes that the angular distance between the centres of two adjacent rings is the same and one uses the coadded rings (when an experiment scans the same ring several times, the coadded ring is just the average of these scans) as input data vector in the likelihood, it turns out that some symmetries in both the noise and signal correlation matrices are kept.

When using the coadded rings as data vector $\mathbf{d}$, the noise correlation between two elements in the vector, $d_i$ and $d_j$, is no longer only dependent on the distance $|i - j|$. But when writing this data vector indexed with pixel number $p$ (or $\phi$ in the continuous case discussed in the previous section) in ring number $r$ ($\phi_E$) as $d_{pr}$ (see figure (3.2)), one can see that there still is some symmetry present. The correlation between two rings $r$ and $r'$ will of course only depend on the distance between the rings $|r - r'|$. The signal correlation between two elements in the timestream is only dependent on the distance on the sphere, but because of the geometrical positioning of the rings on the sphere, also the correlation between $d_{pr}$ and $d_{p'r'}$ for the signal will depend on $|r - r'|$. This can also easily be seen by studying figure (3.1). Consider two positions $p$ and $p'$ on the rings $r$ and $r'$ and the same positions $p$ and $p'$ on the rings $R$ and $R'$. If the distance between the two pairs of rings is the same, $|r - r'| = |R - R'|$, then the angular distance on the sphere between the points $p$ and $p'$ on these two different pairs of rings is the same.
Figure 3.3: An example of an experiment scanning on 4 rings with 4 sample points (shown as black dots) on each ring. The big numbers indicate the ring numbers $r$ and the smaller numbers indicate the number of the sample point $p$ in each ring.
The fact that the correlation $\langle d_{pr} d_{p'r'} \rangle$ between two elements in the ringset is only dependent on the distance between rings $|r - r'|$, both for the signal and the noise, simplifies the structure of the correlation matrix. The correlation matrix will be block circulant, meaning that each row of blocks will be equal to the previous row of blocks shifted one block to the right. In addition, a block circulant matrix of this kind is block symmetric, meaning that the matrix is symmetric when the blocks are considered as the elements of the matrix. This is illustrated in figures (3.3) and (3.4). In this example, one has 4 rings with 4 sample points on each (figure (3.3)). The $4 \times 4$ correlation matrix between points $p$ and $p'$ for ring 1 is called $A$. One has $A_{pp'} = \langle d_{1p} d_{1p'} \rangle$. The correlation matrix between a point $p$ on ring 1 and a point $p'$ on ring 2 is $B_{pp'} = \langle d_{1p} d_{2p'} \rangle$. In the same way the correlations between rings 1 and 3 are given by the matrix $C$. Obviously the correlations between rings 1 and 4 will be the same as for rings 1 and 2, as the absolute value of the distance between the rings $|r - r'|$ is the same. The full $16 \times 16$ correlation matrix $\langle d_{pr} d_{p'r'} \rangle$ for this case is shown in figure (3.4) (see also figure (3.2)). One clearly sees the block circulant structure. It has to be noted that this example has very few rings and is therefore unrealistic. With this small number of rings, the approximation that the noise correlations are dependent on the absolute value of the distance between rings breaks down. This is because the correlations between the first and the last ring cannot be the same as between the first and the second. The noise is not circulant. But as mentioned in section (2.2.1), this is a good approximation when the number of rings is large, as in a real CMB experiment.

The block circulant structure of the correlation matrix is what makes the likelihood evaluation fast. A block circulant matrix is block diagonal in Fourier space and one can therefore find the determinant and inverse faster. This will be formalized in the next section.
$$\begin{pmatrix} A & B & C & B \\ B & A & B & C \\ C & B & A & B \\ B & C & B & A \end{pmatrix}$$

Figure 3.4: The correlation matrix $\langle d_{pr} d_{p'r'} \rangle$ for the ringset shown in figure (3.3). Here $A$ is the $4 \times 4$ correlation matrix for points $p$ and $p'$ within the same ring, $A_{pp'} = \langle d_{pr} d_{p'r} \rangle$, and $B$ is the $4 \times 4$ correlation matrix between a point $p$ on ring $r$ and a point $p'$ on ring $r \pm 1$. Finally $C$ is the correlation matrix between a point $p$ on ring $r$ and a point $p'$ on ring $r \pm 2$.
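The structure of figure (3.4) can be reproduced numerically. The following sketch uses random symmetric blocks as stand-ins for $A$, $B$ and $C$ (an assumption made only for illustration) and shows that a discrete Fourier transform over the ring index alone makes the matrix block diagonal.

```python
import numpy as np

rng = np.random.default_rng(4)
n_rings, n_pix = 4, 4

def random_symmetric(n):
    M = rng.normal(size=(n, n))
    return M + M.T

A, B, C = (random_symmetric(n_pix) for _ in range(3))
blocks = [A, B, C, B]                  # block for cyclic ring separation 0, 1, 2, 3
full = np.block([[blocks[(col - row) % n_rings] for col in range(n_rings)]
                 for row in range(n_rings)])

# Unitary DFT over the ring index only, identity over the pixel index.
idx = np.arange(n_rings)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / n_rings) / np.sqrt(n_rings)
Ft = np.kron(F, np.eye(n_pix))
transformed = (Ft.conj().T @ full @ Ft).reshape(n_rings, n_pix, n_rings, n_pix)

# Expected diagonal blocks: DFT of the block sequence over the separation.
diag_blocks = np.fft.fft(np.array(blocks), axis=0)

off_diag_max = max(np.abs(transformed[k, :, kp, :]).max()
                   for k in range(n_rings) for kp in range(n_rings) if k != kp)
logdet_full = np.linalg.slogdet(full)[1]
logdet_blocks = sum(np.linalg.slogdet(b)[1] for b in diag_blocks)
```

The determinant of the full $16 \times 16$ matrix is then obtained from four $4 \times 4$ determinants, which is the source of the speed-up exploited below.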
3.2.1 Theory
As a generalization of the previous example, one can consider an experiment which has $N_r$ rings, $N_R$ pixels per ring and scans each ring $N_c$ times. To go from the time stream $d_n$ with $N_{\mathrm{TOD}} = N_r N_R N_c$ elements to the coadded ringset $d_{pr}$ with $N_r N_R$ elements one has to average the $N_c$ scans of each ring. This can be written as

$$d_{pr} = \sum_{n=0}^{N_{\mathrm{TOD}} - 1} A_{pr,n} d_n. \tag{3.15}$$

Here the coadding matrix $A_{pr,n}$ selects which elements of the whole timestream belong to ring $r$ and pixel $p$. The index $n$ can be written in terms of three indices $x$, $y$ and $z$ as $n = N_R N_c x + N_R y + z$. Here $x$ is the ring number going from 0 to $N_r - 1$, $y$ is the scan number for a given ring going from 0 to $N_c - 1$ and $z$ is the pixel number in the given ring, being in the interval $[0, N_R - 1]$. With this indexing equation (3.15) can be written as

$$d_{pr} = \sum_{x=0}^{N_r - 1} \sum_{y=0}^{N_c - 1} \sum_{z=0}^{N_R - 1} A_{pr,xyz} d_{xyz}. \tag{3.16}$$

It is now obvious that the coadding matrix can be written as

$$A_{pr,xyz} = \frac{1}{N_c} \delta_{rx} \delta_{zp}, \tag{3.17}$$

where the prefactor makes the division by the number of scans per ring after the measurements from all of the scans have been added together. Using this expression, equation (3.16) can be written

$$d_{pr} = \frac{1}{N_c} \sum_{y=0}^{N_c - 1} d_{ryp}. \tag{3.18}$$
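Equations (3.15) to (3.18) amount to a reshape and an average. The sketch below, with arbitrarily chosen toy dimensions, builds the explicit coadding matrix of equation (3.17) and compares it with the direct average of equation (3.18); the variable names are illustrative only.

```python
import numpy as np

n_rings, n_scans, n_pix = 5, 3, 8              # N_r, N_c, N_R (toy values)
rng = np.random.default_rng(5)
tod = rng.normal(size=n_rings * n_scans * n_pix)   # d_n with n = N_R N_c x + N_R y + z

# Equation (3.18): reshape to d_{xyz} and average over the scan index y.
d_xyz = tod.reshape(n_rings, n_scans, n_pix)
ringset = d_xyz.mean(axis=1)                   # indexed [r, p]

# Equation (3.17): explicit coadding matrix A_{pr,xyz} = delta_rx delta_zp / N_c.
A = np.zeros((n_rings * n_pix, tod.size))
for x in range(n_rings):
    for y in range(n_scans):
        for z in range(n_pix):
            n = n_pix * n_scans * x + n_pix * y + z
            A[x * n_pix + z, n] = 1.0 / n_scans
ringset_from_matrix = (A @ tod).reshape(n_rings, n_pix)
```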
In the likelihood estimation, the signal and noise correlation matrices will be needed in Fourier space. Because of this, and since the signal correlation matrix is simpler to calculate directly in Fourier space, the full expressions will be derived directly in this space. However, in pixel space it is easier to see how the symmetries work to give the simple form of the correlation matrices. For this reason I will first outline the structure of the correlation matrices in pixel space.
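The CMB temperature on pixel $p$ in ring $r$ can be written as $d_{pr} = T(\theta, \phi)$, where $\theta$ and $\phi$ are polar coordinates in a coordinate system in which C (see figure (3.1)) is the north pole. For a given pixel and ring $pr$ these angles can be written as $\theta_{pr} = \theta_E + \Delta\theta_p$ and $\phi_{pr} = \phi_r + \Delta\phi_p$. Here $\theta_E$ is as before the polar coordinate of the centre of the rings and $\Delta\theta_p$ is the offset from the centre depending on the pixel $p$. The coordinate $\theta$ is clearly independent of the ring number and is written $\theta_p$ in the following. The azimuthal coordinate $\phi$ depends on the angle $\phi_r$, which is the azimuthal position of the centre of ring $r$, and $\Delta\phi_p$, which is the offset depending only on the pixel $p$ in the ring. One can now write

\begin{align}
C^S_{pr,p'r'} &= \langle d^S_{pr} d^S_{p'r'} \rangle, \tag{3.19} \\
&= \langle T(\theta_p, \phi_{pr}) T(\theta_{p'}, \phi_{p'r'}) \rangle, \tag{3.20} \\
&= \sum_{\ell m} \sum_{\ell' m'} \langle a_{\ell m} a^*_{\ell' m'} \rangle Y_{\ell m}(\theta_p, \phi_{pr}) Y^*_{\ell' m'}(\theta_{p'}, \phi_{p'r'}), \tag{3.21} \\
&= \sum_{\ell} \frac{2\ell + 1}{4\pi} C_\ell P_\ell(\cos\chi), \tag{3.22}
\end{align}

where $\cos\chi = \cos\theta_p \cos\theta_{p'} + \sin\theta_p \sin\theta_{p'} \cos(\phi_r - \phi_{r'} + \Delta\phi_p - \Delta\phi_{p'})$ and $\chi$ is the angle between the two points $pr$ and $p'r'$. The superscript $S$ means signal. Clearly $\chi$, and thereby $C^S_{pr,p'r'}$, is a function $f(|r - r'|, p, p')$ of the distance between the rings. The matrix $C^S_{pr,p'r'}$ is block circulant with blocks of size $N_R \times N_R$. To show the same for the noise correlation matrix one can use that $\langle d_n d_{n'} \rangle = c(|n - n'|)$. One then gets

\begin{align}
C^N_{pr,p'r'} &= \langle d^N_{pr} d^N_{p'r'} \rangle \notag \\
&= \frac{1}{N_c^2} \sum_{yy'} \langle d^N_{ryp} d^N_{r'y'p'} \rangle \notag \\
&= \frac{1}{N_c^2} \sum_{yy'} c(|N_R N_c (r - r') + N_R (y - y') + p - p'|) \tag{3.23} \\
&= \frac{1}{N_c^2} \sum_{yy'} c(|r - r'|, p, p'), \notag
\end{align}

where the superscript $N$ means noise. So the correlation matrix $C_{pr,p'r'}$ is block circulant both for signal and noise.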
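It is well known that the Fourier transform of a circulant matrix is diagonal. I will now show this generally. By the Fourier transform of an $N \times N$ matrix $A$ I mean

$$\tilde{A} = F^\dagger A F, \tag{3.24}$$

where $F$ is an $N \times N$ matrix with elements

$$F_{jk} = \frac{1}{\sqrt{N}} e^{-2\pi i jk/N}, \tag{3.25}$$

and the dagger means Hermitian conjugate. The elements of the Fourier transformed matrix can be written

\begin{align}
\tilde{A}_{kk'} &= \frac{1}{N} \sum_{j=0}^{N-1} \sum_{j'=0}^{N-1} e^{2\pi i jk/N} e^{-2\pi i j'k'/N} A_{jj'}, \tag{3.26} \\
&= \frac{1}{N} \sum_{jj'} e^{2\pi i (kj - k'j')/N} A_{jj'}, \notag
\end{align}

where the last sum has the same limits as the upper sum. For a circulant matrix $A_{jj'} = A_{|j - j'|}$ and one has

\begin{align}
\tilde{A}_{kk'} &= \frac{1}{N} \sum_{j=0}^{N-1} \left[ \sum_{j'=0}^{j} e^{2\pi i (kj - k'j')/N} A_{|j - j'|} + \sum_{j'=j+1}^{N-1} e^{2\pi i (kj - k'j')/N} A_{|j - j'|} \right], \tag{3.27} \\
&= \frac{1}{N} \sum_{j=0}^{N-1} \left[ \sum_{\Delta=0}^{j} e^{2\pi i j(k - k')/N} A_{|\Delta|} e^{2\pi i k'\Delta/N} + \sum_{\Delta=j-(N-1)}^{-1} e^{2\pi i j(k - k')/N} A_{|\Delta|} e^{2\pi i k'\Delta/N} \right], \notag
\end{align}

where the substitution $\Delta = j - j'$ has been made. Now setting $\Delta' = \Delta + N$ in the last sum, this can be written as

$$\tilde{A}_{kk'} = \frac{1}{N} \sum_{j=0}^{N-1} \left[ \sum_{\Delta=0}^{j} e^{2\pi i j(k - k')/N} A_{\Delta} e^{2\pi i k'\Delta/N} + \sum_{\Delta'=j+1}^{N-1} e^{2\pi i j(k - k')/N} A_{(-\Delta' + N)} e^{2\pi i k'\Delta'/N} \right]. \tag{3.28}$$

Finally, defining $A_{(-\Delta + N)} = A_{\Delta}$, one can write

\begin{align}
\tilde{A}_{kk'} &= \frac{1}{N} \sum_{\Delta=0}^{N-1} A_{\Delta} e^{2\pi i k'\Delta/N} \underbrace{\sum_{j=0}^{N-1} e^{2\pi i j(k - k')/N}}_{N \delta_{kk'}} \tag{3.29} \\
&= \delta_{kk'} \sum_{\Delta=0}^{N-1} A_{\Delta} e^{2\pi i k\Delta/N}, \notag
\end{align}

which shows that the Fourier transform of a circulant matrix is diagonal. In the same way the Fourier transform of a block circulant matrix is block diagonal. Hence the Fourier transform of the correlation matrix $C_{pr,p'r'}$ is block diagonal and the determinant and inverse can be found fast. This was shown in pixel space, where the reason can be easily understood physically. Now the derivation of the full expressions which will be needed for likelihood estimation will be shown.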
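The result of equation (3.29) is easy to verify numerically. The sketch below constructs a symmetric circulant matrix from arbitrary random entries (an assumption for illustration), applies the transform of equations (3.24) and (3.25), and compares the diagonal with the discrete Fourier transform of the sequence $A_\Delta$.

```python
import numpy as np

N = 8
rng = np.random.default_rng(6)

# Symmetric circulant sequence: a[delta] = a[N - delta]
half = rng.normal(size=N // 2 + 1)
a = np.concatenate([half, half[-2:0:-1]])

idx = np.arange(N)
A = a[np.abs(np.subtract.outer(idx, idx))]     # A_{jj'} = A_{|j - j'|}

# F_{jk} = exp(-2 pi i jk / N) / sqrt(N), equation (3.25)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)
A_tilde = F.conj().T @ A @ F                   # equation (3.24)

off_diagonal = A_tilde - np.diag(np.diag(A_tilde))
expected_diagonal = np.fft.fft(a)              # real, because a is symmetric
```

Since `a` is symmetric, the sign convention in the exponent of equation (3.29) does not affect the diagonal, which is why `np.fft.fft` can be used directly for the comparison.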
Equation (3.8) defines the 2D Fourier transform of $d_{pr}$ ($\omega = 0$). In discrete form the equation can be written

$$d_{mm'} = \frac{1}{N_r N_R} \sum_{r=0}^{N_r - 1} \sum_{p=0}^{N_R - 1} e^{-2\pi i p m'/N_R} e^{-2\pi i r m/N_r} d_{pr}. \tag{3.30}$$
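For the signal correlation matrix, expression (3.12) can be used by replacing $s_{\ell m} = a_{\ell m}$. One then gets for the correlation matrix in Fourier space $C^S_{mm',MM'}$
$$
\begin{aligned}
C^S_{mm',MM'} &= \langle d^S_{mm'}\, d^{S*}_{MM'} \rangle \qquad (3.31) \\
&= \sum_{\ell\ell'} \langle a_{\ell m}\, a^*_{\ell' M} \rangle\, d^{\ell}_{mm'}(\theta_E)\, d^{\ell'}_{MM'}(\theta_E)\, X_{\ell m'}(\theta)\, X^*_{\ell' M'}(\theta) \qquad (3.32) \\
&= \delta_{mM} \sum_{\ell} C_\ell\, d^{\ell}_{mm'}(\theta_E)\, d^{\ell}_{MM'}(\theta_E)\, X_{\ell m'}(\theta)\, X^*_{\ell M'}(\theta). \qquad (3.33)
\end{aligned}
$$
The delta function $\delta_{mM}$ shows that the signal correlation matrix is block diagonal in Fourier space. The diagonal blocks can be calculated as a sum over the power spectrum, rotation matrices which can be calculated by recursion, and precomputed $X_{\ell m}$ factors.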
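The noise matrix is a bit more complicated. As discussed in section (2.2.1), the noise can be described by a power spectrum $P(k)$ given as $P(k) = \langle d_k d^*_k \rangle$, where $d_k$ are the 1D Fourier coefficients of the noise,
$$
d^N_k = \sum_n d^N_n\, e^{-2\pi i kn/N}, \qquad (3.34)
$$
where $N = N_{TOD}$. The reverse transformation is
$$
d^N_n = \frac{1}{N} \sum_k d^N_k\, e^{2\pi i kn/N}, \qquad (3.35)
$$
which gives
$$
\begin{aligned}
\langle d^N_n\, d^N_{n'} \rangle &= \frac{1}{N^2} \sum_{kk'} \langle d^N_k\, d^{N*}_{k'} \rangle\, e^{2\pi i (kn - k'n')/N}, \qquad (3.36) \\
&= \frac{1}{N} \sum_k P(k)\, e^{2\pi i k(n - n')/N}, \qquad (3.37)
\end{aligned}
$$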
which follows from the fact that the noise is (assumed to be) circulant. The correlation $C^N_{pr,p'r'} = \langle d^N_{pr}\, d^N_{p'r'} \rangle$ is easiest to calculate in pixel space. For this reason a Fourier transform will be applied to $C_{pr,p'r'}$ to get the correlation matrix in Fourier space $C^N_{mm',MM'}$. Using equation (3.18) one has that
$$
\langle d_{pr}\, d_{p'r'} \rangle = \frac{1}{N_c^2} \sum_{yy'} \langle d_{ryp}\, d_{r'y'p'} \rangle. \qquad (3.38)
$$
Using equation (3.37) one can write this as
$$
\begin{aligned}
C^N_{pr,p'r'} &= \frac{1}{N_c^2} \sum_{k=0}^{N-1} P(k) \sum_{y,y'=0}^{N_c-1} e^{2\pi i (k/N) \left( N_R N_c (r - r') + N_R (y - y') + (p - p') \right)} \qquad (3.39) \\
&\equiv C^N(r - r', p - p'). \qquad (3.40)
\end{aligned}
$$
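The Fourier transform is found by the use of equation (3.30),
$$
\begin{aligned}
C^N_{mm',MM'} &= \frac{1}{N_r^2 N_c^2} \sum_{p,p'=0}^{N_R-1} \sum_{r,r'=0}^{N_r-1} C^N(r - r', p - p')\, e^{-2\pi i (rm - r'M)/N_r}\, e^{-2\pi i (pm' - p'M')/N_R} \qquad (3.41) \\
&= \frac{1}{N_r^2 N_c^2} \sum_{p,p'=0}^{N_R-1} \sum_{r=0}^{N_r-1} \sum_{\Delta=r}^{r-(N_r-1)} C^N(\Delta, p - p')\, e^{-2\pi i r(m - M)/N_r}\, e^{-2\pi i M\Delta/N_r}\, e^{-2\pi i (pm' - p'M')/N_R} \qquad (3.42) \\
&= \frac{1}{N_c^2 N_r}\, \delta_{mM} \sum_{p,p'=0}^{N_R-1} e^{-2\pi i (m'p - M'p')/N_R} \sum_{\Delta=0}^{N_r-1} e^{-2\pi i m\Delta/N_r}\, C^N(\Delta, p - p'), \qquad (3.43)
\end{aligned}
$$
where $\Delta = r - r'$ and the last transition follows from the fact that $C^N(r - r' + mN_r, p - p') = C^N(r - r', p - p')$ where $m$ is an integer.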
Before embarking on the likelihood estimation, one practical issue has to be mentioned. When only parts of the sky are observed one does not have enough data to estimate the whole power spectrum multipole by multipole. Instead the power spectrum is estimated in $N_b$ bins of size $\Delta_b$ such that $C_\ell = C^B_b$ when $\ell_b \le \ell < \ell_{b+1}$, where $\ell_{b+1} = \ell_b + \Delta_b$.
The likelihood can be written in the form given in equation (2.26)
$$
L(C^B_b) = \frac{e^{-\frac{1}{2} d^\dagger C^{-1}(C^B_b)\, d}}{\sqrt{2\pi \det C(C^B_b)}}. \qquad (3.44)
$$
The data vector $d$ here is the Fourier transformed timestream with elements $d_{mm'}$ and the correlation matrix is the sum of the signal covariance matrix $C^S_{mm',MM'}$ in equation (3.33) and the noise correlation matrix $C^N_{mm',MM'}$ in equation (3.43).
If $N$ is the number of elements in $d$ (not the same $N$ as above), which is $N = N_r N_R$, then because of the block diagonal structure of $C_{mm',MM'}$ the correlation matrix can be inverted in $N_R^3 N_r$ operations instead of $N^3$. If the number of rings is equal to the number of pixels per ring, $N_r = N_R$, this means that the inversion of the correlation matrix takes $N^2$ instead of $N^3$ operations. This is also the scaling of the determinant evaluation. This is easily seen: the inverse of a block diagonal matrix is just a block diagonal matrix containing the inverted blocks. To invert each $N_R \times N_R$ block takes $N_R^3$ operations and there are $N_r$ blocks to invert. The determinant of a block diagonal matrix is the product of the determinants of the individual blocks. Finding the determinant of an $N_R \times N_R$ matrix takes $N_R^3$ operations and there are $N_r$ blocks, giving the same scaling for inversion and determinant evaluation.
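The block-wise evaluation can be sketched as follows. This is an illustrative example only, assuming NumPy, real symmetric positive definite blocks (the matrices in the text are complex Hermitian) and small random test data; `block_solve_and_logdet` is a hypothetical helper name.

```python
import numpy as np

def block_solve_and_logdet(blocks, d):
    """Return C^{-1} d and log det C for a block diagonal, positive definite C.

    blocks: array of shape (n_blocks, n, n) with the diagonal blocks
    d:      array of shape (n_blocks, n), the data vector split per block
    """
    x = np.empty_like(d)
    logdet = 0.0
    for b, (block, d_b) in enumerate(zip(blocks, d)):
        chol = np.linalg.cholesky(block)          # O(n^3) work per block
        y = np.linalg.solve(chol, d_b)
        x[b] = np.linalg.solve(chol.T, y)         # x_b = block^{-1} d_b
        logdet += 2.0 * np.log(np.diag(chol)).sum()
    return x, logdet

rng = np.random.default_rng(0)
n_blocks, n = 6, 5                                 # small stand-ins for N_r and N_R
raw = rng.normal(size=(n_blocks, n, n))
blocks = raw @ raw.transpose(0, 2, 1) + n * np.eye(n)
d = rng.normal(size=(n_blocks, n))

x, logdet = block_solve_and_logdet(blocks, d)

# Cross-check against the dense matrix, which costs O((n_blocks * n)^3).
full = np.zeros((n_blocks * n, n_blocks * n))
for b in range(n_blocks):
    full[b * n:(b + 1) * n, b * n:(b + 1) * n] = blocks[b]
x_full = np.linalg.solve(full, d.ravel())
sign, logdet_full = np.linalg.slogdet(full)
```

Only the products $C^{-1}d$ and $\log\det C$ enter the likelihood, so the explicit inverse of each block is never needed in this sketch.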
The inversion and determinant evaluation are fast, but one has to check the operation count for the calculation of the correlation matrices themselves. From expression (3.33) the calculation of the signal correlation matrix is just a sum over $N_b$ bins for each of the $N_R^2 N_r$ elements of the block diagonal matrix. The calculation of the rotation matrices $d^\ell_{mm'}$ is also fast using recursion or precalculation. The noise matrix can be calculated using 3 FFTs (see equation (3.43)) scaling as $(N_R \log N_R)^2 N_r \log N_r$. The precalculation of the function $C^N(r - r', p - p')$ takes $N \log N$ operations for the FFT in equation (3.39) times $N_r N_R$ for each element.
To quickly find the maximum of the likelihood (3.44) with respect to the binned power spectrum $C^B_b$ one needs the derivative. In practice the log-likelihood
$$
\mathcal{L}(C^B_b) = -2 \log L(C^B_b) = d^\dagger C^{-1}(C^B_b)\, d + \log \det C(C^B_b) \qquad (3.45)
$$
is minimized (up to an additive constant) and its derivative is given by equation (2.44). The operations involved in the derivative calculations are fast. The first term only needs multiplications of block diagonal matrices with vectors, which takes $N_R^2 N_r$ operations. The last term involves finding the trace of a matrix product of two block diagonal matrices, also taking $N_R^2 N_r$ operations.
A final issue involves finding the error bars on the estimate. According to section (2.3.1), the error bars can be found by taking the inverse of the Fisher matrix. Unfortunately the evaluation of the Fisher matrix scales almost as $N^3$. This is seen from equation (2.47). Multiplication of the block diagonal matrices takes $O(N_R^3 N_r)$ operations. These matrix multiplications have to be done for each of the $N_b^2$ elements in the Fisher matrix. If $N_R = N_r$ then the operation count is $O(N^2 N_b^2)$, which could be almost $N^3$ for many bins. Another approach would be to explore the likelihood surface around the maximum. By finding where the likelihood has fallen by a certain ratio compared to its value at the maximum, approximate confidence regions can be found. However, each likelihood evaluation takes $N^2$ operations and this has to be done for each bin, giving $N^2 N_b$ operations or $N^{5/2}$ for many bins. To find the full covariance of the estimate this way would scale as $N^2 N_b^2$. A faster way would be to approximate the Fisher matrix (which is the average of the second derivative of the log-likelihood at the maximum) by the numerical second derivative at the likelihood maximum. Evaluating the gradient of the likelihood takes only $O(N^2)$ operations. The $i$-th row of the approximate Fisher matrix, whose inverse is the covariance matrix of the estimates, can be found by taking the numerical derivative of the gradient with respect to the $i$-th bin, which then also takes $O(N^2)$ operations. Finding the whole covariance matrix of the estimates then takes $O(N^2 N_b)$ operations.
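The last approach can be sketched in a few lines. The example below is an illustration under stated assumptions, not the code used for the results in this chapter: it assumes NumPy, a central finite difference with a hand-picked step, and a toy quadratic negative log-likelihood whose gradient is known analytically. For $\mathcal{L} = -2 \log L$ as in equation (3.45) the resulting matrix would have to be halved.

```python
import numpy as np

def hessian_from_gradient(grad, x_max, step=1e-4):
    """Approximate the Hessian at x_max by central differences of the gradient.

    Row i needs two gradient evaluations, so the full matrix costs 2 * n_bins
    gradient evaluations.
    """
    n = x_max.size
    hess = np.empty((n, n))
    for i in range(n):
        shift = np.zeros(n)
        shift[i] = step
        hess[i] = (grad(x_max + shift) - grad(x_max - shift)) / (2.0 * step)
    return 0.5 * (hess + hess.T)      # symmetrise away rounding noise

# Toy example: -log L = 0.5 * (x - x_peak)^T F (x - x_peak), with F made up.
fisher_true = np.array([[4.0, 1.0, 0.0],
                        [1.0, 3.0, 0.5],
                        [0.0, 0.5, 2.0]])
x_peak = np.array([1.0, 2.0, 3.0])

def toy_gradient(x):
    return fisher_true @ (x - x_peak)

fisher_est = hessian_from_gradient(toy_gradient, x_peak)
covariance_est = np.linalg.inv(fisher_est)
error_bars = np.sqrt(np.diag(covariance_est))   # 1-sigma errors on each bin
```

For a real likelihood the step size has to be tuned per bin, since too small a step amplifies rounding noise in the gradient and too large a step picks up non-Gaussian structure of the likelihood surface.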
3.2.2 An Example
Figure 3.5: The noise power spectrum used in the simulation. The vertical line shows the knee frequency. The frequency $k$ is shown in units of the Nyquist frequency $N_{TOD}/2$.
To test the above method for power spectrum estimation I simulated an example where $N_r = N_R = 243$, meaning that there are 243 scanning rings with 243 pixels in each. Each ring is only scanned once, $N_c = 1$, so that the total number of pixels is $N_{TOD} = 59049$, a similar number of pixels to the BOOMERANG experiment. The opening angle $\theta_E = 5^\circ$ and the radius of each ring $\theta = 5^\circ$. All the rings intersect at C (figure (3.1)). For simplicity a symmetric Gaussian beam with $18'$ FWHM was used. This is however not necessary as the method can deal with any beam pattern. The noise power spectrum was of the form (2.8) with $\alpha = 1$. Writing $P(k)$ in terms of wavenumber $k$ one has
$$
P(k) = \sigma^2 \left[ 1 + \frac{k_{knee}}{k} \right], \qquad (3.46)
$$
with $\sigma = 9\,\mu\mathrm{K}$ and $k_{knee} = 4.3 \times 10^{-3}$ in units of the Nyquist frequency $N_{TOD}/2$. The power spectrum is shown in figure (3.5). With this noise model, the signal to noise ratio was one at a multipole of about $\ell = 600$.
The simulation was carried out with the following steps:
1. First a standard CDM power spectrum was selected and a timestream was simulated using the Wandelt-Górski method described in section (3.1). A set of $a_{\ell m}$s was created using a random number generator and the CDM power spectrum up to a multipole $L = 1024$. Then with equation (3.12) the Fourier transformed time stream was created and finally equation (3.14) (FFT) was used to create the time stream in pixel space. The ringset is shown in figure (3.6).
Figure 3.6: The ringset from a scan on a simulated sky without noise. On the plot a line from the bottom to the top describes one ring. In this way the rings are put next to each other from left to right. The central point C where all rings intersect corresponds to the horizontal line of constant value in the middle of the plot.
In the figure, each ring runs from the bottom to the top along the vertical axis. The rings are put next to each other from left to right along the horizontal axis. Halfway along each ring the ring passes through the central point C, so this value is the same for all the rings, giving the horizontal line of constant value in the middle. The lines of constant value going from upper left to lower right are the other points of intersection between rings which are close to each other.
2. Noise was generated using the power spectrum above. In Fourier space Gaussian random numbers were created with the power spectrum $P(k)$. The noise stream was then Fourier transformed to pixel space (with FFT). The noise ringset is shown in figure (3.7).
Figure 3.7: Same as figure (3.6) for the noise.
3. The signal and noise ringsets were added together to give a data set from an experiment scanning on rings. The ringset is shown in figure (3.8). This is the input data from which the power spectrum was to be estimated. In the figure the vertical striping caused by the noise, which was discussed in section (2.2.1), is clearly visible.
Figure 3.8: Same as figure (3.6) for signal plus noise. This is the result of adding the ringsets in figures (3.6) and (3.7).
4. Now the simulated data set was used for likelihood estimation. A conjugate gradient solver using only the first derivative to find the minimum of the log-likelihood was used (Press, Teukolsky, Vetterling, and Flannery 1992). Since the solver can more easily handle a solution in which all the parameters have roughly the same order of magnitude, the following change of variables was made:
$$
C_\ell = \frac{D_b\, e^{-\sigma_b^2 \ell(\ell+1)}}{\ell(\ell+1)}, \qquad (3.47)
$$
where $D_b$ is the binned power spectrum ($b$ is bin number) for which the log-likelihood was minimized. The Gaussian beam has $\sigma = \sigma_b$. For this minimization 20 bins with 50 multipoles in each were used. The starting guess for the minimizer was $D_b = 10^4\,(\mu\mathrm{K})^2$ for all $b = [0, 19]$. For this test the correlation matrix was calculated in pixel space (equation 3.22) and then Fourier transformed using FFT. With the binning, equation (3.22) takes the form
$$
C^S_{pr,p'r'} = \sum_b D_b \sum_{\ell \in b} \frac{e^{-\sigma_b^2 \ell(\ell+1)}}{\ell(\ell+1)}\, \frac{2\ell+1}{4\pi}\, P_\ell(\cos\theta_{pr,p'r'}), \qquad (3.48)
$$
and its derivative is
$$
\frac{\partial C^S_{pr,p'r'}}{\partial D_b} = \sum_{\ell \in b} \frac{e^{-\sigma_b^2 \ell(\ell+1)}}{\ell(\ell+1)}\, \frac{2\ell+1}{4\pi}\, P_\ell(\cos\theta_{pr,p'r'}), \qquad (3.49)
$$
where the sum over $\ell \in b$ is the sum over all multipoles in the bin $b$. The angles $\theta_{pr,p'r'}$ between all possible pairs of points $pr$ and $p'r'$ were precalculated.
5. After about 30 likelihood and derivative evaluations, taking about 2 hours each on a single processor of a 500 MHz DEC Alpha workstation, the minimum was found. Had the initial guess been better (it could have been taken from a faster approximate power spectrum estimation algorithm as discussed in section (2.3)), the number of likelihood evaluations could have been significantly reduced. The result is shown in figure (3.9). The solid line is the input average power spectrum and the crosses are the estimated bins. The crosses are plotted in the middle of each bin.
Figure 3.9: The figure shows the result of the likelihood maximisation using the Fourier transformed ringset as input data. The solid line shows the input average power spectrum and the crosses show the estimated bins with error bars. The error bars are approximate asymmetric $2\sigma$ confidence regions showing where the log-likelihood drops by 2 compared to the maximum.
6. Finally the error bars were found. First the $2\sigma$ (95%) confidence regions were found by finding where the log-likelihood had been reduced by 2 compared to the maximum. These asymmetric error bars are plotted on the estimates in figure (3.9). Then these error bars were compared to the $2\sigma$ error bars found from approximating the Fisher matrix by taking the numerical second derivative of the log-likelihood. These error bars were consistent with the ones found from the likelihood contours.
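The noise generation in step 2 can be sketched as below. This is a minimal illustration, assuming NumPy, the spectrum of equation (3.46) with the quoted parameters, and two choices that are not taken from the text: white noise is filtered in Fourier space instead of drawing the Fourier coefficients directly, and the divergent $k = 0$ mode is set to zero so that the noise has zero mean. The function names are hypothetical.

```python
import numpy as np

def one_over_f_spectrum(n_tod, sigma, k_knee):
    """P(k) = sigma^2 * (1 + k_knee / k), with k in units of the Nyquist frequency."""
    k = np.fft.rfftfreq(n_tod) / 0.5          # rfftfreq is in cycles per sample
    power = np.zeros_like(k)
    power[1:] = sigma**2 * (1.0 + k_knee / k[1:])   # k = 0 mode set to zero (assumption)
    return k, power

def simulate_noise(n_tod, sigma, k_knee, rng):
    """Filter unit-variance white noise so that its power spectrum follows P(k)."""
    _, power = one_over_f_spectrum(n_tod, sigma, k_knee)
    white_k = np.fft.rfft(rng.standard_normal(n_tod))
    return np.fft.irfft(white_k * np.sqrt(power), n=n_tod)

rng = np.random.default_rng(1)
n_rings = n_ring_pixels = 243
n_tod = n_rings * n_ring_pixels               # 59049 samples, one scan per ring

noise = simulate_noise(n_tod, sigma=9.0, k_knee=4.3e-3, rng=rng)

# With N_c = 1 the time index is n = N_R * r + p, so each row is one ring.
noise_ringset = noise.reshape(n_rings, n_ring_pixels)
```

Displaying `noise_ringset` with one ring per column shows the striping along the scan direction that is typical for $1/f$ noise.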
3.2.3 Discussion
The power spectrum estimation method described here is the first exact method which takes into account all the following effects:
• Correlated noise: As discussed in previous sections, most CMB experiments have correlated noise. This can usually be seen in the CMB maps as stripes. Taking noise correlations into account when estimating the power spectrum is usually hard and involves time consuming calculations or Monte Carlo simulations. When operating on the timestream instead of the sky map, the noise correlations have a much simpler structure and are for this reason easily incorporated into the power spectrum estimation by the method described here.
• Non-uniform distribution of integration time: Most CMB sky experiments do not observe each point on the sky for the same amount of time. For this reason the noise for each point on a sky map will be different. On the timeline the integration time per pixel is by definition constant.
• Beam distortions and far side lobes: Normally the beam of the telescope is approximated to be azimuthally symmetric and Gaussian. This is however not the case for all CMB experiments. Also diffraction of the microwave radiation around the telescope and other apparatus used for the CMB experiment usually leads to far side lobes.
• Partial sky coverage: Normally only parts of the sky are observed. This leads to coupling between the power spectrum coefficients $C_\ell$ on the cut sky. These couplings have to be taken into account when attempting to estimate the full sky power spectrum from partial sky data.
• Pixelization effects: When the CMB data are mapped before the power spectrum estimation is performed, there is an additional smoothing of the CMB fluctuations on small scales due to the finite sized pixels in the map. In the method presented here, the CMB data are not mapped and hence this effect is not present.
The MAP mission is designed in such a way that the noise correlations are kept at a minimum. For this case there already exist fast methods for power spectrum estimation (Oh, Spergel, and Hinshaw 1999; Wandelt, Hivon, and Górski 2000). These methods take into account most of the above-mentioned effects. They do not, however, take into account a non-symmetric beam shape and far side lobes. If these issues turn out to be important then the method presented here is important for exact estimation of the power spectrum.
For the MAP mission and possibly for the Planck mission, the spin axis will be precessing. In this case the centres of the scanning rings will not all have the same $\theta_E$. One more rotation, describing this precession, has to be added to equation (3.5). For a precessing experiment where the amplitude of the precession (variations in $\theta_E$) is given by the angle $\theta_P$, the formula for the Fourier transform of the signal takes the following form
$$
d^S_{mm'm''} = \sum_\ell a_{\ell m}\, d^\ell_{mm'}(\theta_E)\, d^\ell_{m'm''}(\theta_P)\, X_{\ell m''}. \qquad (3.50)
$$
Obviously the correlation matrix $\langle d_{mm'm''}\, d^*_{MM'M''} \rangle$ is still block diagonal but the blocks are now larger.
Another problem with the full sky experiment is that parts of the sky (around the galactic plane) have to be cut out due to foreground contamination. This is equivalent to cutting out parts of the rings, which destroys the symmetries that make the fast likelihood evaluation possible.
The power spectrum estimation method on rings, as described here, is clearly for an ideal experiment with no foregrounds and no irregularities in the scanning strategy. In reality this will often not be the case. To deal with these problems the method could be extended using techniques similar to those described in (Oh, Spergel, and Hinshaw 1999) and outlined in section (2.3.1). The idea is that if one knows the exact solution in the ideal situation then the case with cut rings and an imperfect scanning strategy can be considered as a perturbation around the ideal case. In this way one can find the product $C^{-1}d$ without having to invert $C$. As described in section (2.3.1) this can be done by preconditioned conjugate gradient techniques. By using the approximate $C^{-1}$ found by the methods described here as a preconditioner, one can iterate to find the correct $C^{-1}d$ without having to invert the full matrix $C$. This is left to future work.
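The idea of iterating towards $C^{-1}d$ with an approximate inverse can be illustrated with a generic preconditioned conjugate gradient loop. The sketch below is not an implementation of the proposed extension: it assumes NumPy, a real symmetric positive definite toy matrix, and a diagonal matrix standing in for the exactly invertible ideal-case correlation matrix. All names are hypothetical.

```python
import numpy as np

def preconditioned_cg(apply_c, apply_approx_inverse, d, tol=1e-10, max_iter=200):
    """Solve C x = d using only products with C and with an approximate C^{-1}."""
    x = np.zeros_like(d)
    r = d - apply_c(x)                 # residual
    z = apply_approx_inverse(r)        # preconditioned residual
    p = z.copy()
    rz = r @ z
    for iteration in range(max_iter):
        cp = apply_c(p)
        alpha = rz / (p @ cp)
        x += alpha * p
        r -= alpha * cp
        if np.linalg.norm(r) <= tol * np.linalg.norm(d):
            return x, iteration + 1
        z = apply_approx_inverse(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(2)
n = 60
c_ideal = np.diag(np.linspace(1.0, 50.0, n))     # stand-in for the ideal-case matrix
raw = rng.normal(size=(n, n))
c_true = c_ideal + 0.01 * (raw + raw.T)          # small symmetric perturbation
c_ideal_inv = np.diag(1.0 / np.diag(c_ideal))    # cheap, exact inverse of the ideal case
d = rng.normal(size=n)

x, n_iter = preconditioned_cg(lambda v: c_true @ v, lambda v: c_ideal_inv @ v, d)
```

Because the preconditioned matrix is close to the identity when the perturbation is small, the loop converges in far fewer iterations than the dimension of the problem.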
Chapter 4
Gabor Transforms on the Sphere
and Application to CMB
Analysis
In this and the next chapter I will present work done under the supervision of Kris Górski. In this chapter the formalism of windowed Fourier transforms known as Gabor transforms is extended from the one dimensional line to the sphere. The Gabor coefficients on the CMB sky will be studied and it is shown how these represent a compression of CMB data which allows for fast likelihood estimation.
Gabor transforms, or windowed Fourier transforms, are just Fourier transforms where the function $f(x)$ to be Fourier transformed is multiplied with a Gabor window $W(x)$ (Gabor 1946). In the discrete case $f(x_i)$ can be a data set. If parts of the data set are of poor quality or are missing, this can be represented as $W(x_i) f(x_i)$ where the window $W$ is zero where there are missing parts. The window can also be shaped so that it smooths the edges close to the missing parts and in this way avoids ringing in the Fourier spectrum.
In the first part of this chapter I will study the effect of Gabor transforms on the sphere, in particular for the CMB sky. The Gabor transform in this context is just the multiplication of the CMB sky with a window function before using the spherical harmonic transform to get the pseudo power spectrum. The window can be a tophat to take out certain parts of the sky for limited sky coverage. Another window can be a Gaussian Gabor window for smoothing the transition between the observed and unobserved area of the sky. The Gabor window can also be designed in such a way as to increase the signal-to-noise by giving pixels with high signal-to-noise higher significance in the analysis. In the first part I will focus on two types of Gabor windows, the Gaussian window and the tophat window. The effect of these two windows will be compared. For a given patch on the sphere, the effect of multiplying this patch with a Gaussian to smooth the edges instead of keeping it as a tophat with sharp edges will be discussed. I will show how the Gaussian window cuts off long range correlations between the power spectrum coefficients on the cut sphere (pseudo power spectrum) at the cost of increased short range correlations. The use of the windowed Fourier transform was already studied in (Hobson and Magueijo 1996) in the flat sky approximation. I show that some of their qualitative results are also valid on the sphere.
In (Hivon et al. 2001) the pseudo power spectrum was used as a quadratic estimator for CMB power spectrum analysis. In the second part of the chapter I will discuss the use of the pseudo power spectrum as a compression of a CMB data set to be used in likelihood estimation. This was already done for a tophat window in (Wandelt, Hivon, and Górski 2000) under the assumption that the pseudo spectrum coefficients are not coupled, so that the correlation matrix for the likelihood is diagonal. This is only a good assumption when a large part of the sky is covered. As I will show here, the couplings between the pseudo power spectrum coefficients can be quite large depending on the size of the observed area. For this reason the full covariance matrix will be used. Analytical and recursive techniques will be developed which allow fast calculation of the full covariance matrix.
The distribution of the pseudo power spectrum coefficients will be assumed Gaussian. Using a Gaussian likelihood ansatz the full sky power spectrum will be estimated from the noisy pseudo power spectrum. The likelihood function can in this case easily be calculated as the data vector, and thereby the correlation matrix, is small (only having a few hundred elements even if the original whole data set had millions of pixels). The likelihood evaluation takes of the order $(N^{bin})^2 (N^{in})^2$ operations, where $N^{bin}$ is the number of $C_\ell$ bins estimated and $N^{in}$ is the number of pseudo spectrum coefficients used as input to the likelihood. The most time consuming part is the precomputation of some factors, which scales as about $l_{max}^3$ for the signal part. For the noise correlation matrix this precalculation takes of the order $\sqrt{N_{pix}}\, l_{max} (N^{in})^2$ operations for axisymmetric noise or up to about $\sqrt{N_{pix}}\, l_{max}^2 (N^{in})^2$ for general noise. Typically $N^{in} \propto l_{max}$ with a prefactor which depends on the size of the window. For an 18 degree radius with Gaussian apodisation, which will be used in some examples here, $N^{in} \approx 0.1\, l_{max}$. Calculation of noise prefactors can also be done by Monte Carlo. It then takes of the order $N_{pix}^{3/2} N_{sim}$ operations, where $N_{sim}$ is the number of Monte Carlo simulations needed, which is typically a few thousand.
The method is developed for uncorrelated noise on an axisymmetric patch on the sky. In the last part of the chapter extensions to non-symmetric patches and correlated noise will be worked out and further developments will be discussed.
4.1 The Gabor Transformation and the Temperature Power Spectrum
In this section I will first describe the Gabor transform for functions on a one dimensional line. Then I extend the formalism to functions on the sphere, and the properties of the Gabor transform coefficients on the sphere are discussed.
4.1.1 The One Dimensional Gabor Transform
For a data set $d_j$ with $N$ elements, the normal Fourier transform is defined as,
$$
\tilde{d}_k = \sum_j d_j\, e^{-i 2\pi jk/N}. \qquad (4.1)
$$
A tilde on $\tilde{d}$ shows that these are the Fourier coefficients. The inverse transform is then,
$$
d_j = \frac{1}{N} \sum_k \tilde{d}_k\, e^{i 2\pi jk/N}. \qquad (4.2)
$$
Sometimes it is useful to study the spectrum of just a part of the data set. This could be the case if parts of it are of poor quality or if the spectrum changes along the data set. In this case, one can multiply the data set with a function, removing the unwanted parts and taking out a segment to be studied. The function can be a step function cutting out the segment to study with sharp edges or a function which smooths the edges of the segment to avoid ringing (typically a Gaussian). The Fourier transform with such a multiplication was studied by Gabor (Gabor 1946) and is called the Gabor transform. It is defined for a segment centered at $j = M$ and with wavenumber $k$ as,
$$
\tilde{d}_{kM} = \sum_j d_j\, G_{j-M}\, e^{-i 2\pi jk/N}. \qquad (4.3)
$$
Here $G_{j-M}$ is the Gabor window, the function to multiply the data set with. The transform is similar to the wavelet transform. The difference is that the window function in the wavelet transform is frequency dependent so that the size of the segment changes with frequency.
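A short numerical illustration of the one dimensional Gabor transform may help here. The sketch assumes NumPy, a Gaussian window and a made-up test signal whose frequency changes halfway through the data set; `gabor_transform` is a hypothetical helper, and `np.fft.fft` uses the same sign convention as the forward Fourier transform above.

```python
import numpy as np

def gabor_transform(data, center, width):
    """Fourier transform of the data multiplied by a Gaussian window centred at `center`."""
    j = np.arange(data.size)
    window = np.exp(-0.5 * ((j - center) / width) ** 2)
    return np.fft.fft(data * window), window

n = 512
j = np.arange(n)
# Test signal: 8 cycles per data set length in the first half, 40 in the second half.
data = np.where(j < n // 2,
                np.sin(2 * np.pi * 8 * j / n),
                np.sin(2 * np.pi * 40 * j / n))

spec_left, _ = gabor_transform(data, center=128, width=30.0)
spec_right, _ = gabor_transform(data, center=384, width=30.0)

# Dominant wavenumber seen through each window (positive frequencies only).
peak_left = int(np.argmax(np.abs(spec_left[: n // 2])))
peak_right = int(np.argmax(np.abs(spec_right[: n // 2])))
```

A plain Fourier transform of the whole data set would show both frequencies at once, whereas the two windowed transforms each isolate the frequency present in their own segment.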
Analogously to the Fourier transform, there is also an inverse Gabor transform. To recover the whole data set from a Gabor transform, one needs the Fourier coefficients taken with different windows $G_{j-M}$ centered at several different points $M$. This means that the data set has to be split up into several segments. The center of each segment is set to $M = mK$ where $K$ determines the density of segments and $m$ is an integer specifying the segment number. One then has for the inverse transform
$$
d_j = \sum_{km} \tilde{d}_{km}\, g_{km}. \qquad (4.4)
$$
Due to the non-orthogonality of the Gabor transform, the dual Gabor window $g_{km}$ is not trivial to find, but several techniques have been developed for calculating this dual window (e.g. (Strohmer 1997) and references therein).
In this chapter I will study the Gabor transform on the sphere and apply it to CMB analysis. I will take out a disc on the CMB sky, using either tophat or Gaussian apodisation, and then compute the pseudo power spectrum $\tilde{C}_\ell$ on this apodised sky. These $\tilde{C}_\ell$ will be used for likelihood estimation of the underlying full sky power spectrum. I also outline how several discs (segments) centered at different points can be combined to yield the full sky power spectrum.
4.1.2 Gabor Transform on the Sphere
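I start by defining the $\tilde{C}_\ell$ for a Gabor window $G(\hat{n})$ as,
$$
\tilde{C}_\ell = \sum_m \frac{\tilde{a}_{\ell m}\, \tilde{a}^*_{\ell m}}{2\ell + 1}, \qquad (4.5)
$$
where
$$
\tilde{a}_{\ell m} = \int d\hat{n}\, T(\hat{n})\, G(\hat{n})\, Y^*_{\ell m}(\hat{n}). \qquad (4.6)
$$
Here $T(\hat{n})$ is the observed temperature in the direction of the unit vector $\hat{n}$, $Y_{\ell m}(\hat{n})$ is the spherical harmonic function and $G(\hat{n})$ is the Gabor window.
I will here use a Gabor window which is azimuthally symmetric about a point $\hat{n}_0$ on the sphere, so that the window is only a function of the angular distance $\theta$ from this point on the sphere, $\cos\theta = \hat{n} \cdot \hat{n}_0$. Then one can write the Legendre expansion of the window as,
$$
G(\theta) = \sum_\ell \frac{2\ell + 1}{4\pi}\, g_\ell\, P_\ell(\cos\theta) = \sum_{\ell m} g_\ell\, Y_{\ell m}(\hat{n})\, Y^*_{\ell m}(\hat{n}_0). \qquad (4.7)
$$
One can also write,
$$
T(\hat{n}) = \sum_{\ell m} a_{\ell m}\, Y_{\ell m}(\hat{n}). \qquad (4.8)
$$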
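Inserting these two expressions in equation (4.6) one gets
$$
\begin{aligned}
\tilde{a}_{\ell m} &= \sum_{\ell' m'} a_{\ell' m'} \sum_{\ell'' m''} g_{\ell''}\, Y^*_{\ell'' m''}(\hat{n}_0) \int Y_{\ell' m'}(\hat{n})\, Y_{\ell'' m''}(\hat{n})\, Y^*_{\ell m}(\hat{n})\, d\hat{n} \qquad (4.9) \\
&= \sum_{\ell' m'} a_{\ell' m'} \sum_{\ell'' m''} g_{\ell''}\, Y^*_{\ell'' m''}(\hat{n}_0) \sqrt{\frac{(2\ell + 1)(2\ell' + 1)(2\ell'' + 1)}{4\pi}} \qquad (4.10) \\
&\quad \times \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m' & m'' \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix} (-1)^m, \qquad (4.11)
\end{aligned}
$$
where relation (C.3) for Wigner 3j symbols was used. Using this expression, the relation $\langle a_{\ell m}\, a^*_{\ell' m'} \rangle = C_\ell\, \delta_{\ell\ell'}\, \delta_{mm'}$ and the orthogonality of Wigner symbols (equation (C.1)), one can write $\langle \tilde{C}_\ell \rangle$ as,
$$
\langle \tilde{C}_\ell \rangle = \sum_{\ell'} C_{\ell'}\, K(\ell, \ell'). \qquad (4.12)
$$
With $C_\ell$ I will always mean $\langle C_\ell \rangle$ when I am referring to the full sky $C_\ell$. In this expression, $K(\ell, \ell')$ is the Gabor kernel,
$$
\begin{aligned}
K(\ell, \ell') &= (2\ell' + 1) \sum_{\ell''} g^2_{\ell''}\, \frac{1}{4\pi} \sum_{m''} |Y_{\ell'' m''}(\hat{n}_0)|^2 \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix}^2 \\
&= (2\ell' + 1) \sum_{\ell''} g^2_{\ell''}\, \frac{(2\ell'' + 1)}{(4\pi)^2} \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix}^2. \qquad (4.13)
\end{aligned}
$$
The Legendre coefficients $g_\ell$ are found by the inverse Legendre transformation,
$$
g_\ell = 2\pi \int_{\theta=0}^{\theta=\theta_C} G(\theta)\, P_\ell(\cos\theta)\, d\cos\theta, \qquad (4.14)
$$
where $\theta_C$ is the cut-off angle where the window goes to zero. One sees from the expression for the kernel that there is no dependency on $\hat{n}_0$. This means that $\langle \tilde{C}_\ell \rangle$ is the same, independent of where the Gabor window is centered.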
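The Legendre coefficients of a window can be evaluated numerically by quadrature. The following is a minimal sketch, assuming NumPy and Gauss-Legendre integration over the cap $0 \le \theta \le \theta_C$; the function names, the number of quadrature nodes and the choice of a 15 degree FWHM Gaussian cut at three standard deviations are illustrative assumptions. The Wigner 3j symbols needed for the kernel itself are not computed here.

```python
import numpy as np
from numpy.polynomial import legendre

def window_legendre_coefficients(window, theta_cut, l_max, n_nodes=2000):
    """g_l = 2*pi * integral of G(theta) P_l(cos theta) d(cos theta) over the cap."""
    x, w = legendre.leggauss(n_nodes)              # nodes and weights on [-1, 1]
    lo, hi = np.cos(theta_cut), 1.0
    x = 0.5 * (hi - lo) * x + 0.5 * (hi + lo)      # map nodes to [cos(theta_cut), 1]
    w = 0.5 * (hi - lo) * w
    g_theta = window(np.arccos(x))
    p = legendre.legvander(x, l_max)               # p[:, l] = P_l(x)
    return 2.0 * np.pi * (w * g_theta) @ p

def gaussian_window(sigma):
    return lambda theta: np.exp(-theta**2 / (2.0 * sigma**2))

def tophat_window(theta):
    return np.ones_like(theta)

fwhm = np.radians(15.0)
sigma = fwhm / np.sqrt(8.0 * np.log(2.0))          # FWHM to standard deviation
theta_cut = 3.0 * sigma                            # window set to zero beyond this angle

g_gauss = window_legendre_coefficients(gaussian_window(sigma), theta_cut, l_max=100)
g_tophat = window_legendre_coefficients(tophat_window, theta_cut, l_max=100)

# Sanity check: a window equal to one on the whole sphere has g_0 = 4*pi and g_l = 0 otherwise.
g_full = window_legendre_coefficients(tophat_window, np.pi, l_max=10, n_nodes=64)
```

For the tophat window the monopole coefficient equals the solid angle of the cap, $2\pi(1 - \cos\theta_C)$, which gives a quick check of the normalisation.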

In figure (4.1) I have plotted the kernel for a Gaussian Gabor window,
$$
G(\theta) = e^{-\theta^2/(2\sigma^2)} \qquad \theta \le \theta_C, \qquad (4.15)
$$
$$
G(\theta) = 0 \qquad \theta > \theta_C, \qquad (4.16)
$$
with 5 and 15 degrees FWHM (corresponding to $\sigma = 2.12^\circ$ and $\sigma = 6.38^\circ$) and $\theta_C = 3\sigma$. One sees that the kernel is centred about $\ell = \ell'$, and falls off rapidly.
Figure (4.2) shows the same for the corresponding tophat Gabor windows,
$$
G(\theta) = 1 \qquad \theta \le \theta_C, \qquad (4.17)
$$
$$
G(\theta) = 0 \qquad \theta > \theta_C. \qquad (4.18)
$$
The tophat windows cover the same area on the sky as the corresponding Gaussian windows in figure (4.1) ($\theta_C$ is the same). One sees clearly that the diagonal is broader for the smaller windows, indicating stronger couplings. Another thing to notice is that whereas the kernel for the tophat Gabor window only falls by about 4 orders of magnitude from the diagonal to the far off-diagonal elements, the Gaussian Gabor kernel falls by about 8 orders of magnitude (the vertical axes on the four plots are the same). The smooth cut-off of the Gaussian Gabor window cuts off long range correlations in spherical harmonic space. One of the aims of the first part of this paper is to see how the pseudo power spectrum of a given disc on the sky (tophat window) is affected by the multiplication with a Gaussian Gabor window. For this reason the pseudo spectrum will be studied for a tophat and a Gaussian always covering the same area on the sky.
Figure 4.1: The logarithm of the kernel $K(\ell, \ell')$ describing the connection between the spherical harmonic coefficients $C_\ell$ on the full sky and the corresponding coefficients $\tilde{C}_\ell$ on the apodised sky via the relation $\tilde{C}_\ell = \sum_{\ell'} K(\ell, \ell')\, C_{\ell'}$. The figure shows the kernel for a $5^\circ$ and $15^\circ$ FWHM Gaussian Gabor window with $\theta_C = 3\sigma$ (left and right respectively).
Figure 4.2: Same as figure (4.1) for tophat Gabor windows covering the same area on the sky as the Gaussian windows.
In figure (4.3), I have plotted cuts through the kernel at $\ell = 200$ and $\ell = 500$ for the 5 and 15 degree FWHM Gaussian Gabor windows (dashed line). The solid line is the corresponding kernel (same area on the sky) when using the tophat Gabor window. One sees that the Gaussian window effectively cuts off long range correlations whereas the tophat window is narrower close to the diagonal. The Gaussian window has larger short range correlations.
Figure 4.3: The figures show cuts through the kernel $K(\ell, \ell')$ connecting the full sky and cut sky spherical harmonic coefficients. The full kernels are shown in figures (4.1) and (4.2). The cuts are taken at $\ell = 200$ and $\ell = 500$ for the 5 (upper plot) and 15 (lower plot) degree Gaussian Gabor window (dashed line). The solid line is for the corresponding (same area on the sky) tophat window. The kernels are here normalised so that the peak at the given cut has its maximum at 1, which makes the kernels easier to compare.
Figure (4.4) shows how the kernel gets narrower and the correlations smaller as the Gabor window opens up. The four kernels are shown for $\ell = 500$ and the Gaussian windows have 5, 10, 15 and 30 degree FWHM with $\theta_C = 3\sigma$. The same relation for the corresponding (in the sense described above) tophat window is also shown. Gaussian functions are plotted on top of the kernels and show that the kernels are very close to Gaussian functions near the diagonal.
Figure 4.4: The figures show a cut through the kernel $K(\ell, \ell')$ connecting the full sky and cut sky spherical harmonic coefficients. The cuts are taken at $\ell = 500$ for a 5, 10, 15 and 30 degree FWHM Gaussian Gabor window (solid line) with a $\theta_C = 3\sigma$ cut-off. The dotted line shows the kernel for a tophat window covering the same area on the sky. Dashed lines are Gaussian fits to the curves.
In figure (4.5) I have plotted the relation between the FWHM width $\Delta\ell$ of the kernel and the size of the window $\theta_{FWHM}$ for Gaussian and tophat windows. The two curves are very well described by $\Delta\ell = 220/\theta_{FWHM}$ for the Gaussian window ($\theta_{FWHM}$ in degrees) and $\Delta\ell = 140/\theta_{FWHM}$ for the corresponding tophat window. Clearly for a given observed area of the sky, multiplying with a Gaussian will increase the FWHM of the kernel. This is also what was seen in figures (4.3) and (4.4).
Figure 4.5: The figure shows the uncertainty relation $\Delta\ell\, \Delta\theta = \mathrm{constant}$ for a Gabor transform on the sphere. The solid line shows the width $\Delta\ell$ of the Gabor kernel $K(\ell, \ell')$ connecting the full sky and the cut sky power spectra when applying a Gaussian Gabor window with a cut $\theta_C = 3\sigma$. The FWHM is shown on the lower abscissa. The dashed line shows the width of the kernel for a tophat window covering the same area of the sky as the Gaussian. The full radius of the tophat window is shown on the upper abscissa. The curves are well described by $\Delta\ell = 220/\theta_{FWHM}$ and $\Delta\ell = 140/\theta_{FWHM}$ for the Gaussian and tophat windows respectively.
In figure (4.6), I show the shapes of the $\tilde{C}_\ell$ for Gaussian and tophat windows compared to the full sky spectrum. The plots, which were made using the analytical formula (4.12), show $\tilde{C}_\ell$ for a 5 and 15 degree FWHM Gaussian Gabor window (solid line) cut at $3\sigma$. The corresponding spectra for the tophat Gabor windows are shown as dotted lines. The spectra are normalised in such a way that they can be compared to the full sky power spectrum (dashed line). For the $5^\circ$ FWHM window one can still distinguish the three lines. At this window size the pseudo spectra are very similar to the full sky spectra but with small deviations depending on the shape of the kernel and the shape of the power spectrum. In this case the spectrum for the Gaussian window seems to be smaller at the peaks and larger at the troughs whereas the spectrum for the tophat window is always larger. For the $15^\circ$ FWHM windows the pseudo spectrum using the Gaussian Gabor window lies on top of the full sky power spectrum. For the tophat window it is still possible to distinguish the pseudo spectrum from the full sky power spectrum although the lines are very close. This means that the $\tilde{C}_\ell$ could be good estimators of the underlying full sky $C_\ell$ provided that the window is big enough. This is consistent with the conclusion in (Hobson and Magueijo 1996), where this was shown in the flat sky approximation.
One feature which is very prominent is the additional peak at low $\ell$ for the Gaussian window. This peak arises because the diagonal of the Gaussian kernel is broader than that of the tophat kernel. For the low multipoles the power spectrum is dropping rapidly because of the Sachs-Wolfe effect and the lowest multipole $C_\ell$ are much bigger than the $C_\ell$ for higher multipoles. Since the Gaussian kernel is broad, the $\tilde{C}_\ell$ at low multipoles will pick up more from the $C_\ell$ at lower multipoles than with the narrower tophat kernel (see figure (4.3)). These low multipole $C_\ell$ have very high values compared to the higher multipole $C_\ell$ and for that reason the $\tilde{C}_\ell$ for the Gaussian window will get a higher value. This is illustrated in figure (4.7), where a cut through the kernel at $\ell = 50$ is shown for the $5^\circ$ FWHM Gaussian Gabor window (solid line) and the corresponding tophat (dashed line), normalised to one at the peak. The dotted line shows a typical power spectrum. Clearly the Gaussian kernel will pick up more of the high value $C_\ell$ at low multipoles.
Figure 4.6: The windowed power spectra $\tilde C_\ell$ for a 5 and 15 degree FWHM Gaussian
Gabor window cut at $\theta_C = 3\sigma$ (solid line) and for a tophat window
covering the same area on the sky (dotted line). All spectra are normalised in
such a way that they can be compared directly with the full sky spectrum which
is shown on each plot as a dashed line. Only in the first plot are all three lines
visible. In the three last plots, the full sky spectrum and the Gaussian pseudo
spectrum (dashed and solid line) are only distinguishable in the first few multipoles.
Figure 4.7: The figure shows a cut through the kernel $K(\ell, \ell')$ connecting the full
sky and cut sky spherical harmonic coefficients. The cut is taken at $\ell = 50$ for a 5
degree FWHM Gaussian Gabor window (solid line) and a corresponding tophat
window (dashed line). The kernels are normalised to one at the peak. A typical
power spectrum normalised to one at the quadrupole is plotted as a dotted line.
The figure aims at explaining the extra peak in the pseudo power spectrum at
low multipoles for the Gaussian Gabor window shown in figure (4.6).
In figure (4.8) I show the pseudo power spectra for a particular realisation
using a 15 degree FWHM Gaussian window (upper plot) and a tophat window
(lower plot). The pseudo spectra are compared to the average full sky spectra
shown as a dashed line. The dark shaded area shows the expected $1\sigma$ cosmic and
sample variance on the pseudo spectra taken from the formulae to be developed
in the next sections. The lighter shaded area shows only cosmic variance. Note
that the pseudo spectrum for the Gaussian window is smoother than the pseudo
spectrum for the tophat window. This is again a result of the broader Gaussian
kernel.
Figure 4.8: One realisation of the windowed power spectra. The upper plot shows
a realisation of a pseudo power spectrum using a 15 degree FWHM Gaussian
Gabor window. The pseudo spectrum is normalised in such a way that it can be
compared directly to the full sky spectrum, whose average is shown as a dotted
line. The lower plot shows the same realisation using a corresponding tophat
window. The light shaded area shows $1\sigma$ cosmic variance around the full sky
average spectrum. The darker area shows $1\sigma$ cosmic and sampling variance taken
from the theoretical formula.
4.1.3 Rotational Invariance
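It was shown that the average $\langle\tilde C_\ell\rangle$ is invariant under rotations of the Gabor
window. I will now show that the non-averaged $\tilde C_\ell$ are rotationally invariant
under any rotation of the sky AND Gabor window by the same angle. This fact
justifies always placing the window on the north pole in the following, since
this simplifies the calculations. In the following I will use the rotation matrices
$D^\ell_{m'm}$ described in Appendix (A). Consider a rotation of the sky and window by
the angles $(-\gamma\,{-\beta}\,{-\alpha})$. Then the $\tilde a_{\ell m}$ become
$$\tilde a^{\rm rot}_{\ell m} = \int d\hat n\,[\hat D(-\gamma\,{-\beta}\,{-\alpha})\,T(\hat n)G(\hat n)]\,Y^*_{\ell m}(\hat n). \tag{4.19}$$
If one makes the inverse rotation of the integration angle $\hat n$, one can write this as
$$\tilde a^{\rm rot}_{\ell m} = \int d\hat n\,T(\hat n)G(\hat n)\,[\hat D(\alpha\beta\gamma)\,Y_{\ell m}(\hat n)]^*, \tag{4.20}$$
which is just
$$\tilde a^{\rm rot}_{\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma)\int d\hat n\,T(\hat n)G(\hat n)\,Y^*_{\ell m'}(\hat n). \tag{4.21}$$
One can identify the last integral as the normal $\tilde a_{\ell m'}$,
$$\tilde a^{\rm rot}_{\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,\tilde a_{\ell m'}. \tag{4.22}$$
So the $\tilde a_{\ell m}$ are NOT rotationally invariant. Rotation mixes $m$-modes for a given
$\ell$-value.
Now to the $\tilde C_\ell$. One has that
\begin{align}
\tilde C^{\rm rot}_\ell &= \frac{1}{2\ell+1}\sum_m \tilde a^{\rm rot}_{\ell m}\,\tilde a^{\rm rot *}_{\ell m} \tag{4.23}\\
&= \frac{1}{2\ell+1}\sum_m\sum_{m'm''} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,D^{\ell}_{m''m}(\alpha\beta\gamma)\,\tilde a_{\ell m'}\,\tilde a^*_{\ell m''} \tag{4.24}\\
&= \frac{1}{2\ell+1}\sum_{m'm''}\tilde a_{\ell m'}\,\tilde a^*_{\ell m''}\sum_m D^{\ell}_{m''m}(\alpha\beta\gamma)\,D^{\ell *}_{m'm}(\alpha\beta\gamma). \tag{4.25}
\end{align}
Using the properties given in Appendix (A), one can write the last D-function
on the last line as
\begin{align}
D^{\ell *}_{m'm}(\alpha\beta\gamma) &= D^{\ell}_{m'm}(-\alpha\,\beta\,{-\gamma}) \tag{4.26}\\
&= e^{im'\alpha}\,d^{\ell}_{m'm}(\beta)\,e^{im\gamma} \tag{4.27}\\
&= e^{im'\alpha}\,d^{\ell}_{mm'}(-\beta)\,e^{im\gamma} \tag{4.28}\\
&= D^{\ell}_{mm'}(-\gamma\,{-\beta}\,{-\alpha}). \tag{4.29}
\end{align}
Knowing that $(-\gamma\,{-\beta}\,{-\alpha})$ is the inverse rotation of $(\alpha\beta\gamma)$ one can write
\begin{align}
\sum_m D^{\ell}_{m''m}(\alpha\beta\gamma)\,D^{\ell *}_{m'm}(\alpha\beta\gamma) &= \sum_m D^{\ell}_{m''m}(\alpha\beta\gamma)\,D^{\ell}_{mm'}(-\gamma\,{-\beta}\,{-\alpha}) \tag{4.30}\\
&= D^{\ell}_{m''m'}(000) = \delta_{m''m'}. \tag{4.31}
\end{align}
So one gets
$$\tilde C^{\rm rot}_\ell = \tilde C_\ell. \tag{4.32}$$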
4.2 Likelihood Analysis
In this section I will show how the pseudo power spectrum can be used as in-
put to a likelihood analysis for estimating the full sky power spectrum from an
observed disc on the sky multiplied with a Gabor window. I will in this section
concentrate on a Gaussian Gabor window, but the formalism is valid for any
azimuthally symmetric Gabor window.
4.2.1 The Form of the Likelihood Function
To know the form of the likelihood function, one needs to know the probability
distribution of $\tilde C_\ell$. In Figures (4.9) and (4.10) I show the probability
distribution from 10000 simulations with a $5^\circ$ and $15^\circ$ FWHM Gaussian Gabor window
respectively. The dashed line shows a Gaussian with mean value and standard
deviation found from the formulae given in the previous and next section. One
can see that the probability distribution is slightly skewed for low $\ell$, but for high
$\ell$ it seems to be very well approximated by a Gaussian. Also the small window
shows more deviations from a Gaussian than the bigger window. In figure (4.11) I
show the probability distribution from a simulation with a tophat Gabor window
covering the same area on the sky as the $15^\circ$ FWHM Gaussian window. Also for
this window the probability distribution is close to Gaussian.
Figure 4.9: The probability distribution of $\tilde C_\ell$ taken from 10000 simulations with
a $5^\circ$ FWHM Gaussian Gabor window truncated at $\theta_C = 3\sigma$. The variable $x$ is
given as $x = (\tilde C_\ell - \langle\tilde C_\ell\rangle)/\sqrt{\langle(\tilde C_\ell - \langle\tilde C_\ell\rangle)^2\rangle}$. The dashed line is a Gaussian
with the theoretical mean and standard deviation of the $\tilde C_\ell$. The plot shows the
distribution for $\ell = 50$, $\ell = 200$, $\ell = 500$, and $\ell = 800$. The probabilities are
normalised such that the integral over $x$ is 1.
Figure 4.10: Same as figure (4.9) but for a $15^\circ$ FWHM Gaussian Gabor window.
Figure 4.11: Same as figure (4.10) but for a tophat Gabor window covering the
same area on the sky.
From the above plots it seems to be reasonable to approximate the likelihood
function with a Gaussian provided the window is big enough and multipoles at
high enough values are used,
$$\mathcal{L} = \frac{e^{-\frac{1}{2} d^T M^{-1} d}}{\sqrt{2\pi \det M}}. \tag{4.33}$$
Omitting all constant terms and factors, the log-likelihood can then be written:
$$L = d^T M^{-1} d + \ln\det M. \tag{4.34}$$
Here $d$ is the datavector which contains the observed $\tilde C_\ell$ for the set of sample
$\ell$-values $\ell_i$. The datavector is taken from the observed windowed sky in the
following way:
$$d_i = \tilde C_{\ell_i} - \langle\tilde C_{\ell_i}\rangle. \tag{4.35}$$
The matrix $M$ is the covariance between pseudo-$C_\ell$, whose elements are given
by:
$$M_{ij} = \langle\tilde C_{\ell_i}\tilde C_{\ell_j}\rangle - \langle\tilde C_{\ell_i}\rangle\langle\tilde C_{\ell_j}\rangle. \tag{4.36}$$
The next step is to find an expression for $M_{ij}$, first assuming that no noise is
present.
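As an illustration of how the log-likelihood (4.34) might be evaluated numerically, the following is a minimal sketch rather than the code used for the results in this chapter. It assumes NumPy is available; the function name `log_likelihood` and its arguments are hypothetical, and a Cholesky factorisation is used here only as one convenient way of obtaining both $d^T M^{-1} d$ and $\ln\det M$.

```python
import numpy as np

def log_likelihood(c_obs, c_mean, cov):
    """Return d^T M^{-1} d + ln det M with d = c_obs - c_mean (cf. eqs. 4.34, 4.35)."""
    d = np.asarray(c_obs, dtype=float) - np.asarray(c_mean, dtype=float)
    # Cholesky factorisation M = L L^T; raises LinAlgError if M is not positive definite.
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    # Solve L y = d, so that d^T M^{-1} d = y^T y.
    y = np.linalg.solve(chol, d)
    # ln det M = 2 * sum(ln L_ii).
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return float(y @ y + log_det)
```

A factorisation that fails is a useful warning sign in this context, since a datavector with too many strongly coupled multipoles makes the covariance matrix close to singular.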
4.2.2 The Correlation Matrix
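To do fast likelihood analysis with $\tilde C_\ell$ one needs to be able to calculate $\langle\tilde C_\ell\rangle$ and
the correlations $\langle\tilde C_\ell\tilde C_{\ell'}\rangle$ fast. Calculating the average $\langle\tilde C_\ell\rangle$ by formula (4.12)
using the analytic expression (4.13) for the kernel is not very fast. It turns out that
a faster way of evaluating the kernel is by using direct integration (summation
on the pixelised sphere) and then, as shown in Appendix (D), recurrence. By
means of an integral, one can then write the $\tilde a_{\ell m}$ as (now assuming that $\hat n_0$ is on
the north pole),
$$\begin{aligned}
\tilde a_{\ell m} &= \sum_{\ell'm'} a_{\ell'm'}\int G(\hat n)\,Y_{\ell'm'}(\hat n)\,Y^*_{\ell m}(\hat n)\,d\hat n\\
&= \sum_{\ell'm'} a_{\ell'm'}\int G(\theta)\,\lambda_{\ell'm'}(\theta)\,\lambda_{\ell m}(\theta)\,d\cos\theta\,\underbrace{\int e^{-i(m-m')\phi}\,d\phi}_{2\pi\delta_{mm'}}\\
&= \sum_{\ell'} a_{\ell'm}\,2\pi\int G(\theta)\,\lambda_{\ell'm}(\theta)\,\lambda_{\ell m}(\theta)\,d\cos\theta\\
&\equiv \sum_{\ell'} a_{\ell'm}\,h(\ell,\ell',m),
\end{aligned}\tag{4.37}$$
where the last line defines $h(\ell,\ell',m)$ and $\lambda_{\ell m}(\theta)$ is given by
$$Y_{\ell m}(\theta,\phi) = \lambda_{\ell m}(\theta)\,e^{im\phi}. \tag{4.38}$$
Using this form, one gets
$$\langle\tilde C_\ell\rangle = \frac{1}{2\ell+1}\sum_{\ell'm} C_{\ell'}\,h^2(\ell,\ell',m). \tag{4.39}$$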
To obtain this expression, $\hat n_0$ was on the north pole, but as was shown, the
$\langle\tilde C_\ell\rangle$s are rotationally invariant, that is $\langle\tilde C_\ell\rangle$ remains the same if one
rotates the Gabor window so that it is centered on the north pole.
When using real CMB data, the observed temperature map is always pixelised.
So an integral over the sphere has to be replaced by a sum over pixels. In this
case, the formula for $h(\ell,\ell',m)$ has to be replaced by
$$h(\ell,\ell',m) = \sum_j G_j\,\lambda^j_{\ell m}\,\lambda^j_{\ell'm}\,\Delta\Omega_j, \tag{4.40}$$
where the index $j$ is the pixel number replacing the angle $\theta$ and $\Delta\Omega_j$ is the area
of pixel $j$. Using a pixelisation scheme like HEALPix (Górski et al.) which has
a structure of azimuthal rings going from the north to the south pole with $N_r$
pixels in ring $r$ and equal area for each pixel, $\Delta\Omega_j = \Delta\Omega$, this can be written as
$$\begin{aligned}
h(\ell,\ell',m) &= \sum_r\sum_{p=0}^{N_r-1} G_r\,\lambda^r_{\ell m}\,\lambda^r_{\ell'm}\,\Delta\Omega\\
&= \sum_r N_r\,G_r\,\lambda^r_{\ell m}\,\lambda^r_{\ell'm}\,\Delta\Omega.
\end{aligned}\tag{4.41}$$
Here the sum over pixels is split into a sum over rings $r$ and a sum over the pixels
$p$ in each ring. The first sum goes over all rings which have $\theta < \theta_C$.
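The integral defining $h(\ell,\ell',m)$ in equation (4.37) can be checked numerically in the simplest case $m = 0$, where $\lambda_{\ell 0}(\theta) = \sqrt{(2\ell+1)/4\pi}\,P_\ell(\cos\theta)$. The sketch below is an illustration only and is not the ring summation or the recurrence of Appendix (D): it assumes NumPy, uses Gauss-Legendre quadrature in $\cos\theta$ instead of a pixel sum, and the helper names (`lambda_l0`, `h_m0`, `gaussian_window`) as well as the window form $G(\theta) = e^{-\theta^2/2\sigma^2}$ are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import legendre

def lambda_l0(ell, x):
    """lambda_{l0} = sqrt((2l+1)/(4 pi)) P_l(x), the m = 0 case of eq. (4.38)."""
    coeffs = np.zeros(ell + 1)
    coeffs[ell] = 1.0                      # select the single Legendre polynomial P_l
    return np.sqrt((2 * ell + 1) / (4.0 * np.pi)) * legendre.legval(x, coeffs)

def h_m0(ell, ell_prime, window, theta_cut=np.pi, n_nodes=512):
    """h(l, l', 0) = 2 pi * integral of G lambda_l0 lambda_l'0 over cos(theta), theta < theta_cut."""
    nodes, weights = legendre.leggauss(n_nodes)      # quadrature nodes on [-1, 1]
    lo = np.cos(theta_cut)                           # x = cos(theta) runs over [lo, 1]
    x = 0.5 * (1.0 - lo) * nodes + 0.5 * (1.0 + lo)
    w = 0.5 * (1.0 - lo) * weights
    g = window(np.arccos(x))
    return 2.0 * np.pi * np.sum(w * g * lambda_l0(ell, x) * lambda_l0(ell_prime, x))

def gaussian_window(fwhm_deg):
    """Assumed window shape G(theta) = exp(-theta^2 / (2 sigma^2)) for a given FWHM in degrees."""
    sigma = np.radians(fwhm_deg) / np.sqrt(8.0 * np.log(2.0))
    return lambda theta: np.exp(-theta**2 / (2.0 * sigma**2))
```

With a window equal to one over the whole sphere, `h_m0` reduces to the orthonormality relation of the spherical harmonics, which gives a simple sanity check before a real window is inserted.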
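Using this expression for the $\tilde a_{\ell m}$ one can now find the correlation matrix
$$\langle\tilde C_\ell\tilde C_{\ell'}\rangle = \sum_{mm'}\frac{\langle\tilde a_{\ell m}\,\tilde a^*_{\ell m}\,\tilde a_{\ell'm'}\,\tilde a^*_{\ell'm'}\rangle}{(2\ell+1)(2\ell'+1)}. \tag{4.42}$$
In this expression one can use relation (4.37) to get
\begin{align}
\langle\tilde C_\ell\tilde C_{\ell'}\rangle &= \frac{1}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\sum_{LL'}\sum_{KK'}\langle a_{Lm}\,a^*_{L'm}\,a_{Km'}\,a^*_{K'm'}\rangle \tag{4.43}\\
&\quad\times h(\ell,L,m)\,h(\ell,L',m)\,h(\ell',K,m')\,h(\ell',K',m') \tag{4.44}\\
&= \frac{1}{(2\ell+1)(2\ell'+1)} \tag{4.45}\\
&\quad\times\sum_{mm'}\sum_{LL'}\sum_{KK'}\big[\langle a_{Lm}\,a^*_{L'm}\rangle\langle a_{Km'}\,a^*_{K'm'}\rangle \tag{4.46}\\
&\qquad + \langle a_{Lm}\,a_{Km'}\rangle\langle a^*_{L'm}\,a^*_{K'm'}\rangle \tag{4.47}\\
&\qquad + \langle a_{Lm}\,a^*_{K'm'}\rangle\langle a^*_{L'm}\,a_{Km'}\rangle\big] \tag{4.48}\\
&\quad\times h(\ell,L,m)\,h(\ell,L',m)\,h(\ell',K,m')\,h(\ell',K',m'). \tag{4.49}
\end{align}
Clearly the first term is just the product $\langle\tilde C_\ell\rangle\langle\tilde C_{\ell'}\rangle$, and the two last terms
are equal (using $a_{Km'} = a^*_{K,-m'}(-1)^{m'}$ and $a^*_{K'm'} = (-1)^{m'}a_{K',-m'}$) so one gets
$$M_{ij} = \frac{2}{(2\ell_i+1)(2\ell_j+1)}\sum_m\left[\sum_L C_L\,h(\ell_i,L,m)\,h(\ell_j,L,m)\right]^2. \tag{4.50}$$
This is one of the main results of this thesis since the formula allows one to
analytically calculate the correlation matrix needed for likelihood analysis. Another
main result is the recurrence deduced in appendix (D) which allows fast evaluation
of the $h(\ell,\ell',m)$ functions and thereby this correlation matrix.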
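Once the $h$ functions are tabulated, formula (4.50) reduces to a few array contractions. The following sketch assumes NumPy and a precomputed array `h[i, L, k]` holding $h(\ell_i, L, m_k)$; the function name, the array layout and the use of `einsum` are choices made for this example and are not taken from the thesis.

```python
import numpy as np

def signal_covariance(ells, c_full, h):
    """Signal covariance M_ij of eq. (4.50).

    ells   : (n_in,) multipoles l_i of the data vector
    c_full : (n_L,) full sky C_L, indexed by L
    h      : (n_in, n_L, n_m) array with h[i, L, k] = h(l_i, L, m_k)
    """
    ells = np.asarray(ells, dtype=float)
    c_full = np.asarray(c_full, dtype=float)
    # inner[i, j, k] = sum_L C_L h(l_i, L, m_k) h(l_j, L, m_k)
    inner = np.einsum("l,ilk,jlk->ijk", c_full, h, h)
    norm = 2.0 / np.outer(2.0 * ells + 1.0, 2.0 * ells + 1.0)
    # Square the bracket for each m and sum over m.
    return norm * np.sum(inner**2, axis=2)
```

For a full sky window, where $h(\ell, L, m) = \delta_{\ell L}$ for $|m| \le \ell$, this expression reduces to the familiar cosmic variance $2C_\ell^2/(2\ell+1)$ on the diagonal, which is a convenient check of an implementation.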

In Figure (4.12) one can see the correlation matrix for a typical power spectrum
with a $15^\circ$ FWHM Gaussian Gabor window (note that in the figure, the
correlation matrix is normalised with the pseudo power spectrum). The correlation
between different $\tilde C_\ell$ is clearly small as the diagonal is quite sharp. There is
only the small wall at low multipoles which again comes from the coupling to
the smallest multipoles which have very high values.
Figure 4.12: The figure shows the correlation matrix $M_{\ell\ell'}$ between pseudo spectrum
coefficients normalised with the pseudo spectrum,
$(\langle\tilde C^T_\ell\tilde C^T_{\ell'}\rangle - \langle\tilde C^T_\ell\rangle\langle\tilde C^T_{\ell'}\rangle)/(\langle\tilde C^T_\ell\rangle\langle\tilde C^T_{\ell'}\rangle)$,
for a 15 degree FWHM Gaussian Gabor window. A
standard CDM power spectrum was used to produce this matrix.
4.2.3 Including White Noise
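I now assume that each pixel $j$ has a noise temperature denoted by $n_j$, with the
following properties,
$$\langle n_j\rangle = 0,\qquad \langle n_j n_{j'}\rangle = \delta_{jj'}\,\sigma_j^2, \tag{4.51}$$
where $\sigma_j^2$ is the noise variance in pixel $j$. Then one has the following expressions
for the $a_{\ell m}$ and $C_\ell$ (I use superscript $N$ for noise quantities),
\begin{align}
a^N_{\ell m} &= \sum_j n_j\,Y^{j*}_{\ell m}\,\Delta\Omega_j \tag{4.52}\\
\langle a^N_{\ell m}\,a^{N*}_{\ell'm'}\rangle &= \sum_{jj'}\langle n_j n_{j'}\rangle\,Y^{j*}_{\ell m}\,Y^{j'}_{\ell'm'}\,\Delta\Omega_j\,\Delta\Omega_{j'} = \sum_j \sigma_j^2\,Y^{j*}_{\ell m}\,Y^{j}_{\ell'm'}\,\Delta\Omega_j^2 \tag{4.53}\\
\langle C^N_\ell\rangle &= \frac{1}{2\ell+1}\sum_m\langle a^N_{\ell m}\,a^{N*}_{\ell m}\rangle = \frac{1}{4\pi}\sum_j \sigma_j^2\,\Delta\Omega_j^2. \tag{4.54}
\end{align}
Here $Y^j_{\ell m}$ is the spherical harmonic evaluated at the centre of pixel $j$. For the
windowed coefficients, one gets similarly,
\begin{align}
\tilde a^N_{\ell m} &= \sum_j G_j\,n_j\,\Delta\Omega_j\,Y^{j*}_{\ell m} \tag{4.55}\\
\langle\tilde C^N_\ell\rangle &= \frac{1}{4\pi}\sum_j G_j^2\,\sigma_j^2\,\Delta\Omega_j^2. \tag{4.56}
\end{align}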
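Equation (4.56) is a plain sum over pixels and does not depend on $\ell$. A minimal sketch in pure Python is given below; the function name and the use of flat per-pixel sequences for the window, the noise rms and the pixel area are assumptions made for the illustration.

```python
import math

def noise_pseudo_cl(window, sigma, pixel_area):
    """<C~^N_l> = (1 / 4 pi) * sum_j G_j^2 sigma_j^2 dOmega_j^2 (eq. 4.56); the same for every l."""
    total = sum((g * s * a) ** 2 for g, s, a in zip(window, sigma, pixel_area))
    return total / (4.0 * math.pi)
```

For uniform noise $\sigma$, a window equal to one, and $N_{\rm pix}$ equal area pixels covering the sphere, the sum collapses to $\sigma^2\Delta\Omega = 4\pi\sigma^2/N_{\rm pix}$, the usual white noise level. Because the window enters as $G_j^2$ next to $\sigma_j^2$, a window that is small where the noise is large lowers the noise pseudo spectrum.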
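The next step is to find the noise correlation matrix,
\begin{align}
\langle\tilde C^N_\ell\tilde C^N_{\ell'}\rangle &= \frac{1}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\sum_{jj'}\sum_{kk'}\Delta\Omega_j\,\Delta\Omega_{j'}\,\Delta\Omega_k\,\Delta\Omega_{k'} \tag{4.57}\\
&\quad\times\langle n_j n_{j'} n_k n_{k'}\rangle\,G_j G_{j'} G_k G_{k'}\,Y^{j*}_{\ell m}\,Y^{j'}_{\ell m}\,Y^{k*}_{\ell'm'}\,Y^{k'}_{\ell'm'} \tag{4.58}\\
&= \langle\tilde C^N_\ell\rangle\langle\tilde C^N_{\ell'}\rangle + M^N_{\ell\ell'}, \tag{4.59}
\end{align}
where $M^N_{\ell\ell'}$ can be written as
$$M^N_{\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\left|\sum_j G_j^2\,\sigma_j^2\,\Delta\Omega_j^2\,Y^{j*}_{\ell m}\,Y^{j}_{\ell'm'}\right|^2. \tag{4.60}$$
For pixelisation schemes like HEALPix, the expression can be evaluated fast using
FFT. This is apparent when one writes the sum over pixels as a double sum over
rings and pixels per ring,
$$\sum_j G_j^2\,\sigma_j^2\,\Delta\Omega_j^2\,Y^{j*}_{\ell m}\,Y^{j}_{\ell'm'} = \sum_r \lambda^r_{\ell m}\,\lambda^r_{\ell'm'}\sum_{p=0}^{N_r-1} e^{-2\pi i p(m-m')/N_r}\,\Delta\Omega^2\,G_r^2\,\sigma_{r,p}^2. \tag{4.61}$$
In the case of an axisymmetric noise model, this expression becomes even simpler,
which is apparent when writing it as
$$\sum_r \lambda^r_{\ell m}\,\lambda^r_{\ell'm'}\,\Delta\Omega^2\,G_r^2\,\sigma_r^2\underbrace{\sum_{p=0}^{N_r-1} e^{-2\pi i p(m-m')/N_r}}_{N_r\delta_{mm'}} = \delta_{mm'}\sum_r N_r\,\lambda^r_{\ell m}\,\lambda^r_{\ell'm}\,\Delta\Omega\underbrace{G_r^2\,\sigma_r^2\,\Delta\Omega}_{G'_r} \equiv \delta_{mm'}\,h'(\ell,\ell',m). \tag{4.62}$$
The sum is equivalent to the previous expression for $h(\ell,\ell',m)$ (equation (4.41))
with a new window $G'_r$. This motivates the definition of $h'(\ell,\ell',m,m')$ such that
$$M^N_{\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\left|h'(\ell,\ell',m,m')\right|^2, \tag{4.63}$$
where
$$h'(\ell,\ell',m,m') \equiv \sum_j G'_j\,Y^{j*}_{\ell m}\,Y^{j}_{\ell'm'}, \tag{4.64}$$
where $G'_j = G_r^2\,\sigma_{r,p}^2\,\Delta\Omega^2$ for the discrete case. These functions can also be calculated
using the recursion which I deduce in appendix (D).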
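One can then find the total correlation matrix, splitting it up into one part
due to signal, one part due to noise and a cross term,
\begin{align}
\tilde a_{\ell m} &= \tilde a^S_{\ell m} + \tilde a^N_{\ell m} \tag{4.65}\\
\tilde C_\ell &= \frac{1}{2\ell+1}\sum_m \tilde a_{\ell m}\,\tilde a^*_{\ell m} = \tilde C^S_\ell + \tilde C^N_\ell + \tilde C^X_\ell \tag{4.66}\\
\langle\tilde C_\ell\rangle &= \langle\tilde C^S_\ell\rangle + \langle\tilde C^N_\ell\rangle \tag{4.67}\\
\tilde C^X_\ell &= \frac{1}{2\ell+1}\sum_m\left(\tilde a^S_{\ell m}\,\tilde a^{N*}_{\ell m} + \tilde a^N_{\ell m}\,\tilde a^{S*}_{\ell m}\right), \tag{4.68}
\end{align}
where the assumption that there is no correlation between signal and noise was
used. One can then see that the correlation matrix can be written in a similar
manner,
$$\langle\tilde C_{\ell_i}\tilde C_{\ell_j}\rangle - \langle\tilde C_{\ell_i}\rangle\langle\tilde C_{\ell_j}\rangle = M^S_{ij} + M^N_{ij} + \langle\tilde C^X_{\ell_i}\tilde C^X_{\ell_j}\rangle. \tag{4.69}$$
This is another major result of this thesis, showing the full correlation matrix of
$\tilde C_\ell$ including noise. One can write the cross term as
\begin{align}
\langle\tilde C^X_\ell\tilde C^X_{\ell'}\rangle &= 4\sum_{mm'}\frac{\langle\tilde a^S_{\ell m}\,\tilde a^{S*}_{\ell'm'}\rangle\langle\tilde a^N_{\ell m}\,\tilde a^{N*}_{\ell'm'}\rangle}{(2\ell+1)(2\ell'+1)} \tag{4.70}\\
&= 4\sum_m\frac{\langle\tilde a^S_{\ell m}\,\tilde a^{S*}_{\ell'm}\rangle\langle\tilde a^N_{\ell m}\,\tilde a^{N*}_{\ell'm}\rangle}{(2\ell+1)(2\ell'+1)}, \tag{4.71}
\end{align}
where the relation $\langle\tilde a^S_{\ell m}\,\tilde a^{S*}_{\ell'm'}\rangle = \delta_{mm'}\langle\tilde a^S_{\ell m}\,\tilde a^{S*}_{\ell'm}\rangle$ was used. From the above,
one can see that these two factors can be written as
\begin{align}
\langle\tilde a^S_{\ell m}\,\tilde a^{S*}_{\ell'm}\rangle &= \sum_{\ell''} C_{\ell''}\,h(\ell,\ell'',m)\,h(\ell',\ell'',m), \tag{4.72}\\
\langle\tilde a^N_{\ell m}\,\tilde a^{N*}_{\ell'm}\rangle &= \sum_i G_i^2\,Y^{i*}_{\ell m}\,Y^{i}_{\ell'm}\,\sigma_i^2\,\Delta\Omega_i^2 \tag{4.73}\\
&= h'(\ell,\ell',m). \tag{4.74}
\end{align}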
4.2.4 Likelihood Estimation and Results
The expressions for the covariance matrix and mean pseudo-$C_\ell$ have now been
found. Because of the limited information content in one patch of the sky one
cannot estimate the full sky $C_\ell$ for all multipoles $\ell$. For this reason the full sky
power spectrum has to be estimated in $N^{\rm bin}$ bins. Also the algorithm to minimize
the log-likelihood needs the numbers to be estimated to be of roughly the same
order of magnitude. For this reason I estimate for some parameters $D_b$ which for
bin $b$ is defined as
$$C_\ell = \frac{D_b}{\ell(\ell+1)},\qquad \ell_b \le \ell < \ell_{b+1}, \tag{4.75}$$
where $\ell_b$ is the first multipole in bin $b$. Using this binning, one can calculate the
likelihood significantly faster by writing the signal correlation matrix equation
(4.50) as
$$M^S_{ij} = \sum_b\sum_{b'} D_b\,D_{b'}\,\chi(b,b',i,j), \tag{4.76}$$
where $\chi(b,b',i,j)$ is given as
$$\chi(b,b',i,j) \equiv \frac{2}{(2\ell_i+1)(2\ell_j+1)}\sum_m\left[\sum_{\ell\in b}\frac{B_\ell^2}{\ell(\ell+1)}\,h(\ell,\ell_i,m)\,h(\ell,\ell_j,m)\right]\left[\sum_{\ell'\in b'}\frac{B_{\ell'}^2}{\ell'(\ell'+1)}\,h(\ell',\ell_i,m)\,h(\ell',\ell_j,m)\right], \tag{4.77}$$
which is precomputed. The sums over $\ell$ here go over the $\ell$ values in each specific
bin $b$. One sees that computing the likelihood takes of the order $(N^{\rm bin})^2(N^{\rm in})^2$
operations whereas the precomputation of the factor $\chi(b,b',i,j)$ goes as $(N^{\rm bin})^2(N^{\rm in})^2N^m$
where $N^m$ is the number of $m$ values used. Note that the multipole coefficients
of the beam $B_\ell$ are also included. The reason is that the input data is always
affected by the beam and this is corrected for by using the beam convolved full
sky power spectrum $C_\ell B_\ell^2$.
The sum over $m$ in the expressions for the covariance matrix and $\langle\tilde C_\ell\rangle$ can
be limited. The h-functions are rapidly decreasing for increasing $m$ for Gaussian
and tophat windows. For Gaussian Gabor windows it seems that one can cut the
sums over $m$ at 200 to high accuracy. For top-hat windows, the sum should be
extended to $m = 400$.
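The binning of equation (4.75) maps a short vector of bin values $D_b$ onto a full sky $C_\ell$. The sketch below shows one way this mapping might be written in pure Python; the function name, the dictionary return type and the convention that the last bin runs up to `ell_max` are assumptions for the example.

```python
def bins_to_cl(d_bins, bin_starts, ell_max):
    """Expand bin values D_b into C_l = D_b / (l (l + 1)) for l_b <= l < l_{b+1} (eq. 4.75).

    d_bins     : bin values D_b
    bin_starts : first multipole l_b of each bin (increasing, l_b >= 1)
    ell_max    : last multipole covered by the final bin
    """
    if len(d_bins) != len(bin_starts):
        raise ValueError("need one starting multipole per bin")
    edges = list(bin_starts) + [ell_max + 1]
    cl = {}
    for b, d_b in enumerate(d_bins):
        for ell in range(edges[b], edges[b + 1]):
            cl[ell] = d_b / (ell * (ell + 1))
    return cl
```

For example, two bins starting at $\ell = 2$ and $\ell = 4$ with $D_b = 6$ and $D_b = 12$ and `ell_max = 5` give $C_2 = 1$, $C_3 = 0.5$, $C_4 = 0.6$ and $C_5 = 0.4$. Because $C_\ell$ is linear in the $D_b$, the mean $\langle\tilde C_\ell\rangle$ is linear and the signal covariance is quadratic in the bin values, which is what makes the precomputation in equation (4.76) possible.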
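The noise cross term can be written in a similar way as
$$M^X_{ij} \equiv \langle\tilde C^X_{\ell_i}\tilde C^X_{\ell_j}\rangle = \sum_b D_b\,\chi'(b,i,j), \tag{4.78}$$
where
$$\chi'(b,i,j) \equiv \frac{2}{(2\ell_i+1)(2\ell_j+1)}\sum_m\left[\sum_{\ell\in b}\frac{B_\ell^2}{\ell(\ell+1)}\,h(\ell,\ell_i,m)\,h(\ell,\ell_j,m)\right]h'(\ell_i,\ell_j,m). \tag{4.79}$$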
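In the minimization of the likelihood, one also needs the first and second
derivatives of the log-likelihood with respect to the bin values $D_b$. Writing $C$ for
the covariance matrix of the $\tilde C_\ell$, these can be found to be
\begin{align}
\frac{\partial L}{\partial D_b} &= {\rm Tr}(A_b) - f^T h_b + 2\frac{\partial d^T}{\partial D_b}f \tag{4.80}\\
\frac{\partial^2 L}{\partial D_b\,\partial D_{b'}} &= -{\rm Tr}(A_b A_{b'}) + {\rm Tr}\left(C^{-1}\frac{\partial^2 C}{\partial D_b\,\partial D_{b'}}\right) + 2h_b^T C^{-1} h_{b'} \tag{4.81}\\
&\quad - 2h_b^T g_{b'} - 2h_{b'}^T g_b - f^T\frac{\partial^2 C}{\partial D_b\,\partial D_{b'}}f + 2f^T k_{bb'} + 2\frac{\partial d^T}{\partial D_b}g_{b'}. \tag{4.82}
\end{align}
I have used the following definitions,
\begin{align}
A_b &= C^{-1}\frac{\partial C}{\partial D_b} \tag{4.83}\\
h_b &= \frac{\partial C}{\partial D_b}C^{-1}d \tag{4.84}\\
g_b &= C^{-1}\frac{\partial d}{\partial D_b} \tag{4.85}\\
f &= C^{-1}d \tag{4.86}\\
k_{bb'} &= \frac{\partial^2 d}{\partial D_b\,\partial D_{b'}}. \tag{4.87}
\end{align}
Here the derivatives of $d$ are
\begin{align}
\frac{\partial d_i}{\partial D_b} &= -\frac{\partial\langle\tilde C_{\ell_i}\rangle}{\partial D_b} = -\frac{1}{2\ell_i+1}\sum_m\left[\sum_{L'} h(\ell_i,L',m)^2\,\frac{\partial C_{L'}}{\partial D_b}\,B_{L'}^2\right], \tag{4.88}\\
\frac{\partial^2 d_i}{\partial D_b\,\partial D_{b'}} &= -\frac{\partial^2\langle\tilde C_{\ell_i}\rangle}{\partial D_b\,\partial D_{b'}} = -\frac{1}{2\ell_i+1}\sum_m\left[\sum_{L'} h(\ell_i,L',m)^2\,\frac{\partial^2 C_{L'}}{\partial D_b\,\partial D_{b'}}\,B_{L'}^2\right]. \tag{4.89}
\end{align}
Obviously for this binning, the double derivative of $d$ disappears.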
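The first derivative (4.80) follows from differentiating $L = d^T C^{-1} d + \ln\det C$ and can be checked against a finite difference. The sketch below does this for a single parameter; it assumes NumPy, and the function names as well as the toy model used to exercise it are hypothetical and serve only as an illustration.

```python
import numpy as np

def log_like(d, cov):
    """L = d^T C^{-1} d + ln det C."""
    _, log_det = np.linalg.slogdet(cov)
    return float(d @ np.linalg.solve(cov, d) + log_det)

def gradient(d, cov, d_cov, d_d):
    """dL/dD_b = Tr(A_b) - f^T h_b + 2 (dd/dD_b)^T f, cf. eq. (4.80).

    d_cov : derivative of the covariance matrix with respect to D_b
    d_d   : derivative of the data vector with respect to D_b
    """
    f = np.linalg.solve(cov, d)           # f = C^{-1} d
    a_b = np.linalg.solve(cov, d_cov)     # A_b = C^{-1} dC/dD_b
    h_b = d_cov @ f                       # h_b = (dC/dD_b) C^{-1} d
    return float(np.trace(a_b) - f @ h_b + 2.0 * d_d @ f)
```

A quick check is to take a toy model with $C(t) = C_0 + tC_1$ and $d(t) = d_0 - tv$, and to compare `gradient` with the central difference of `log_like` in $t$; the two should agree to within the accuracy of the finite difference.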
Since the $\tilde C_\ell$ are coupled, one cannot use all multipoles in the datavector;
the covariance matrix would in this case become singular. One has to choose a
number $N^{\rm in}$ of multipoles $\ell_i$ for which one finds $d_i$. How many multipoles to use
depends on how tightly the $\tilde C_\ell$ are coupled, which depends on the size of the
kernel (figure (4.5)). I found that $N^{\rm in} \approx 2/3\,\ell_{\rm max}/\Delta\ell$ seems to be optimal. To
use a lower $N^{\rm in}$ increases the error bars on the estimates and a higher $N^{\rm in}$ will
not improve the estimates. One can at most fit for as many $C_\ell$s as the number of
$\tilde C_\ell$ ($N^{\rm in}$) one has used in the analysis. So one needs to find a number $N^{\rm bin} \le N^{\rm in}$
of bin values $D_b$ from which one can construct the full sky $C_\ell$.
I will now describe some test simulations to see how the method works. As
a first test, I used the same model as was used in (Hivon et al. 2001), with
$\Omega_{\rm total} = 1$, $\Omega_\Lambda = 0.7$, $\Omega_b h^2 = 0.03$ and $n_s = 0.975$. These are the parameters from
the combined Maxima-Boomerang analysis (Jaffe et al. 2001). I used a circular
patch with $15.5^\circ$ radius covering the same fraction of the sky as in (Hivon et al.
2001). Using HEALPix I simulated a sky using a standard CDM power spectrum
with $\ell_{\rm max} = 1024$ and a $7'$ pixel size ($N_{\rm side} = 512$ in HEALPix language). I
smoothed the map with a $10'$ beam and added non-correlated non-uniform noise
to it. Here a Gaussian Gabor window with FWHM $= 12^\circ$ was used with a
cut-off $\theta_C = 3\sigma$. For the likelihood estimation, I had $N^{\rm bin} = 20$ full sky $C_\ell$ bins
and $N^{\rm in} = 100$ $\tilde C_\ell$ values between $\ell = 2$ and $\ell = 960$. In figure (4.13) one can
see the result. The shaded areas are the expected $1\sigma$ variance with and without
noise. These were found from the approximate theoretical formula for a uniform
noise model (using the formula in (Hivon et al. 2001)). The formula is similar to
the one used in most publications and is in this case a very good approximation
even with non-uniform noise. In the next example however I will show that the
formula has to be used with care. In the figure, the error bars on the estimates
are taken from the Fisher matrix and the signal-to-noise ratio S/N = 1 at $\ell = 575$.
Figure 4.13: The analysis of an input model with $\Omega_{\rm total} = 1$, $\Omega_\Lambda = 0.7$, $\Omega_b h^2 = 0.03$
and $n_s = 0.975$. I used a non-uniform white noise model with S/N =
1 at $\ell = 575$. The dotted line is the input average full sky power spectrum
and the histogram shows the binned pseudo power spectrum for this realisation
(without noise). I used $N^{\rm bin} = 20$ bins and $N^{\rm in} = 100$ input sample points to the
likelihood. The shaded areas about the binned average full sky spectrum (which
is not plotted) are the theoretical variance with and without noise. The bright
shaded area shows cosmic and sample variance and the dark shaded area also has
variance due to noise included. The variance due to noise was found from the
approximate formula for uniform noise. The $1\sigma$ error bars on the estimates are
taken from the inverse Fisher matrix. The solid line increasing from the left to
the right is the noise power spectrum.
In figure (4.14), I have plotted the average of 1000 simulations, with different
noise and sky realisations. From the plot, the method seems to give an unbiased
estimate of the power spectrum bins $D_b$. For the lowest multipoles the estimates
are slightly lower than the binned input spectrum. This is a result of the slightly
skewed probability distribution of $\tilde C_\ell$ for small windows at these low multipoles
(see figures (4.9) and (4.10)). The probability that the $\tilde C_\ell$ at lower multipoles
have a value lower than the average $\langle\tilde C_\ell\rangle$ is high, and the assumption of a
Gaussian distribution about this average leads the estimates to be lower. When
a bigger area of the sky is available such that several patches can be analysed
jointly to give the full sky power spectrum, this bias disappears. This will be
shown in section (4.3.1).
In this example one can see that the $1\sigma$ error bars from Monte Carlo coincide
very well with the theoretical error shown as shaded areas from the formula in
(Hivon et al. 2001). Note that the error bars on the higher $\ell$ are smaller than in
(Hivon et al. 2001) because the noise model used in that paper was not white.
Also they took into account errors due to map making, which are not considered
here.
Figure 4.14: Same as in figure (4.13) but this is an average over 1000 simulations
and estimations. The histogram is now the binned average full sky power spectrum.
The error bars on this plot are the $1\sigma$ variances taken from Monte-Carlo.
Figure 4.15: The plots show average signal and noise pseudo power spectra plotted
separately. The spectra are normalised so that they can be compared to the
full sky power spectrum. The solid and dashed curves which almost fall together
are the signal pseudo power spectra for a 15 degree FWHM Gaussian Gabor window
and a corresponding tophat window respectively. In the upper plot the noise
model shown in figure (4.16) was used. This noise model is increasing from the
north pole and down to the edges of the patch. This is opposite of the Gaussian
window and for this reason the Gaussian window gets a higher signal to noise ratio.
The solid horizontal line in the upper plot shows the noise pseudo power spectrum
for the Gaussian window and the dashed horizontal line shows the noise pseudo
power spectrum for the tophat window. In the lower plot a uniform noise model
was used so that the noise pseudo power spectra fall together and are shown as
a solid vertical line. The figure shows how a Gaussian window can be used to
increase signal to noise.
As a next test, I used a simulation with the same resolution and beam size.
The power spectrum was this time a standard CDM power spectrum. I used an
axisymmetric noise model with noise increasing from the centre and outwards
to the edges (see figure (4.16)). This is the kind of noise model which could be
expected from an experiment scanning on rings, with the rings crossing in the
center. I now use a circular patch with $18.5^\circ$ radius and a FWHM $= 15^\circ$ Gaussian
Gabor window cut at $\theta_C = 3\sigma$ (radius $18^\circ$). An interesting point now
is that the Gabor window is decreasing from the center and outwards, which is
opposite of the noise pattern. This gives the pixels with low noise high significance
in the analysis and the pixels with high noise low significance. One sees from the
expressions for the signal and noise pseudo power spectra that the Gabor window
will work differently on both. This means that S/N is different depending on the
Gabor window. For this case, I have plotted the average pseudo power spectrum
for signal and noise separately in figure (4.15). This shows the described effect.
The S/N ratio is much higher for the Gaussian Gabor window in this case,
favoring the use of this window for the analysis.
Figure 4.16: The noise map with noise increasing from the north pole and downwards.
The figure shows a gnomonic projection with the north pole in the centre.
Figure 4.17: Same as figure (4.13) but for a standard CDM model. The noise
is increasing from the center and out to the edges while the Gaussian Gabor
window has the opposite effect, giving an increased significance to pixels with
less noise. As in figure (4.13) the shaded areas show the analytically calculated
variance using the naive formula for the uniform noise case. The dashed lines
show the expected variance using the inverse of the Fisher matrix.
Figure 4.18: Same as figure (4.17) but with 5000 simulations and estimations
averaged. The histogram now shows the binned average full sky power spectrum.
The error bars on the estimates are here the $1\sigma$ averages from Monte Carlo.
Again I used $N^{\rm bin} = 20$ and $N^{\rm in} = 100$. The result is shown in Figure (4.17).
In figure (4.18) the average over 5000 simulations and estimations is shown. One
can see that the estimate also does well beyond $\ell = 520$, which is where the
effective S/N = 1. The method is clearly still unbiased. The error bars in the
part where noise dominates are here lower than the theoretical approximation
for uniform noise shown as the dark shaded area. The dashed lines show the
theoretical $1\sigma$ variance taken from the inverse Fisher matrix, which here gives a
very good agreement with Monte Carlo.
Figure 4.19: The average correlation matrix $N(b,b') \equiv \langle D_b D_{b'}\rangle/(\langle D_b\rangle\langle D_{b'}\rangle) - 1$
of the estimates in figure (4.18). The negative elements are bright
coloured.
In figure (4.19) I show the average (over 5000 estimations) of the correlation
between the estimates $D_b$ in different bins. The figure shows that the
correlations between estimates are low and in fact in each line all off-diagonal
elements are more than an order of magnitude lower than the diagonal element
of that line. In Figure (4.20) I show that the estimates in Figure (4.18) are
almost Gaussian distributed.
Figure 4.20: The probability distribution of the estimates $D_b$ in figure 4.18. The
variable $x$ is given as $x = (D_b - \langle D_b\rangle)/\sqrt{\langle(D_b - \langle D_b\rangle)^2\rangle}$. The dashed
line is a Gaussian with mean and standard deviation taken from Monte-Carlo.
The plot shows bin estimates centered at $\ell = 225$, $\ell = 425$, $\ell = 525$ and $\ell = 725$.
To test the method at higher multipoles I also did one estimation up to
multipole $\ell = 2048$. I used HEALPix resolution $N_{\rm side} = 1024$ and simulated a sky
with an $8'$ Gaussian beam and added non-uniform noise. Both the beam and noise
level were adjusted according to the specifications for the Planck HFI 143GHz
detector (Bersanelli et al. 1996). I used again a $15^\circ$ FWHM Gaussian Gabor
window cut at $3\sigma$ away from the centre. In the estimation I used $N^{\rm bin} = 40$ bins
and $N^{\rm in} = 200$ input $\tilde C_\ell$ between $\ell = 7$ and $\ell = 2048$. The average of 100 such
simulations is shown in figure (4.21). Each complete likelihood estimation (which
includes a total of about 25 likelihood evaluations) took about 8 minutes on a
single processor on a 500MHz DEC Alpha work station.
Figure 4.21: Same as figure (4.14) for 100 simulations where the beam and noise
level were set according to the specifications of the Planck 143GHz channel. Again
a $15^\circ$ FWHM Gaussian Gabor window was used. The power spectrum was
estimated in 40 bins between $\ell = 2$ and $\ell = 2048$.
Figure 4.22: The average of 300 estimations where the input $\tilde C_\ell$ were taken from
simulations with a fixed CMB realisation but varying noise realisations. The
dotted line is the $N^{\rm in}$ input $\tilde C_\ell$ from the CMB realisation without noise. The
histogram is the same spectrum binned in $N^{\rm bin}$ bins. The dashed line is the
binned average full sky power spectrum from which this realisation was made.
The shaded areas around the binned full sky spectrum show the variance with
(dark) and without (bright) noise. The solid line rising from the left to the right
is the noise power spectrum.
In Figure (4.22), I have plotted the average of 300 estimations. The data
input in these 300 estimations were the $\tilde C_\ell$ from simulations with a fixed CMB
realisation and varying noise realisation. The dotted line shows the $N^{\rm in}$ $\tilde C_\ell$s used
as input to the likelihood but without the noise. The histogram is, as before, the
input pseudo spectrum without noise binned in $N^{\rm bin}$ bins. This means that each
histogram line shows the average of the dotted line over the bin. One can see
that the result is partly following the $N^{\rm in}$ input $\tilde C_\ell$ and partly the binned power
spectrum.
Finally, I made a comparison between a tophat window and a Gaussian Gabor
window. In this case I used uniform noise, so that the Gaussian and tophat
Gabor windows have the same S/N ratio, which I set to 1 at $\ell = 520$. I used a
disc with $18^\circ$ radius, $N^{\rm in} = 200$ and $N^{\rm bin} = 20$. In Figure (4.23) one can see
the result. The lower plot shows the estimates with the Gaussian Gabor window
($15^\circ$ FWHM) and the upper with the tophat window. The Gaussian window is
suppressing parts of the data and for this reason gets a higher sample variance
than the tophat. This effect is seen in the plot. Clearly when no noise weighting
is required the tophat window seems to be the preferred window (which was
also discussed in (Hivon et al. 2001)). This chapter has been concentrating on
the Gaussian window to study power spectrum estimation in the presence of a
window different from a tophat. It has been shown that a different window can
be advantageous when the noise is not uniformly distributed, as one can then give
data of different quality different significance.
Figure 4.23: Estimates of $\tilde C_\ell$ using a Gaussian Gabor window (lower plot) and a
tophat window (upper plot). Here I used a uniform noise model with S/N = 1
at $\ell = 520$. The dotted line shows the average full sky power spectrum and
the histogram shows the input pseudo power spectrum without noise for this
realisation, binned in the same way as the estimates. The bright shaded area
shows the cosmic and sample variance around the binned average spectrum (not
plotted). The dark shaded area has the variance due to noise included. The $1\sigma$
error bars on the estimates are taken from the inverse Fisher matrix. The solid
line increasing from left to right is the noise power spectrum.
4.3 Extensions of the Method
In this section I will discuss two possible extensions of the method. The
formalism for the extensions is worked out and some simple examples are shown.
Further investigations of these extensions are left for future work.
4.3.1 Multiple Patches
It has been shown how one can do power spectrum estimation on one axisymmetric
patch on the sky. The next question that arises is what to do when the
observed area on the sky is not axisymmetric. In this case one can split the area
into several axisymmetric pieces and use the $\tilde C_\ell$ from each piece. Then these
$\tilde C_\ell$ from all the patches are used together in the likelihood maximization. The first
thing to check before embarking on this idea is the correlation between $\tilde C_\ell$s in
different patches.
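Suppose one has two axisymmetric Gabor windows, $G_A(\theta)$ and $G_B(\theta)$, centered
at two different positions A and B on the sky. Suppose also that the
rotation operators $\hat D_A$ and $\hat D_B$ will rotate these patches so that the centers are
on the north pole. Considering patch A, one can define
$$\tilde a^A_{\ell m} = \int d\Omega\,G^A_0(\Omega)\left[\hat D_A T(\Omega)\right]Y^*_{\ell m}(\Omega), \tag{4.90}$$
where $G^A_0$ is the window $G_A$ rotated to the north pole. Since $T(\Omega) = \sum_{\ell m} a_{\ell m}Y_{\ell m}(\Omega)$,
one gets that
$$\hat D_A T(\Omega) = \sum_{\ell m} a_{\ell m}\sum_{m'} D^{\ell A}_{m'm}\,Y_{\ell m'}(\Omega). \tag{4.91}$$
Here the $D^{\ell}_{m'm}$ coefficients are described in appendix (A). One now gets
\begin{align}
\tilde a^A_{\ell m} &= \sum_{\ell'm'} a_{\ell'm'}\sum_{m''} D^{\ell'A}_{m''m'}\,h_A(\ell,\ell',m)\,\delta_{m''m} \tag{4.92}\\
&= \sum_{\ell'm'} a_{\ell'm'}\,D^{\ell'A}_{mm'}\,h_A(\ell,\ell',m), \tag{4.93}
\end{align}
where $h_A(\ell,\ell',m)$ is just the $h(\ell,\ell',m)$ function for the Gabor window $G_A(\theta)$.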
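The next step is to find the correlations between $\tilde C^A_\ell$ and $\tilde C^B_\ell$, defined for
patch A as
$$\tilde C^A_\ell \equiv \sum_m\frac{\tilde a^A_{\ell m}\,\tilde a^{A*}_{\ell m}}{2\ell+1}. \tag{4.94}$$
Following the procedure I used for a single patch one gets
$$\begin{aligned}
\langle\tilde C^A_\ell\tilde C^B_{\ell'}\rangle &= \frac{1}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\langle\tilde a^A_{\ell m}\,\tilde a^{A*}_{\ell m}\,\tilde a^B_{\ell'm'}\,\tilde a^{B*}_{\ell'm'}\rangle\\
&= \langle\tilde C^A_\ell\rangle\langle\tilde C^B_{\ell'}\rangle + \frac{2\sum_{mm'}\left|\langle\tilde a^A_{\ell m}\,\tilde a^{B*}_{\ell'm'}\rangle\right|^2}{(2\ell+1)(2\ell'+1)}.
\end{aligned}\tag{4.95}$$
One can use the expression for $\tilde a^A_{\ell m}$ to find
\begin{align}
\langle\tilde a^A_{\ell m}\,\tilde a^{B*}_{\ell'm'}\rangle &= \sum_{\ell''m''}\sum_{L'M'}\langle a_{\ell''m''}\,a^*_{L'M'}\rangle\,D^{\ell''A}_{mm''}\,D^{L'B*}_{m'M'} \tag{4.96}\\
&\quad\times h_A(\ell,\ell'',m)\,h_B(\ell',L',m') \tag{4.97}\\
&= \sum_{\ell''} C_{\ell''}\underbrace{\sum_{m''} D^{\ell''A}_{mm''}\,D^{\ell''B*}_{m'm''}}_{D^{\ell''}_{mm'}(\Delta\theta)}\,h_A(\ell,\ell'',m)\,h_B(\ell',\ell'',m') \tag{4.98}\\
&= \sum_{\ell''} C_{\ell''}\,D^{\ell''}_{mm'}(\Delta\theta)\,h_A(\ell,\ell'',m)\,h_B(\ell',\ell'',m'), \tag{4.99}
\end{align}
where $\Delta\theta$ is the angle between the centers of the patches. Relations from appendix
(A) were used here.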
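The next step is to see what happens when noise is introduced. I assume that
the noise is uncorrelated. The noise in pixel $j$ is $n_j$ and $\langle n_j n_{j'}\rangle = \delta_{jj'}\sigma_j^2$. From
above one has
$$\tilde C^A_\ell = \tilde C^{AS}_\ell + \tilde C^{AN}_\ell + \tilde C^{AX}_\ell, \tag{4.100}$$
where
\begin{align}
\tilde C^{AN}_\ell &\equiv \sum_m\frac{\tilde a^{AN}_{\ell m}\,\tilde a^{AN*}_{\ell m}}{2\ell+1} \tag{4.101}\\
\tilde C^{AX}_\ell &\equiv \sum_m\frac{\tilde a^{AN}_{\ell m}\,\tilde a^{AS*}_{\ell m} + \tilde a^{AS}_{\ell m}\,\tilde a^{AN*}_{\ell m}}{2\ell+1} \tag{4.102}\\
\tilde a^{AN}_{\ell m} &= \sum_j G^A_j\,n^A_j\,Y^{j*}_{\ell m}, \tag{4.103}
\end{align}
where the last sum is over pixels, $G^A_j$ and $n^A_j$ being the window and noise for
pixel $j$ respectively.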
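The correlation between the two patches then becomes
$$\langle\tilde C^A_\ell\tilde C^B_{\ell'}\rangle - \langle\tilde C^A_\ell\rangle\langle\tilde C^B_{\ell'}\rangle = M^S_{\ell\ell'} + M^N_{\ell\ell'} + \langle\tilde C^{AX}_\ell\tilde C^{BX}_{\ell'}\rangle, \tag{4.104}$$
where
\begin{align}
M^S_{\ell\ell'} &= \frac{2}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\left|\langle\tilde a^{AS}_{\ell m}\,\tilde a^{BS*}_{\ell'm'}\rangle\right|^2 \tag{4.105}\\
M^N_{\ell\ell'} &= \frac{2}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\left|\langle\tilde a^{AN}_{\ell m}\,\tilde a^{BN*}_{\ell'm'}\rangle\right|^2. \tag{4.106}
\end{align}
Finally,
$$\langle\tilde C^{AX}_\ell\tilde C^{BX}_{\ell'}\rangle = \frac{4}{(2\ell+1)(2\ell'+1)}\sum_{mm'}\langle\tilde a^{AS}_{\ell m}\,\tilde a^{BS*}_{\ell'm'}\rangle\langle\tilde a^{AN}_{\ell m}\,\tilde a^{BN*}_{\ell'm'}\rangle. \tag{4.107}$$
Now one needs an expression for $\langle\tilde a^{AN}_{\ell m}\,\tilde a^{BN*}_{\ell'm'}\rangle$. One gets
$$\langle\tilde a^{AN}_{\ell m}\,\tilde a^{BN*}_{\ell'm'}\rangle = \sum_{jj'} G^A_j\,G^B_{j'}\,\langle n^A_j n^B_{j'}\rangle\,Y^{j*}_{\ell m}\,Y^{j'}_{\ell'm'}. \tag{4.108}$$
Here there are only correlations between overlapping pixels. If there are no
overlapping pixels between the patches, this term is zero. Otherwise this can be
written as a sum over the overlapping pixels
$$\langle\tilde a^{AN}_{\ell m}\,\tilde a^{BN*}_{\ell'm'}\rangle = \sum_j G^A_j\,G^B_j\,\sigma_j^2\,Y^{j*}_{\ell m}\,Y^{j}_{\ell'm'}. \tag{4.109}$$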
After the expression (4.95) was tested with Monte Carlo simulations, I computed the correlations between $\tilde C_\ell$ for two patches A and B where I varied the distance between the centers of A and B. I used a standard CDM power spectrum and both patches A and B had a radius of 18° apodised with a 15° FWHM Gaussian Gabor window. In figure (4.24) I have plotted the diagonal of the normalised correlation matrix $(\langle\tilde C^A_\ell\tilde C^B_\ell\rangle - \langle\tilde C_\ell\rangle^2)/\langle\tilde C_\ell\rangle^2$. The angles I used were 6°, 12°, 24°, 30°, 36° and 180°. One sees clearly how the correlations drop with the distance. In the two last cases, there were no common pixels in the patches. As one could expect, the correlations for the largest angular scales (the first few multipoles) do not drop that fast.
Figure 4.24: The correlation between $\tilde C_\ell$ for two patches A and B with an angular distance $\Delta\theta$ between the centers. A normal CDM power spectrum was used and the patches had an 18° radius apodised with a 15° FWHM Gaussian Gabor window. The figure shows the diagonal of the normalised correlation matrix $(\langle\tilde C^A_\ell\tilde C^B_\ell\rangle - \langle\tilde C_\ell\rangle^2)/\langle\tilde C_\ell\rangle^2$ where of course $\langle\tilde C_\ell\rangle = \langle\tilde C^A_\ell\rangle = \langle\tilde C^B_\ell\rangle$. The angles used are (from top to bottom on the figure) 0°, 6°, 12°, 18°, 30°, 36° and 180°.
In figure (4.25) I have plotted two slices through the correlation matrix of $\tilde C_\ell$ for a single patch at $\ell = 400$ and $\ell = 800$. On top of these I plotted the diagonals of the correlation matrices for separation angles of 30°, 36° and 180°. One sees that for the case where the patches do not have overlapping pixels, the whole diagonals have the same level as the far-off-diagonal elements in the 0° matrix. When doing power spectrum estimation on one patch, the result did not change significantly when these far-off-diagonal elements were set to zero. For this reason one expects that when simultaneously analysing several patches which do not overlap, the correlations between non-overlapping patches do not need to be taken into account. Note however that for the 30° separation, which means that there are only a few overlapping pixels, the approximation will not be that good as the level is orders of magnitude above the far-off-diagonals of the 0° matrix. Another thing to note is that for the lowest multipoles, the correlation between patches is still high but I will also assume this part to be zero and attempt a joint analysis of non-overlapping patches.
Figure 4.25: Cuts through the correlation matrices whose diagonals are shown in figure (4.24). The solid lines (thin and thick) show a cut through the 0° correlation matrix at $\ell = 400$ and $\ell = 800$ respectively. The dashed lines (thin and thick) show the diagonal of the correlation matrices for 36° and 30° separation respectively. The dotted line is the diagonal of the 180° matrix.
The full correlation matrices for 0 and 30 degree separation are shown in figure (4.26). The figures show how the diagonal is dropping relative to the far off-diagonal elements.
Figure 4.26: The figure shows the correlation matrices $M(\ell,\ell')$, normalised as $(\langle\tilde C^A_\ell\tilde C^B_{\ell'}\rangle - \langle\tilde C^A_\ell\rangle\langle\tilde C^B_{\ell'}\rangle)/(\langle\tilde C^A_\ell\rangle\langle\tilde C^B_{\ell'}\rangle)$, between pseudo spectrum coefficients for two patches A and B of 18° radius and with 0 degree (left plot) and 30 degree (right plot) separation. A Gaussian Gabor window with 15 degree FWHM was used. The aim of the plot is to show how correlations between $\tilde C_\ell$ from different patches drop when the distance between the two patches is about the FWHM of the Gaussian kernel.
In figure (4.27) the full correlation matrices for 36 and 180 degree separation are shown. For 36 degree separation one can see that the diagonal has almost disappeared with respect to the rest of the matrix whereas for 180 degrees the diagonal has vanished completely. But the wall at low multipoles remains.
Figure 4.27: This figure shows the same as figure (4.26) but for 36 and 180 degree separation of the patches.
In figure (4.28), I did a separate $C_\ell$ estimation on 146 non-overlapping patches with radius 18° apodised with a 15° FWHM Gaussian Gabor window. The patches were uniformly distributed over the sphere and uniform noise was added to the whole map. The figure shows the average of the 146 $C_\ell$ estimates. One can see that the estimate seems to be approaching the full sky power spectrum even at small multipoles.
Figure 4.28: The average of 146 individual power spectrum estimations of 146 non-overlapping patches on the same CMB map with uniform noise added to it. The patches all had radius 18 degrees apodised with a 15 degree FWHM Gaussian Gabor window. The histogram shows the binned average of the 146 $\tilde C_\ell$ from the different patches without noise. The dotted line is the average full sky power spectrum and the shaded areas around the binned full sky power spectrum (not plotted) show the theoretical $1\sigma$ variance with (dark) and without (bright) noise. The solid line rising from the left to the right is the noise power spectrum.
Figure 4.29: The result of a joint analysis of 146 patches on the same CMB sky. The solid line shows the average full sky power spectrum, the histogram shows the binned full sky power spectrum and the shaded boxes show the expected $2\sigma$ deviations due to noise, cosmic and sample variance. The sizes of the shaded boxes were calculated from the approximate formula for uniform noise. The dots show the estimates with $2\sigma$ error bars taken from the Fisher matrix. As before the rising solid line is the noise power spectrum and the vertical line shows where S/N = 1.
Finally I made a joint analysis of all the 146 patches. The idea was to extend the datavector in the likelihood so that it contained all the $\tilde C_\ell$ from all the 146 patches. The datavector can then be written as $d = \{d_1, d_2, ..., d_{146}\}$ where $d_i$ now denotes the whole datavector for patch number $i$. From the results above it seems to be a good approximation to assume that the correlation between $\tilde C_\ell$ from different patches is zero so that the correlation matrix will be block diagonal. Each block is then the correlation matrix for each individual patch. The log-likelihood can then simply be written as
$$L = \sum_{i=1}^{146} d_i^T M_i^{-1} d_i + \sum_{i=1}^{146} \ln\det M_i, \tag{4.110}$$
where $M_i$ is the correlation matrix for patch number $i$. In figure (4.29) the result of this joint analysis is shown. One can see that the full sky power spectrum is well within the two sigma error bars of the estimates.
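The block-diagonal form of (4.110) means that the joint likelihood can be accumulated patch by patch, without ever building the full matrix. The following is a minimal sketch in plain Python, not the code used for the thesis; the function names are hypothetical, and each $M_i$ is assumed to be symmetric and positive definite so that a Cholesky factorisation exists.

```python
import math

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower

def patch_term(d, m):
    """d^T M^{-1} d + ln det M for one patch (one block of the joint matrix)."""
    lower = cholesky(m)
    # Forward substitution: solve lower * y = d, so that d^T M^{-1} d = y . y
    y = []
    for i, d_i in enumerate(d):
        y.append((d_i - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i])
    quad = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(len(d)))
    return quad + log_det

def joint_log_likelihood(data_vectors, matrices):
    """Sum of per-patch terms; valid only if inter-patch correlations are neglected."""
    return sum(patch_term(d, m) for d, m in zip(data_vectors, matrices))
```

For 146 patches the cost is then 146 small factorisations instead of one factorisation of a matrix 146 times larger in each dimension.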
The method of combining patches on the sky for power spectrum analysis will be developed further in a forthcoming paper (Górski and Hansen 2002).
4.3.2 Monte Carlo Simulations of the Noise Correlations
and Extension to Correlated Noise
The computation of the noise correlation matrix in the general case takes $\sqrt{N_{\rm pix}}\,\ell_{\max}^2\,(N_{\rm in})^2$ operations, which is approximately $N_{\rm pix}^{3/2}\,(N_{\rm in})^2$. When $N_{\rm in}$ is getting large this can be calculated quicker using Monte Carlo simulations (as was done in (Hivon et al. 2001)). Finding the $\tilde C^N_\ell$ from one noise map takes $O(N_{\rm pix}^{3/2})$ operations so using Monte Carlo simulations to find the whole noise matrix takes $O(N_{\rm pix}^{3/2} N_{\rm sim})$ operations where $N_{\rm sim}$ is the number of Monte Carlo simulations needed. So when $N_{\rm sim} \ll N_{\rm in}^2$ it will be advantageous to use MC, provided this gives the same result.
Also when the noise gets correlated, the analytic calculation of $\langle \tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle$ will be very expensive. In this case another method for computing $\langle \tilde a_{\ell m}\tilde a^*_{\ell' m'}\rangle$ will be necessary and Monte Carlo simulations could also prove useful. For a given noise model several noise realisations can be made and averaged to yield the noise correlation matrix and the $\langle \tilde a_{\ell m}\tilde a^*_{\ell' m'}\rangle$ term needed in the estimation process. Using Monte Carlo the operations count for precalculations of noise properties will still be $O(N_{\rm pix}^{3/2})$ per simulation also with correlated noise, assuming that each noise realisation for a given noise model can be calculated quickly.
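The Monte Carlo route amounts to estimating a covariance matrix from independent realisations. The sketch below illustrates the idea on a toy problem and is not the thesis pipeline: `toy_noise_realisation` is a hypothetical stand-in for "simulate one noise map and compute its pseudo spectrum", and `TRUE_COV` is the known covariance of that toy model.

```python
import random

def sample_covariance(samples):
    """Unbiased sample covariance matrix of a list of equal-length vectors."""
    n_sim = len(samples)
    dim = len(samples[0])
    mean = [sum(s[k] for s in samples) / n_sim for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += (s[a] - mean[a]) * (s[b] - mean[b])
    return [[c / (n_sim - 1) for c in row] for row in cov]

def toy_noise_realisation(rng):
    """Hypothetical stand-in for one noise simulation: two correlated numbers."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [z1, 0.5 * z1 + z2]

# Exact covariance of the toy model above (an assumption of this example only).
TRUE_COV = [[1.0, 0.5], [0.5, 1.25]]

def max_abs_error(n_sim, seed=1):
    """Largest element-wise error of the Monte Carlo covariance estimate."""
    rng = random.Random(seed)
    cov = sample_covariance([toy_noise_realisation(rng) for _ in range(n_sim)])
    return max(abs(cov[a][b] - TRUE_COV[a][b]) for a in range(2) for b in range(2))
```

The statistical error of such an estimate falls roughly as $1/\sqrt{N_{\rm sim}}$, so the accuracy of the matrix has to be traded against cost. With the operation counts quoted above, the analytic and Monte Carlo costs differ by a factor of about $N_{\rm in}^2/N_{\rm sim}$; for $N_{\rm in} = 200$ this is 40 for $N_{\rm sim} = 1000$ and 2 for $N_{\rm sim} = 20000$.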
Figure 4.30: Same as figure (4.18) but with a different noise model. Here the average of 100 estimations is shown. The big dots in the middle of the bins show the result using analytical expressions for the noise matrices. The crosses on the left side of each big dot show the results of using noise matrices from $N_{\rm sim} = 1000$ MC simulations. The crosses on the right side are for $N_{\rm sim} = 20000$ noise simulations.
In figure (4.30), the result of $C_\ell$ estimation with noise matrix and $\langle \tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle$ computed with Monte Carlo is shown. Again a standard CDM power spectrum was used with a non-uniform white noise model and a 15° FWHM Gaussian Gabor window. In the $C_\ell$ estimation $N_{\rm in} = 200$ $\tilde C_\ell$ were used and $N_{\rm bin} = 20$ power spectrum bins were estimated. The noise matrices were calculated using (1) the analytical expression, (2) MC with $N_{\rm sim} = 20000$ and (3) MC with $N_{\rm sim} = 1000$. In figure (4.31), a cut through the correlation matrices for the different cases is shown for $\ell = 500$. The dashed line (case (2)) follows the solid line (case (1)) to a level of about $10^{-2}$ of the diagonal. The dotted line (case (3)) is roughly correct to about $10^{-1}$ times the value at the diagonal.
Figure 4.31: A cut through the noise correlation matrix at multipole $\ell = 500$. The correlation matrix was evaluated using the analytical formulae (solid line), 20000 MC simulations (dashed line) and 1000 MC simulations (dotted line). The matrix is here normalised to be 1 at the diagonal.
I did 100 estimations for each case and the average result is plotted in figure (4.30). The big dots are the results from case (1), the crosses on the right hand side are the results from (2) and the crosses on the left hand side the results from (3). The average estimates seem to be consistent; only in the highly noise dominated regime do they start to deviate. For case (2), the error bars are for some multipoles higher and for some lower than in the analytic case. The differences are at most 3%. I conclude that using this many simulations, the error bars do not increase significantly over the analytic case. For case (3) the error bars are up to 17% higher (and only higher) than in the analytic case. It seems that 1000 simulations were not sufficient to keep the same accuracy of the estimates as when using analytic noise matrices. To keep the error bars as low as possible it seems that $N_{\rm sim} = 20000$ is a reasonable number of simulations.
4.4 Discussion
In this chapter I have discussed the Gabor transform on the sphere and how this
can be used for CMB power spectrum estimation. A powerful tool has been pre-
sented for estimation of the full sky CMB power spectrum using small patches on
the sky. The method is very fast and gives an unbiased estimate for the multipoles for which the Gaussian likelihood ansatz is valid. For the lowest multipoles a small downwards bias is observed for small windows due to the skewed distribution of $\tilde C_\ell$. This can easily be solved by using information about the low multipole $\tilde C_\ell$ from larger patches. When estimating the power spectrum for larger parts of the sky this small bias disappears.
The method has been demonstrated to work very well on a single azimuthally
symmetric patch on the CMB sky. It was also shown how the method can be
extended to non-symmetric patches or even full sky by combining several patches
in a joint likelihood analysis. This was only tested for the uniform noise case but
will be explored further in future work. Also the case with correlated noise should
in principle work using Monte Carlo methods to precompute noise matrices. This
was tested for uncorrelated noise and compared to the analytic precalculation
scheme. The results were getting more and more similar as the number of Monte
Carlo simulations was increased. This is another extension which needs more
work in the future.
Chapter 5
Gabor Transform on the
Polarised CMB Sky
As discussed in chapter (1) and (2) the observation of the CMB polarisation power spectra will help to break some parameter degeneracies. Several different sets of values for the cosmological parameters can give the same CMB temperature power spectrum. The polarisation power spectra will however be different and can distinguish the different models. Also the error bars on the cosmological parameters can be reduced by exploiting the information present in the CMB polarisation power spectra.
Much effort has been made recently in order to find methods to analyse the CMB temperature power spectrum. For the even harder task of estimating the polarisation power spectra there have been very few publications. The framework for analysing the polarisation power spectra has been set in (Zaldarriaga and Seljak 1997; Kamionkowski, Kosowsky, and Stebbins 1997) but these papers only describe the full likelihood method which is far too time consuming even when only considering the temperature power spectrum.
In this chapter I will extend the method of using the pseudo power spectrum as input to a likelihood estimation of the power spectrum. I will include the E and C mode polarisation pseudo power spectra in the datavector and use techniques similar to those described in the previous chapter to estimate the power spectra. This can be done because the kernels that connect the full sky polarisation power spectra with the polarisation pseudo power spectra on an apodised sky are similar to the kernel for the temperature power spectrum. In the first part of this chapter I will derive the formulae for these kernels and for the polarisation pseudo power spectra and discuss their shapes. Then in the second part this will be used for likelihood estimation.
In this chapter the B component polarisation will mostly be neglected. The B polarisation power spectrum is expected to be very small and will hardly be detectable by the MAP or Planck experiments. Also the E and B components of polarisation mix on the cut sky as will be discussed in this chapter, so that the B polarisation pseudo spectrum is dominated by the E component. This was also found independently by (Lewis, Challinor, and Turok 2001; Chiueh and Ma 2001).
5.1 The Gabor Transformation
5.1.1 Polarisation Powerspectra
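As described in the first chapter the polarisation spherical harmonic coefficients are defined by means of the tensor spherical harmonics ${}_{\pm2}Y_{\ell m}(\hat n)$ as
\begin{align}
a_{2,\ell m} &= \int d\hat n\; {}_{2}Y^*_{\ell m}(\hat n)\,(Q+iU)(\hat n), \tag{5.1}\\
a_{-2,\ell m} &= \int d\hat n\; {}_{-2}Y^*_{\ell m}(\hat n)\,(Q-iU)(\hat n), \tag{5.2}
\end{align}
and the inverse transforms are given as
\begin{align}
(Q+iU)(\hat n) &= \sum_{\ell m} a_{2,\ell m}\; {}_{2}Y_{\ell m}(\hat n) \tag{5.3}\\
(Q-iU)(\hat n) &= \sum_{\ell m} a_{-2,\ell m}\; {}_{-2}Y_{\ell m}(\hat n). \tag{5.4}
\end{align}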
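It will be advantageous to write these spherical harmonics in terms of the rotation matrices $D^\ell_{mm'}$ defined in Appendix (A). Using the formulae in Appendix (B) one can write
\begin{align}
{}_{2}Y_{\ell m}(\hat n) &= \sqrt{\frac{2\ell+1}{4\pi}}\; D^\ell_{-2m}(\phi,\theta,0), \tag{5.5}\\
{}_{-2}Y_{\ell m}(\hat n) &= \sqrt{\frac{2\ell+1}{4\pi}}\; D^\ell_{2m}(\phi,\theta,0). \tag{5.6}
\end{align}
The corresponding complex conjugates can be written as (using the relations in Appendix (A))
\begin{align}
{}_{2}Y^*_{\ell m}(\hat n) &= \sqrt{\frac{2\ell+1}{4\pi}}\; D^{\ell*}_{-2m}(\phi,\theta,0) \tag{5.8}\\
&= \sqrt{\frac{2\ell+1}{4\pi}}\; (-1)^m D^{\ell}_{2\,-m}(\phi,\theta,0), \tag{5.9}\\
{}_{-2}Y^*_{\ell m}(\hat n) &= \sqrt{\frac{2\ell+1}{4\pi}}\; D^{\ell*}_{2m}(\phi,\theta,0) \tag{5.10}\\
&= \sqrt{\frac{2\ell+1}{4\pi}}\; (-1)^m D^{\ell}_{-2\,-m}(\phi,\theta,0). \tag{5.11}
\end{align}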
Finally, as explained in chapter (1), the polarisation can be decomposed into a curl free E component and a divergence free B component
\begin{align}
a_{E,\ell m} &= -\frac{1}{2}\,(a_{2,\ell m} + a_{-2,\ell m}), \tag{5.12}\\
a_{B,\ell m} &= \frac{1}{2}\, i\,(a_{2,\ell m} - a_{-2,\ell m}). \tag{5.13}
\end{align}
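As a small illustration of (5.12) and (5.13), the change of basis between the spin $\pm2$ coefficients and the E/B coefficients can be written out and inverted. This is only a sketch: the overall signs assume the convention of Zaldarriaga and Seljak (1997), which is an assumption here, and the function names are hypothetical.

```python
def spin_to_eb(a_plus2, a_minus2):
    """E and B coefficients from the spin +2 and spin -2 coefficients."""
    a_e = -0.5 * (a_plus2 + a_minus2)   # sign convention assumed, see lead-in
    a_b = 0.5j * (a_plus2 - a_minus2)
    return a_e, a_b

def eb_to_spin(a_e, a_b):
    """Inverse transformation: spin +2 and spin -2 coefficients from E and B."""
    return -(a_e + 1j * a_b), -(a_e - 1j * a_b)
```

With this convention the spin coefficients are, up to a sign, the combinations $a_E \pm i a_B$, so the two descriptions carry the same information.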
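Now I will define the windowed coefficients $\tilde a_{\ell m}$ and $\tilde C_\ell$ for polarisation in a way analogous to that for temperature. As in chapter (4) I Legendre expand the Gabor window $G(\theta)$ which is an axisymmetric function centered at $\hat n_0$,
\begin{align}
G(\theta) &= \sum_{\ell} \frac{2\ell+1}{4\pi}\, g_\ell\, P_\ell(\cos\theta) = \sum_{\ell m} g_\ell\, Y_{\ell m}(\hat n)\, Y^*_{\ell m}(\hat n_0) \tag{5.14}\\
&= \sum_{\ell m} g_\ell \sqrt{\frac{2\ell+1}{4\pi}}\; D^\ell_{0m}(\phi,\theta,0)\, Y^*_{\ell m}(\hat n_0). \tag{5.15}
\end{align}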
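I define the windowed coefficients $\tilde a_{\pm2,\ell m}$ as
$$\tilde a_{2,\ell m} = \int d\hat n\; {}_{2}Y^*_{\ell m}(\hat n)\,(Q+iU)(\hat n)\,G(\hat n,\hat n_0). \tag{5.16}$$
Using the expression (5.3) for $(Q+iU)(\hat n)$ and writing all ${}_{2}Y_{\ell m}$ as D-matrices using expressions (5.5), (5.6), (5.9) and (5.11) one gets,
\begin{align}
\tilde a_{2,\ell m} &= \sum_{\ell' m'} a_{2,\ell' m'} \sum_{\ell'' m''} g_{\ell''}\, \frac{\sqrt{(2\ell+1)(2\ell'+1)(2\ell''+1)}}{(4\pi)^{3/2}}\,(-1)^m \tag{5.17}\\
&\quad\times Y^*_{\ell'' m''}(\hat n_0) \int d\hat n\; D^{\ell}_{2\,-m}(\phi,\theta,0)\, D^{\ell'}_{-2m'}(\phi,\theta,0)\, D^{\ell''}_{0m''}(\phi,\theta,0) \tag{5.18}\\
&= \sum_{\ell' m'} a_{2,\ell' m'} \sum_{\ell'' m''} g_{\ell''}\, \frac{\sqrt{(2\ell+1)(2\ell'+1)(2\ell''+1)}}{(4\pi)^{3/2}}\,(-1)^m\, \frac{1}{2\pi} \tag{5.19}\\
&\quad\times Y^*_{\ell'' m''}(\hat n_0) \int d\hat n\, d\gamma\; D^{\ell}_{2\,-m}(\phi,\theta,\gamma)\, D^{\ell'}_{-2m'}(\phi,\theta,\gamma)\, D^{\ell''}_{0m''}(\phi,\theta,\gamma) \tag{5.20}\\
&= \sum_{\ell' m'} a_{2,\ell' m'}\; h_2(\ell,\ell',m,m',\hat n_0). \tag{5.21}
\end{align}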
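By using equation (5.16), one can also write this as,
$$\tilde a_{2,\ell m} = \sum_{\ell' m'} a_{2,\ell' m'} \int d\hat n\; G(\hat n,\hat n_0)\; {}_{2}Y^*_{\ell m}(\hat n)\; {}_{2}Y_{\ell' m'}(\hat n). \tag{5.22}$$
Using the two last expressions, the $h_2$ function can be written in two ways (using relation (C.2) for the last expression),
\begin{align}
h_2(\ell,\ell',m,m',\hat n_0) &\equiv \int d\hat n\; G(\hat n,\hat n_0)\; {}_{2}Y^*_{\ell m}(\hat n)\; {}_{2}Y_{\ell' m'}(\hat n) \tag{5.23}\\
&= \sum_{\ell'' m''} g_{\ell''} \sqrt{\frac{(2\ell+1)(2\ell'+1)(2\ell''+1)}{4\pi}}\,(-1)^m \tag{5.24}\\
&\quad\times Y^*_{\ell'' m''}(\hat n_0) \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m' & m'' \end{pmatrix}. \tag{5.25}
\end{align}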
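As I soon will show, the polarisation pseudo power spectra are invariant under rotation of the Gabor window. For that reason, one can put the centre of the Gabor window on the north pole giving,
$$\tilde a_{2,\ell m} = \sum_{\ell'} a_{2,\ell' m}\; h_2(\ell,\ell',m), \tag{5.26}$$
where,
\begin{align}
h_2(\ell,\ell',m) &= h_2(\ell,\ell',m,m,0) \tag{5.27}\\
&= \sum_{\ell''} g_{\ell''} \sqrt{(2\ell+1)(2\ell'+1)}\; \frac{2\ell''+1}{4\pi} \tag{5.28}\\
&\quad\times (-1)^m \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m & 0 \end{pmatrix}. \tag{5.29}
\end{align}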
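The step from the general expression (5.23)-(5.25) to the north pole expression (5.27)-(5.29) can be sketched as follows, assuming the standard normalisation of the spherical harmonics; $\ell''$ and $m''$ denote the summation indices of the window expansion:

```latex
\begin{align}
Y_{\ell'' m''}(\hat n_0 = \text{north pole}) &= \sqrt{\frac{2\ell''+1}{4\pi}}\;\delta_{m''0}, \\
\sqrt{\frac{(2\ell+1)(2\ell'+1)(2\ell''+1)}{4\pi}}\; Y^*_{\ell'' m''}(\hat n_0)
 &= \sqrt{(2\ell+1)(2\ell'+1)}\;\frac{2\ell''+1}{4\pi}\;\delta_{m''0}.
\end{align}
```

Only $m'' = 0$ survives, and the selection rule of the second Wigner symbol, $-m + m' + m'' = 0$, then forces $m' = m$, so that a single index $m$ remains.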
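Similarly one gets,
$$\tilde a_{-2,\ell m} = \sum_{\ell'} a_{-2,\ell' m}\; h_{-2}(\ell,\ell',m) \tag{5.30}$$
and
\begin{align}
\tilde a_{E,\ell m} &= -\frac{1}{2}\,(\tilde a_{2,\ell m} + \tilde a_{-2,\ell m}) \tag{5.31}\\
&= \sum_{\ell'} a_{E,\ell' m}\, H_2(\ell,\ell',m) + i \sum_{\ell'} a_{B,\ell' m}\, H_{-2}(\ell,\ell',m) \tag{5.32}\\
\tilde a_{B,\ell m} &= i\,\frac{1}{2}\,(\tilde a_{2,\ell m} - \tilde a_{-2,\ell m}) \tag{5.33}\\
&= \sum_{\ell'} a_{B,\ell' m}\, H_2(\ell,\ell',m) - i \sum_{\ell'} a_{E,\ell' m}\, H_{-2}(\ell,\ell',m) \tag{5.34}
\end{align}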
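Please note that whereas $h(\ell,\ell',m) = h(\ell,\ell',-m)$ a similar relation does not exist for $h_2(\ell,\ell',m)$. Using the expression above, one has that,
\begin{align}
h_2(\ell,\ell',-m) &= \sum_{\ell''} g_{\ell''} \sqrt{(2\ell+1)(2\ell'+1)}\; \frac{2\ell''+1}{4\pi}\,(-1)^m \tag{5.35}\\
&\quad\times \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m & 0 \end{pmatrix} (-1)^{\ell+\ell'+\ell''} \tag{5.36}
\end{align}
The reason why this is not equal to $h_2(\ell,\ell',m)$ is that the first Wigner symbol does not vanish when $\ell+\ell'+\ell''$ is odd. It vanishes for odd $\ell+\ell'+\ell''$ only when the whole lower row in the Wigner symbol is 0, as in the case with $h(\ell,\ell',m)$. It is also obvious from the expression (5.22). For the scalar case, the relation $Y_{\ell(-m)}(\hat n) = (-1)^m Y^*_{\ell m}(\hat n)$ ensures that there is no dependency on the sign of $m$ in $h(\ell,\ell',m)$ whereas a similar relation does not exist for the tensor harmonics (but see relation (B.5)).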
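I have further defined,
\begin{align}
H_2(\ell,\ell',m) &= \frac{1}{2}\,\bigl(h_2(\ell,\ell',m) + h_{-2}(\ell,\ell',m)\bigr) \tag{5.37}\\
H_{-2}(\ell,\ell',m) &= \frac{1}{2}\,\bigl(h_2(\ell,\ell',m) - h_{-2}(\ell,\ell',m)\bigr) \tag{5.38}
\end{align}
which contrary to $h_2(\ell,\ell',m)$ have an $m$-symmetry
\begin{align}
H_2(\ell,\ell',m) &= H_2(\ell,\ell',-m) \tag{5.39}\\
H_{-2}(\ell,\ell',m) &= -H_{-2}(\ell,\ell',-m) \tag{5.40}
\end{align}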
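To find the $\tilde C_\ell$ and (later) the correlation matrices, the following quantities will be needed
\begin{align}
\langle \tilde a_{E,\ell m}\,\tilde a^*_{E,\ell' m'}\rangle &= \delta_{mm'} \sum_{\ell''} \Bigl[ C^E_{\ell''}\, H_2(\ell,\ell'',m)\, H_2(\ell',\ell'',m) \tag{5.41}\\
&\qquad + C^B_{\ell''}\, H_{-2}(\ell,\ell'',m)\, H_{-2}(\ell',\ell'',m) \Bigr] \tag{5.42}\\
\langle \tilde a_{B,\ell m}\,\tilde a^*_{B,\ell' m'}\rangle &= \delta_{mm'} \sum_{\ell''} \Bigl[ C^B_{\ell''}\, H_2(\ell,\ell'',m)\, H_2(\ell',\ell'',m) \tag{5.43}\\
&\qquad + C^E_{\ell''}\, H_{-2}(\ell,\ell'',m)\, H_{-2}(\ell',\ell'',m) \Bigr] \tag{5.44}\\
\langle \tilde a_{E,\ell m}\,\tilde a^*_{B,\ell' m'}\rangle &= \delta_{mm'}\; i \sum_{\ell''} \Bigl[ C^E_{\ell''}\, H_2(\ell,\ell'',m)\, H_{-2}(\ell',\ell'',m) \tag{5.45}\\
&\qquad + C^B_{\ell''}\, H_{-2}(\ell,\ell'',m)\, H_2(\ell',\ell'',m) \Bigr] \tag{5.46}\\
\langle \tilde a_{E,\ell m}\,\tilde a^*_{\ell' m'}\rangle &= \delta_{mm'} \sum_{\ell''} C^C_{\ell''}\, H_2(\ell,\ell'',m)\, h(\ell',\ell'',m) \tag{5.47}\\
\langle \tilde a_{B,\ell m}\,\tilde a^*_{\ell' m'}\rangle &= -i\,\delta_{mm'} \sum_{\ell''} C^C_{\ell''}\, H_{-2}(\ell,\ell'',m)\, h(\ell',\ell'',m) \tag{5.48}
\end{align}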
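For $\langle\tilde C^E_\ell\rangle$ one now has,
$$\langle\tilde C^E_\ell\rangle = \sum_m \frac{\langle \tilde a_{E,\ell m}\,\tilde a^*_{E,\ell m}\rangle}{2\ell+1} = \sum_{\ell'} C^E_{\ell'}\, K_2(\ell,\ell') + \sum_{\ell'} C^B_{\ell'}\, K_{-2}(\ell,\ell'), \tag{5.49}$$
where,
$$K_{\pm2}(\ell,\ell') = \frac{1}{2\ell+1} \sum_m H^2_{\pm2}(\ell,\ell',m). \tag{5.50}$$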
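Using the expression for $h_2(\ell,\ell',m)$, one gets
\begin{align}
K_{\pm2}(\ell,\ell') &= \frac{1}{2}\,\frac{1}{2\ell+1} \sum_m \Bigl[ h_2^2(\ell,\ell',m) \pm h_2(\ell,\ell',m)\, h_2(\ell,\ell',-m) \Bigr] \tag{5.51}\\
&= \sum_{\ell'' L} g_{\ell''}\, g_{L}\, \frac{(2\ell'+1)(2\ell''+1)(2L+1)}{32\pi^2} \tag{5.52}\\
&\quad\times \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & L \\ 2 & -2 & 0 \end{pmatrix} \bigl(1 \pm (-1)^{\ell+\ell'+\ell''}\bigr) \tag{5.53}\\
&\quad\times \underbrace{\sum_m \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & L \\ -m & m & 0 \end{pmatrix}}_{(2\ell''+1)^{-1}\,\delta_{\ell'' L}} \tag{5.54}\\
&= \sum_{\ell''} g^2_{\ell''}\, \frac{(2\ell'+1)(2\ell''+1)}{32\pi^2} \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix}^2 \tag{5.55}\\
&\quad\times \bigl(1 \pm (-1)^{\ell+\ell'+\ell''}\bigr). \tag{5.56}
\end{align}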
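Similarly,
\begin{align}
\langle\tilde C^B_\ell\rangle &= \sum_{\ell'} C^B_{\ell'}\, K_2(\ell,\ell') + \sum_{\ell'} C^E_{\ell'}\, K_{-2}(\ell,\ell') \tag{5.57}\\
\langle\tilde C^C_\ell\rangle &= \sum_{\ell'} C^C_{\ell'}\, K_{20}(\ell,\ell') \tag{5.58}\\
K_{20}(\ell,\ell') &= \frac{1}{2\ell+1} \sum_m H_2(\ell,\ell',m)\, h(\ell,\ell',m) \tag{5.59}\\
&= \sum_{\ell''} g^2_{\ell''}\, \frac{(2\ell'+1)(2\ell''+1)}{16\pi^2} \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix}. \tag{5.60}
\end{align}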
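The polarisation kernels can be cross-checked numerically by evaluating them in two ways: from the sum over $m$ in (5.50), and from the single sum over Wigner symbols in (5.55)-(5.56). The sketch below does this in plain Python for small multipoles. It is not the recursion of appendix (E); the function names are hypothetical, the Wigner symbols are computed with Racah's formula, and it assumes $h_{-2}(\ell,\ell',m) = h_2(\ell,\ell',-m)$, which is what the parity factor in (5.35)-(5.36) suggests.

```python
from fractions import Fraction
from math import factorial as fact, pi, sqrt

def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3j symbol for integer arguments (Racah's formula)."""
    if m1 + m2 + m3 != 0 or not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    norm = Fraction(fact(j1 + j2 - j3) * fact(j1 - j2 + j3) * fact(-j1 + j2 + j3),
                    fact(j1 + j2 + j3 + 1))
    norm *= (fact(j1 + m1) * fact(j1 - m1) * fact(j2 + m2)
             * fact(j2 - m2) * fact(j3 + m3) * fact(j3 - m3))
    total = Fraction(0)
    for k in range(max(0, j2 - j3 - m1, j1 - j3 + m2),
                   min(j1 + j2 - j3, j1 - m1, j2 + m2) + 1):
        denom = (fact(k) * fact(j1 + j2 - j3 - k) * fact(j1 - m1 - k)
                 * fact(j2 + m2 - k) * fact(j3 - j2 + m1 + k) * fact(j3 - j1 - m2 + k))
        total += Fraction(-1 if k % 2 else 1, denom)
    sign = -1 if (j1 - j2 - m3) % 2 else 1
    return sign * sqrt(norm) * float(total)

def h2(ell, ellp, m, g):
    """h_2(l, l', m) of (5.27)-(5.29); g[l''] are the window Legendre coefficients."""
    pref = sqrt((2 * ell + 1) * (2 * ellp + 1)) / (4 * pi)
    sign = -1 if m % 2 else 1
    return sign * pref * sum(
        g_l * (2 * l + 1) * wigner_3j(ell, ellp, l, 2, -2, 0)
        * wigner_3j(ell, ellp, l, -m, m, 0)
        for l, g_l in enumerate(g))

def kernels_from_h(ell, ellp, g):
    """K_2 and K_-2 from the m-sum (5.50), with H_+-2 = (h_2(m) +- h_2(-m)) / 2."""
    k_plus = k_minus = 0.0
    for m in range(-ell, ell + 1):
        a, b = h2(ell, ellp, m, g), h2(ell, ellp, -m, g)
        k_plus += (0.5 * (a + b)) ** 2
        k_minus += (0.5 * (a - b)) ** 2
    return k_plus / (2 * ell + 1), k_minus / (2 * ell + 1)

def kernels_closed_form(ell, ellp, g):
    """K_2 and K_-2 from the single sum over l'' in (5.55)-(5.56)."""
    k_plus = k_minus = 0.0
    for l, g_l in enumerate(g):
        w = wigner_3j(ell, ellp, l, 2, -2, 0)
        common = g_l ** 2 * (2 * ellp + 1) * (2 * l + 1) / (32 * pi ** 2) * w * w
        parity = -1 if (ell + ellp + l) % 2 else 1
        k_plus += common * (1 + parity)
        k_minus += common * (1 - parity)
    return k_plus, k_minus
```

A useful sanity check is the full sky limit: for $G = 1$ the only non-zero coefficient is $g_0 = 4\pi$, and the kernels should reduce to $K_2(\ell,\ell') = \delta_{\ell\ell'}$ and $K_{-2}(\ell,\ell') = 0$, i.e. no E-B mixing. Only terms with odd $\ell+\ell'+\ell''$ contribute to the mixing kernel.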
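Before studying these kernels I will first show the rotational invariance. To show that $\langle\tilde C^E_\ell\rangle$, $\langle\tilde C^B_\ell\rangle$ and $\langle\tilde C^C_\ell\rangle$ are invariant under rotations of the window, one can keep the dependency on the angle $\hat n_0$, and the expression for the kernel turns out to be,
\begin{align}
K_{\pm2}(\ell,\ell',\hat n_0) &= \frac{1}{2}\,\frac{1}{2\ell+1} \sum_{mm'} \Bigl[ h_2^2(\ell,\ell',m,m',\hat n_0) \tag{5.61}\\
&\qquad \pm h_2(\ell,\ell',m,m',\hat n_0)\, h_2(\ell,\ell',-m,-m',\hat n_0) \Bigr] \tag{5.62}\\
&= \sum_{\ell'' L} g_{\ell''}\, g_{L}\, (2\ell'+1)\, \frac{\sqrt{(2\ell''+1)(2L+1)}}{8\pi} \tag{5.63}\\
&\quad\times \sum_{m'' M} Y_{\ell'' m''}(\hat n_0)\, Y^*_{LM}(\hat n_0) \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix} \begin{pmatrix} \ell & \ell' & L \\ 2 & -2 & 0 \end{pmatrix} \tag{5.64}\\
&\quad\times \underbrace{\sum_{mm'} \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m' & m'' \end{pmatrix} \begin{pmatrix} \ell & \ell' & L \\ -m & m' & M \end{pmatrix}}_{(2\ell''+1)^{-1}\,\delta_{\ell'' L}\,\delta_{m'' M}} \tag{5.65}\\
&\quad\times \bigl(1 \pm (-1)^{\ell+\ell'+\ell''}\bigr) \tag{5.66}\\
&= \sum_{\ell''} g^2_{\ell''}\, \frac{2\ell'+1}{8\pi} \begin{pmatrix} \ell & \ell' & \ell'' \\ 2 & -2 & 0 \end{pmatrix}^2 \underbrace{\sum_{m''} \bigl|Y_{\ell'' m''}(\hat n_0)\bigr|^2}_{\frac{2\ell''+1}{4\pi}} \tag{5.67}\\
&\quad\times \bigl(1 \pm (-1)^{\ell+\ell'+\ell''}\bigr) \tag{5.68}\\
&= K_{\pm2}(\ell,\ell'). \tag{5.69}
\end{align}
This is independent of the angle $\hat n_0$ which shows the rotational invariance of the pseudo polarisation power spectra.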
As with the temperature kernels, the polarisation kernels can be evaluated either using the analytical Wigner symbol expressions (5.55) and (5.58) or faster by direct integration and recursion of the $H_2$ functions. The recursion for the $h_2(\ell,\ell',m,m')$ functions in appendix (E) is one of the major results in the thesis. This is an extension of the recursion for $h(\ell,\ell',m,m')$ in appendix (D).
Studying equation (5.49) and (5.57) one sees that the E and B modes are mixing when only a portion of the sky is observed. The kernel $K_2(\ell,\ell')$ is the kernel which takes full sky $C^E_\ell$ modes to the pseudo coefficients $\tilde C^E_\ell$ and similarly for $C^B_\ell$. The kernel $K_{-2}(\ell,\ell')$ is the one which causes the mixing. In figure (5.1) and (5.2) I have plotted the kernels $K_2(\ell,\ell')$ together with $K_{-2}(\ell,\ell')$ for a 5 and 15 degree FWHM Gaussian Gabor window with $\theta_C = 3\sigma$. One can see that the diagonal of the $K_2(\ell,\ell')$ kernel is about an order of magnitude larger than the diagonal of the mixing kernel $K_{-2}(\ell,\ell')$. This means that $C^E_\ell$ would dominate $\tilde C^E_\ell$ and $C^B_\ell$ would dominate $\tilde C^B_\ell$ provided that the two spectra $C^E_\ell$ and $C^B_\ell$ were of the same order of magnitude. However as discussed before, the $C^B_\ell$ are expected to be considerably smaller than $C^E_\ell$ in most cosmological models. For the E mode this is not a problem as $\tilde C^E_\ell$ will then hardly be affected. The problem is the B mode which in this case will be dominated by the E mode.
Figure 5.1: The kernels $K_2(\ell,\ell')$ (left plot) and $K_{-2}(\ell,\ell')$ (right plot) connecting full and cut sky polarisation power spectra $C^E_\ell$ and $C^B_\ell$. The left kernel is the one which takes full sky $C^E_\ell$ into cut sky $\tilde C^E_\ell$ and full sky $C^B_\ell$ into cut sky $\tilde C^B_\ell$. The right kernel is the one which mixes the two giving contributions from full sky $C^E_\ell$ in cut sky $\tilde C^B_\ell$ and vice versa.
Figure 5.2: The same as figure (5.1) for a 15 degree FWHM Gaussian Gabor window.
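The separation of E and B modes of polarisation on the cut sky was already discussed in (Lewis, Challinor, and Turok 2001; Chiueh and Ma 2001). Here I will only note that an easy way of separating the two would be to construct two new coefficients
\begin{align}
a_{+,\ell m} &= a_{E,\ell m} + i\,a_{B,\ell m} \tag{5.70}\\
a_{-,\ell m} &= a_{E,\ell m} - i\,a_{B,\ell m}, \tag{5.71}
\end{align}
with the corresponding pseudo quantities
\begin{align}
\tilde a_{+,\ell m} &= \sum_{\ell'} a_{+,\ell' m}\,\bigl(H_2(\ell,\ell',m) + H_{-2}(\ell,\ell',m)\bigr) \tag{5.72}\\
\tilde a_{-,\ell m} &= \sum_{\ell'} a_{-,\ell' m}\,\bigl(H_2(\ell,\ell',m) - H_{-2}(\ell,\ell',m)\bigr) \tag{5.73}
\end{align}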
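For the power spectra one gets,
\begin{align}
\langle C^+_\ell\rangle &\equiv \sum_m \frac{\langle a_{+,\ell m}\, a^*_{+,\ell m}\rangle}{2\ell+1} = \langle C^E_\ell\rangle + \langle C^B_\ell\rangle, \tag{5.75}\\
\langle C^-_\ell\rangle &\equiv \sum_m \frac{\langle a_{-,\ell m}\, a^*_{-,\ell m}\rangle}{2\ell+1} = \langle C^E_\ell\rangle - \langle C^B_\ell\rangle. \tag{5.76}
\end{align}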
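To get the pseudo power spectra one can use equations (5.72) and (5.73) to get
\begin{align}
\langle\tilde C^+_\ell\rangle &= \sum_{\ell'} C^+_{\ell'}\, K_+(\ell,\ell'), \tag{5.78}\\
\langle\tilde C^-_\ell\rangle &= \sum_{\ell'} C^-_{\ell'}\, K_-(\ell,\ell'), \tag{5.79}
\end{align}
where the kernels can be written
\begin{align}
K_+(\ell,\ell') &= \frac{1}{2\ell+1} \sum_m \bigl(H_2(\ell,\ell',m) + H_{-2}(\ell,\ell',m)\bigr)^2, \tag{5.80}\\
K_-(\ell,\ell') &= \frac{1}{2\ell+1} \sum_m \bigl(H_2(\ell,\ell',m) - H_{-2}(\ell,\ell',m)\bigr)^2. \tag{5.81}
\end{align}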
In figure (5.3) and (5.4) I have plotted the $K_2(\ell,\ell')$ and $K_{-2}(\ell,\ell')$ kernels for a tophat window covering the same area on the sky as the Gaussian windows in figure (5.1) and (5.2).
Figure 5.3: The same as figure (5.1) for a tophat Gabor window covering the same area on the sky as the Gaussian window in figure (5.1).
Figure 5.4: The same as figure (5.2) for a tophat Gabor window covering the same area on the sky as the Gaussian window in figure (5.2).
The kernel $K_{20}(\ell,\ell')$ for the cross polarisation power spectrum $C^C_\ell$ is shown in figure (5.5) for a 5 and 15 degree Gaussian window and in figure (5.6) for the corresponding tophat windows.
Figure 5.5: The kernel $K_{20}(\ell,\ell')$ connecting the full sky cross polarisation spectrum $C^C_\ell$ and the cut sky spectrum $\tilde C^C_\ell$ for a 5 (left plot) and 15 (right plot) degree FWHM Gaussian Gabor window. The negative elements have a brighter colour.
Figure 5.6: Same as figure (5.5) for the corresponding tophat windows.
As for the temperature kernels, all the polarisation kernels show the same behaviour when changing type and size of the window. When going from smaller to larger windows, the diagonals get sharper. Also the tophat kernels have more long range correlations than the Gaussian kernels (note that all the plots have the same vertical scale and can be compared directly).
In figure (5.7) I have plotted a cut through the different kernels at $\ell = 200$ for comparison. The cuts are made through the kernels for 5 and 15 degree FWHM Gaussian Gabor windows. The first thing to note is that the temperature kernel $K(\ell,\ell')$, the E and B kernel $K_2(\ell,\ell')$ and the temperature-polarisation cross spectrum kernel $K_{20}(\ell,\ell')$ only differ for the far off-diagonal elements. At the diagonal their shape and size are the same. For this reason the relation shown in figure (4.5) between the width of the kernel and the width of the window is also valid for polarisation. This is an important result to be used for the likelihood estimation of the polarisation power spectra in the next section. It shows that the number of polarisation pseudo spectrum coefficients to be used in the likelihood analysis should be the same as for the likelihood estimation of the temperature power spectrum.
In figure (5.8) a similar plot is shown for the corresponding tophat windows. The plot shows that the conclusions made for the Gaussian windows are also valid in this case. The shape and size of the three kernels are the same around the diagonal. For this reason the results shown for the temperature kernel, namely that the tophat window has larger long range correlations whereas the Gaussian window has large short range correlations and therefore a wider kernel, are also valid for polarisation.
Figure 5.7: A cut at $\ell = 200$ through the kernels combining the full sky and cut sky power spectra. The thick solid line is the kernel $K(\ell,\ell')$ for the temperature power spectrum, the thin solid line is the kernel $K_2(\ell,\ell')$ for E and B mode polarisation and the dotted line is the kernel for the temperature-polarisation cross power spectrum $K_{20}(\ell,\ell')$. All these kernels coincide around the diagonal. They only differ for the far off-diagonal elements. The dashed line is the mixing kernel $K_{-2}(\ell,\ell')$ which mixes the E and B mode polarisation power spectra on the cut sky. This kernel is lower than the other kernels. The upper plot is for a 5 degree Gaussian Gabor window and the lower plot for a 15 degree FWHM Gaussian window.
Figure 5.8: Same as figure (5.7) but for tophat windows covering the same area on the sky.
The kernel $K_{-2}(\ell,\ell')$ which mixes the E and B modes on the cut sky is plotted as a dashed line in figures (5.7) and (5.8). It is much smaller than the three other kernels and the shape seems to differ as well. Note that the height of the mixing kernel relative to the other kernels is lower for the 15 degree window than for the 5 degree window. That the size of the mixing kernel relative to the other kernels drops when the size of the window increases was to be expected, since in the limit of full sky coverage the E-B mixing disappears and the mixing kernel must go to zero.
In figure (5.9) a cut at $\ell = 200$ through the temperature kernel and the mixing kernel is shown for the 5 and 15 degree FWHM Gaussian Gabor window. The kernels are normalised to one at the peak so that the shapes can be compared. For the Gaussian window, the shapes of the kernels still seem to be the same. But the mixing kernels for the corresponding tophat windows shown in figure (5.10) do not have a Gaussian shape and differ significantly from the other kernels.
Figure 5.9: A cut at $\ell = 200$ through the kernel $K(\ell,\ell')$ connecting the full sky temperature power spectrum with the cut sky temperature power spectrum (solid line) and the kernel $K_{-2}(\ell,\ell')$ which is mixing the E and B mode polarisation power spectra on the cut sky. The upper plot is for a 5 degree FWHM Gaussian Gabor window and the lower plot for a 15 degree Gaussian window. The kernels are here normalised to 1 at the peak at $\ell' = 200$ in order to compare the shapes of the kernels.
Figure 5.10: This figure is the same as figure (5.9) for tophat windows covering the same area on the sky.
Since the kernels for the polarisation power spectra have a shape similar to that of the temperature power spectrum the effect of a Gabor window on the shape of the power spectrum should also be similar. This can be seen in figure (5.11) and (5.12). The figures show the full sky polarisation power spectra (dashed line) $C^E_\ell$ (figure (5.11)) and $C^C_\ell$ (figure (5.12)) for a standard CDM model. In this model the B component of polarisation is zero. On top of the full sky power spectra I have plotted the polarisation pseudo power spectra for a 5 and 15 degree Gaussian Gabor window (upper and lower plots respectively) normalised so that they can be compared to the full sky spectrum. The pseudo spectra for the corresponding tophat windows are plotted as dotted lines. As expected the shape of the polarisation pseudo spectra relative to the full sky spectra is similar to that for the temperature spectrum (figure (4.6)). One difference is that the polarisation pseudo spectra for the Gaussian window do not have the characteristic extra peak at low multipole which is seen in the temperature pseudo spectrum. This peak in the temperature spectrum arose due to the steep $1/\ell(\ell+1)$ fall-off of the temperature spectrum at low multipole. The polarisation spectra do not have this steep fall-off and for this reason there is no extra peak.
Figure 5.11: The windowed polarisation power spectra $\tilde C^E_\ell$ for a 5 and 15 degree FWHM Gaussian Gabor window cut at $\theta_C = 3\sigma$ (solid line) and for a tophat window covering the same area on the sky (dotted line). All spectra are normalised in such a way that they can be compared directly with the full sky spectrum which is shown on each plot as a dashed line. Only in the first plot are all three lines visible. In the three last plots, the full sky spectrum and the Gaussian pseudo spectrum (dashed and solid line) are hardly distinguishable.
Figure 5.12: Same as figure (5.11) for the temperature-polarisation cross power spectrum $C^C_\ell$.
Because of the mixing of E and B modes there is also a B polarisation component $\tilde C^B_\ell$ for the pseudo spectrum even when the input full sky $C^B_\ell$ is zero. This is shown in figure (5.13) where I have plotted the full sky spectrum $C^E_\ell$ and the pseudo spectra $\tilde C^B_\ell$ for the 5 and 15 degree FWHM Gaussian Gabor windows and corresponding tophat windows. The pseudo spectra are normalised so that they can be compared directly to the full sky spectrum. The dashed lines show the pseudo spectra for the Gaussian window. The upper line is for the 5 degree window and the lower line for the 15 degree window. As expected the size of the B component is dropping with increasing window size. The $\tilde C^B_\ell$ for the tophat windows are plotted as dotted lines. The shape of the pseudo spectra $\tilde C^B_\ell$ for the Gaussian windows roughly follows the shape of the full sky $C^E_\ell$. This could be expected because the mixing kernel $K_{-2}(\ell,\ell')$ for the Gaussian window has a Gaussian shape close to the diagonal, similar to the other kernels (see figure (5.9)). The pseudo spectra $\tilde C^B_\ell$ for the tophat windows however are much smoother due to the much broader kernel (figure (5.10)).
Figure 5.13: The full sky $C^E_\ell$ power spectrum plotted together with the spectra $\tilde C^B_\ell$ on the windowed sky. The dashed lines show the B spectra for a 5 and 15 degree FWHM Gaussian Gabor window (upper and lower line respectively). The dotted lines are for the corresponding tophat windows. The pseudo spectra are normalised so that they can be compared directly with the full sky spectrum. In the model used, there was no B polarisation spectrum for the full sky. The $\tilde C^B_\ell$ shown arise only due to the mixing of E and B modes on the cut sky.
In the same way as for the temperature power spectrum, it seems that the polarisation pseudo spectra resemble the full sky polarisation spectra when the patches on the sky are large enough. This motivates the use of the polarisation pseudo power spectra as input to a likelihood estimation of the polarisation power spectra in the same way as for the temperature power spectrum shown in the previous chapter.
5.1.2 Rotational Invariance
I now want to show that the pseudo power spectra for polarisation are (as the temperature power spectra) invariant under a common rotation of the sky and window. First, note that the rotation matrices $D^\ell_{mm'}$ are rotating both the normal spherical harmonics and the spin-$s$ harmonics. This is easy to show. Assume that one wants to rotate ${}_{s}Y_{\ell m}(\hat n)$ with the Euler angles $(\alpha,\beta,\gamma)$. Using the formula for the normal spherical harmonics one gets,
\begin{align}
{}_{s}Y^{\rm rot}_{\ell m}(\hat n) &= \sum_{m'} D^\ell_{m'm}(\alpha,\beta,\gamma)\; {}_{s}Y_{\ell m'}(\hat n) \tag{5.82}\\
&= \sqrt{\frac{2\ell+1}{4\pi}} \sum_{m'} D^\ell_{m'm}(\alpha,\beta,\gamma)\, D^\ell_{-s\,m'}(\phi,\theta,0) \tag{5.83}\\
&= D^\ell_{-s\,m}(\phi_{\rm rot},\theta_{\rm rot},0)\, \sqrt{\frac{2\ell+1}{4\pi}} \tag{5.84}\\
&= {}_{s}Y_{\ell m}(\hat n_{\rm rot}), \tag{5.85}
\end{align}
where $\hat n_{\rm rot}$ is the direction $\hat n$ rotated by $(\alpha,\beta,\gamma)$. This is clearly general for all spin-$s$ harmonics. Therefore I use the method from chapter (4) to show that polarisation pseudo power spectra are rotationally invariant.
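Consider a rotation of the sky and window by the angles
$(-\gamma\,-\beta\,-\alpha)$. Then the $\tilde a_{s,\ell m}$ becomes,
$$ \tilde a^{rot}_{s,\ell m} = \int d\hat n\,[\hat D(-\gamma\,-\beta\,-\alpha)\,T(\hat n)G(\hat n)]\;{}_sY^*_{\ell m}(\hat n). \qquad (5.86) $$
If one makes the inverse rotation of the integration angle $\hat n$, one can
write this as
$$ \tilde a^{rot}_{s,\ell m} = \int d\hat n\,T(\hat n)G(\hat n)\,[\hat D(\alpha\beta\gamma)\,{}_sY^*_{\ell m}(\hat n)], \qquad (5.87) $$
which is just
$$ \tilde a^{rot}_{s,\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma) \int d\hat n\,T(\hat n)G(\hat n)\,{}_sY^*_{\ell m'}(\hat n). \qquad (5.88) $$
The last integral can be identified as the normal $\tilde a_{s,\ell m'}$, so
$$ \tilde a^{rot}_{s,\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,\tilde a_{s,\ell m'}. \qquad (5.89) $$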
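Thus,
$$ \tilde a^{rot}_{E,\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,\tilde a_{E,\ell m'}, \qquad \tilde a^{rot}_{B,\ell m} = \sum_{m'} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,\tilde a_{B,\ell m'}. \qquad (5.90) $$
For $\tilde C_{E,\ell}$ (and analogously for $\tilde C_{B,\ell}$ and
$\tilde C_{C,\ell}$) one gets
$$ \tilde C^{rot}_{E,\ell} = \frac{1}{2\ell+1} \sum_m \tilde a^{rot}_{E,\ell m}\,\tilde a^{rot*}_{E,\ell m} \qquad (5.91) $$
$$ = \frac{1}{2\ell+1} \sum_m \sum_{m'm''} D^{\ell *}_{m'm}(\alpha\beta\gamma)\,D^{\ell}_{m''m}(\alpha\beta\gamma)\,\tilde a_{E,\ell m'}\,\tilde a^*_{E,\ell m''} \qquad (5.92) $$
$$ = \frac{1}{2\ell+1} \sum_{m'm''} \tilde a_{E,\ell m'}\,\tilde a^*_{E,\ell m''} \underbrace{\sum_m D^{\ell *}_{m'm}(\alpha\beta\gamma)\,D^{\ell}_{m''m}(\alpha\beta\gamma)}_{\delta_{m'm''}} \qquad (5.93) $$
$$ = \tilde C_{E,\ell}. \qquad (5.94) $$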
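The step from (5.92) to (5.94) only uses the unitarity of the rotation matrix for a fixed multipole. The following minimal sketch illustrates this numerically. It is not the thesis code: a random unitary matrix stands in for $D^\ell_{m'm}$, the coefficients are random toy numbers, and the names `random_unitary` and `pseudo_cl` are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Draw an n x n unitary matrix from the QR factorisation of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def pseudo_cl(a_lm):
    """Return sum_m |a_lm|^2 / (2l + 1) for the 2l + 1 coefficients of one multipole."""
    return float(np.sum(np.abs(a_lm) ** 2) / a_lm.size)

rng = np.random.default_rng(0)
ell = 7
n_m = 2 * ell + 1
a_lm = rng.normal(size=n_m) + 1j * rng.normal(size=n_m)  # toy pseudo a_lm (assumption)
D = random_unitary(n_m, rng)                              # stand-in for D^l_{m'm} (assumption)
a_rot = D.conj().T @ a_lm                                 # a_rot[m] = sum_{m'} conj(D[m', m]) a[m']
cl, cl_rot = pseudo_cl(a_lm), pseudo_cl(a_rot)
print(cl, cl_rot)
```

The two printed numbers agree to rounding error for any unitary matrix, which is the content of equation (5.93).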
Since the polarisation power spectra are rotationally invariant, I will in the rest
of the chapter put the centre of the window on the north pole. This makes the
calculations easier while keeping the generality of the results. Now, to do the
likelihood analysis one needs to find theoretical expressions for the correlations
between different $\tilde C^Z_\ell$ (Z = T, E, C).
5.2 Likelihood Analysis
For the temperature power spectrum a Gaussian likelihood ansatz was used
(chapter (4)). Because of the similarities between the kernels of the polarisation
power spectra and the temperature power spectrum this is also to be expected for
the polarisation spectra. I will now show the results of some Monte Carlo
simulations confirming this assumption. In this chapter I will assume that the B
component of polarisation is so small that it can be neglected. I will only
concentrate on the T, E and C components of polarisation.
5.2.1 The Form of the Likelihood Function for Polarisation
In figures (5.14) and (5.15) I have plotted the probability distribution of the
$\tilde C^E_\ell$ and $\tilde C^C_\ell$ from 10000 simulations. The probability
distribution (histogram) is plotted on top of a Gaussian (dashed line) with mean
and FWHM taken from the theoretical expressions to be derived in the next section.
In these simulations I used a 5 degree FWHM Gaussian Gabor window truncated at
$\theta_C = 3\sigma$. In figures (5.16) and (5.17) I show the results of similar
simulations with a 15 degree FWHM Gaussian Gabor window. As expected, the trend
is that the distributions get more and more Gaussian for higher multipoles and
for bigger windows. For the 15 degree FWHM window, the distribution is very close
to a Gaussian for the multipoles above $\ell = 50$.
Figure 5.14: The probability distribution of $\tilde C^E_\ell$ taken from 10000
simulations with a 5 degree FWHM Gaussian Gabor window truncated at
$\theta_C = 3\sigma$. The variable x is given as
$x = (\tilde C^E_\ell - \langle\tilde C^E_\ell\rangle)/\sqrt{\langle(\tilde C^E_\ell - \langle\tilde C^E_\ell\rangle)^2\rangle}$.
The dashed line is a Gaussian with the theoretical mean and standard deviation
of the $\tilde C^E_\ell$. The plot shows the $\tilde C^E_\ell$ distribution for
$\ell = 50$, $\ell = 200$, $\ell = 500$, and $\ell = 800$. The probabilities are
normalised such that the integral over x is 1.
Figure 5.15: Same as figure (5.14) for the temperature-polarisation cross spectra
$\tilde C^C_\ell$.
Figure 5.16: Same as figure (5.14) for a 15 degree window.
Figure 5.17: Same as figure (5.15) for a 15 degree window.
Figures (5.18) and (5.19) show the probability distribution for a tophat window
covering the same area on the sky as the Gaussian window used in figures
(5.16) and (5.17). This distribution is also very close to a Gaussian.
Figure 5.18: Same as figure (5.16) for a tophat window covering the same area
on the sky.
Figure 5.19: Same as figure (5.17) for a tophat window covering the same area
on the sky.
The previous plots have shown that a Gaussian likelihood ansatz for the
polarisation pseudo spectra seems to be a very good approximation provided that
the window is big enough. As for the temperature spectrum, the approximation
is no longer valid for the lowest multipoles, but as was shown for the temperature
power spectrum, this might only give rise to a very small downward bias for the
estimates of the lowest multipoles.
The form of the log-likelihood to minimize is therefore still
$$ L = d^T M^{-1} d + \ln\det M, \qquad (5.95) $$
where the data vector now consists of the temperature and polarisation power
spectra, $d = \{d^T, d^E, d^C\}$. Here the $d^Z$ vectors are given as
$$ d^Z_i = \tilde C^Z_{\ell_i} - \langle\tilde C^Z_{\ell_i}\rangle, \qquad (5.96) $$
where Z = T, E, C. Similarly the correlation matrix M will consist of blocks
$M_{ZZ'}$ defined as
$$ M_{ZZ',ij} = \langle\tilde C^Z_{\ell_i}\tilde C^{Z'}_{\ell_j}\rangle - \langle\tilde C^Z_{\ell_i}\rangle\langle\tilde C^{Z'}_{\ell_j}\rangle. \qquad (5.97) $$
This structure of the data vector and correlation matrix is shown in figure (5.20).
[Figure 5.20, schematic: data vector d with sub-vectors T, E, C; matrix M with
diagonal blocks <TT>, <EE>, <CC> and off-diagonal blocks <TE>, <TC>, <EC>.]
Figure 5.20: The figure shows the structure of the data vector d on the left hand
side and the correlation matrix M on the right hand side used for joint likelihood
estimation of temperature and polarisation power spectra.
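As an illustration of the log-likelihood form $L = d^T M^{-1} d + \ln\det M$ with the (T, E, C) block structure described above, here is a minimal sketch. It is not the thesis implementation: the function names, the Cholesky-based evaluation and the toy diagonal blocks are all assumptions made for the example.

```python
import numpy as np

def assemble_blocks(M_TT, M_EE, M_CC, M_TE, M_TC, M_EC):
    """Stack the blocks in the (T, E, C) ordering of the data vector."""
    return np.block([
        [M_TT,   M_TE,   M_TC],
        [M_TE.T, M_EE,   M_EC],
        [M_TC.T, M_EC.T, M_CC],
    ])

def gaussian_log_likelihood(d, M):
    """Return L = d^T M^{-1} d + ln det M for a symmetric positive-definite M."""
    chol = np.linalg.cholesky(M)             # M = chol @ chol.T
    y = np.linalg.solve(chol, d)             # y.y equals d^T M^{-1} d
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return float(y @ y + log_det)

# Toy example with 3 bins per spectrum (all numbers are made up).
n_bins = 3
eye = np.eye(n_bins)
M = assemble_blocks(2.0 * eye, 3.0 * eye, 4.0 * eye, 0.1 * eye, 0.2 * eye, 0.3 * eye)
d = np.ones(3 * n_bins)
L = gaussian_log_likelihood(d, M)
print(L)
```

Using a Cholesky factor avoids forming the inverse explicitly and gives the determinant for free, which is one common way to evaluate a Gaussian likelihood of this form.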
For fast likelihood estimation, it is crucial that one can calculate the average
pseudo spectra $\langle\tilde C^Z_\ell\rangle$ and the correlation matrix M
quickly. The formalism in the previous chapter which enabled fast calculations of
these quantities for the temperature power spectrum will now be extended to
polarisation.
5.2.2 The Polarisation Correlation Matrix
To find the correlation matrix M for likelihood estimation of the polarisation
power spectra one needs the formulae given in equations (5.41) to (5.48). As
shown there, the correlations of the pseudo $\tilde a_{\ell m}$ coefficients can be
written in terms of the $h(\ell,\ell',m)$ function from chapter (4) and the
$H_2(\ell,\ell',m)$ and $H_{-2}(\ell,\ell',m)$ functions. These functions can be
quickly calculated using the important recursion formulae deduced in appendix (D)
and (E). The starting points of these recursions can also be quickly provided
using summations and FFT as explained in chapter (4). I will now show that the
correlation matrix M can be expressed in terms of these functions and for this
reason can be calculated quickly.
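The pseudo power spectra can be written as
$$ \langle\tilde C^T_\ell\rangle = \sum_m \frac{\langle\tilde a^T_{\ell m}\tilde a^{T*}_{\ell m}\rangle}{2\ell+1}, \qquad (5.98) $$
$$ \langle\tilde C^E_\ell\rangle = \sum_m \frac{\langle\tilde a^E_{\ell m}\tilde a^{E*}_{\ell m}\rangle}{2\ell+1}, \qquad (5.99) $$
$$ \langle\tilde C^C_\ell\rangle = \sum_m \frac{\langle\tilde a^T_{\ell m}\tilde a^{E*}_{\ell m}\rangle}{2\ell+1}. \qquad (5.100) $$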
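To find the correlations between the $\tilde C_\ell$ for polarisation one can
follow the same steps as for the temperature correlation functions, and get,
$$ M_{EE,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{E,\ell m}\tilde a^*_{E,\ell' m}\rangle^2, \qquad (5.101) $$
$$ M_{BB,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{B,\ell m}\tilde a^*_{B,\ell' m}\rangle^2, \qquad (5.102) $$
$$ M_{CC,\ell\ell'} = \frac{1}{(2\ell+1)(2\ell'+1)} \sum_m \Big[ \langle\tilde a_{E,\ell m}\tilde a^*_{E,\ell' m}\rangle\langle\tilde a_{\ell m}\tilde a^*_{\ell' m}\rangle \qquad (5.103) $$
$$ \qquad\qquad + \langle\tilde a_{E,\ell m}\tilde a^*_{\ell' m}\rangle\langle\tilde a_{E,\ell' m}\tilde a^*_{\ell m}\rangle \Big], \qquad (5.104) $$
$$ M_{ET,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{E,\ell m}\tilde a^*_{\ell' m}\rangle^2, \qquad (5.105) $$
$$ M_{BT,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m |\langle\tilde a_{B,\ell m}\tilde a^*_{\ell' m}\rangle|^2, \qquad (5.106) $$
$$ M_{CT,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{E,\ell m}\tilde a^*_{\ell' m}\rangle\langle\tilde a_{\ell m}\tilde a^*_{\ell' m}\rangle, \qquad (5.107) $$
$$ M_{EB,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m |\langle\tilde a_{E,\ell m}\tilde a^*_{B,\ell' m}\rangle|^2, \qquad (5.108) $$
$$ M_{EC,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{E,\ell m}\tilde a^*_{E,\ell' m}\rangle\langle\tilde a_{E,\ell m}\tilde a^*_{\ell' m}\rangle, \qquad (5.109) $$
$$ M_{BC,\ell\ell'} = \frac{2}{(2\ell+1)(2\ell'+1)} \sum_m \langle\tilde a_{E,\ell' m}\tilde a^*_{B,\ell m}\rangle\langle\tilde a_{\ell' m}\tilde a^*_{B,\ell m}\rangle, \qquad (5.110) $$
where the correlations between the $\tilde a_{\ell m}$ are given in equations
(5.41) to (5.48) as sums of $h(\ell,\ell',m)$ and $H_{\pm 2}(\ell,\ell',m)$.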
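The structure of these sums can be sketched in a few lines. The example below is an assumption-laden illustration rather than the thesis code: it assumes the $\tilde a_{\ell m}$ correlations for an azimuthally symmetric window have already been computed and stored in a hypothetical array `corr[m, l, l']` (with m running from -lmax to lmax), and it evaluates the form of (5.101). The full-sky case, where the correlation is $C_\ell\,\delta_{\ell\ell'}$, is used as a check because it must reproduce the familiar cosmic variance $2C_\ell^2/(2\ell+1)$.

```python
import numpy as np

def pseudo_cl_covariance(corr):
    """M[l, l'] = 2 / ((2l+1)(2l'+1)) * sum_m |corr[m, l, l']|^2.

    corr has shape (2*lmax+1, lmax+1, lmax+1); entries with |m| > min(l, l')
    are expected to be zero.
    """
    lmax = corr.shape[1] - 1
    ell = np.arange(lmax + 1)
    norm = 2.0 / np.outer(2 * ell + 1, 2 * ell + 1)
    return norm * np.sum(np.abs(corr) ** 2, axis=0)

def full_sky_corr(cl):
    """Full-sky limit: <a_lm a*_l'm> = C_l delta_ll' for |m| <= l."""
    lmax = len(cl) - 1
    corr = np.zeros((2 * lmax + 1, lmax + 1, lmax + 1))
    for l in range(lmax + 1):
        for m in range(-l, l + 1):
            corr[m + lmax, l, l] = cl[l]
    return corr

cl = np.array([1.0, 0.5, 0.25, 0.125])    # toy spectrum (made-up values)
M = pseudo_cl_covariance(full_sky_corr(cl))
print(np.diag(M))
```

On a cut sky the array `corr` would instead be filled from the $h$ and $H_{\pm2}$ functions, and M acquires off-diagonal elements.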
In figure (5.21) I have plotted the correlation matrix $M_{TT}$ next to the
matrix $M_{EE}$. A standard CDM power spectrum without B mode polarisation was
used. The two matrices are very similar. One big difference is that the matrix
for E mode polarisation is missing the wall at low multipoles present in the
temperature matrix. As discussed before, this is because of the different shapes
of the T and E power spectra at low multipoles. The temperature power spectrum
drops steeply at low $\ell$ while this is not the case for the E mode polarisation
spectrum.
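Figure 5.21: The correlation matrices $M(\ell,\ell')$ in the figure show the
correlations between the temperature pseudo power spectrum coefficients and
between the E mode polarisation pseudo spectrum coefficients for a 15 degree FWHM
Gaussian Gabor window. The left plot shows
$(\langle\tilde C^T_\ell\tilde C^T_{\ell'}\rangle - \langle\tilde C^T_\ell\rangle\langle\tilde C^T_{\ell'}\rangle)/(\langle\tilde C^T_\ell\rangle\langle\tilde C^T_{\ell'}\rangle)$
and the right plot shows
$(\langle\tilde C^E_\ell\tilde C^E_{\ell'}\rangle - \langle\tilde C^E_\ell\rangle\langle\tilde C^E_{\ell'}\rangle)/(\langle\tilde C^E_\ell\rangle\langle\tilde C^E_{\ell'}\rangle)$.
A standard CDM power spectrum was used to produce the plots.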
In figures (5.22) and (5.23) the $M_{CC}$, $M_{TC}$, $M_{TE}$ and $M_{EC}$
matrices are shown. All matrices are diagonally dominant and the values on the
diagonals do not differ significantly between the matrices, so none of the blocks
can be neglected. For this reason all the matrices have to be computed and used
in the likelihood analysis.

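Figure 5.22: The correlation matrices $M(\ell,\ell')$ in the figure show the
correlations between the cross-correlation pseudo power spectrum coefficients C
and between the temperature and cross-correlation pseudo spectrum coefficients
for a 15 degree FWHM Gaussian Gabor window. The left plot shows
$(\langle\tilde C^C_\ell\tilde C^C_{\ell'}\rangle - \langle\tilde C^C_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)/(\langle\tilde C^C_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)$
and the right plot shows
$(\langle\tilde C^T_\ell\tilde C^C_{\ell'}\rangle - \langle\tilde C^T_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)/(\langle\tilde C^T_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)$.
A standard CDM power spectrum was used to produce the plots.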

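Figure 5.23: The correlation matrices $M(\ell,\ell')$ in the figure show the
correlations between the temperature and E mode polarisation pseudo spectrum
coefficients and between the E mode polarisation and cross-correlation pseudo
spectrum coefficients C for a 15 degree FWHM Gaussian Gabor window. The left plot
shows
$(\langle\tilde C^T_\ell\tilde C^E_{\ell'}\rangle - \langle\tilde C^T_\ell\rangle\langle\tilde C^E_{\ell'}\rangle)/(\langle\tilde C^T_\ell\rangle\langle\tilde C^E_{\ell'}\rangle)$
and the right plot shows
$(\langle\tilde C^E_\ell\tilde C^C_{\ell'}\rangle - \langle\tilde C^E_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)/(\langle\tilde C^E_\ell\rangle\langle\tilde C^C_{\ell'}\rangle)$.
A standard CDM power spectrum was used to produce the plots.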
5.2.3 Polarisation with Noise
Analogously to chapter (4) I will now discuss the noise pseudo power spectra and
the noise correlation matrix for polarisation. Each pixel in the temperature map
has a noise temperature $n_j$ and for the polarisation maps I assume $n^Q_j$ and
$n^U_j$ to have the following properties,
$$ \langle n_j\rangle = 0, \qquad \langle n_j n_{j'}\rangle = \delta_{jj'}(\sigma^T_j)^2, \qquad (5.112) $$
$$ \langle n^Q_j\rangle = 0, \qquad \langle n^Q_j n^Q_{j'}\rangle = \delta_{jj'}(\sigma^P_j)^2, \qquad (5.113) $$
$$ \langle n^U_i\rangle = 0, \qquad \langle n^U_i n^U_j\rangle = \delta_{ij}(\sigma^P_j)^2. \qquad (5.114) $$
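I also assume that there is no correlation between noise in the different maps T, Q
and U. For the full sky one has,
$$ \langle a^N_{\ell m} a^{N*}_{\ell m}\rangle = \sum_j (\sigma^T_j)^2\,|Y^j_{\ell m}|^2, \qquad (5.115) $$
$$ \langle a^N_{2,\ell m} a^{N*}_{2,\ell m}\rangle = 2\sum_j (\sigma^P_j)^2\,|{}_2Y^j_{\ell m}|^2, \qquad (5.116) $$
$$ \langle a^N_{-2,\ell m} a^{N*}_{-2,\ell m}\rangle = 2\sum_j (\sigma^P_j)^2\,|{}_{-2}Y^j_{\ell m}|^2, \qquad (5.117) $$
$$ \langle a^N_{2,\ell m} a^{N*}_{-2,\ell m}\rangle = 0, \qquad (5.118) $$
$$ \langle a^N_{E,\ell m} a^{N*}_{E,\ell m}\rangle = \frac{1}{2}\sum_j (\sigma^P_j)^2\left(|{}_2Y^j_{\ell m}|^2 + |{}_{-2}Y^j_{\ell m}|^2\right), \qquad (5.119) $$
$$ \langle a^N_{B,\ell m} a^{N*}_{B,\ell m}\rangle = \langle a^N_{E,\ell m} a^{N*}_{E,\ell m}\rangle, \qquad (5.120) $$
$$ \langle a^N_{E,\ell m} a^{N*}_{B,\ell m}\rangle = 0, \qquad (5.121) $$
which for this type of noise gives $C^{EN}_\ell = C^{BN}_\ell$. The factor 1/2
in (5.119) follows from (5.116) to (5.118) together with
$a_{E,\ell m} = -(a_{2,\ell m} + a_{-2,\ell m})/2$.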
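The pseudo $\tilde a_{\pm 2,\ell m}$ coefficients can now be found using equations
(5.1) and (5.2). I define
$$ \tilde a^N_{\pm 2,\ell m} = \sum_j (n^Q_j \pm i n^U_j)\,G_j\;{}_{\pm 2}Y^{j*}_{\ell m}, \qquad (5.122) $$
for an axisymmetric Gabor window G having the value $G_j$ in pixel j. The E
and B components are then similarly
$$ \tilde a^N_{E,\ell m} = -\frac{1}{2}\sum_j \left[ n^Q_j\left({}_2Y^{j*}_{\ell m} + {}_{-2}Y^{j*}_{\ell m}\right) + i n^U_j\left({}_2Y^{j*}_{\ell m} - {}_{-2}Y^{j*}_{\ell m}\right) \right] G_j, \qquad (5.123) $$
$$ \tilde a^N_{B,\ell m} = \frac{1}{2}\,i\sum_j \left[ n^Q_j\left({}_2Y^{j*}_{\ell m} - {}_{-2}Y^{j*}_{\ell m}\right) + i n^U_j\left({}_2Y^{j*}_{\ell m} + {}_{-2}Y^{j*}_{\ell m}\right) \right] G_j. \qquad (5.124) $$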
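The correlations between these coefficients are
$$ \langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle = \frac{1}{4}\sum_j \Big[ (\sigma^P_j)^2 \left({}_2Y^{j*}_{\ell m} + {}_{-2}Y^{j*}_{\ell m}\right)\left({}_2Y^{j}_{\ell' m'} + {}_{-2}Y^{j}_{\ell' m'}\right) $$
$$ \qquad + (\sigma^P_j)^2 \left({}_2Y^{j*}_{\ell m} - {}_{-2}Y^{j*}_{\ell m}\right)\left({}_2Y^{j}_{\ell' m'} - {}_{-2}Y^{j}_{\ell' m'}\right) \Big] G_j^2 \qquad (5.126)\text{-}(5.129) $$
$$ = \frac{1}{2}\left[ h_2(\ell,\ell',m,m') + (-1)^{m+m'} h^*_2(\ell,\ell',-m,-m') \right] \equiv H_2(\ell,\ell',m,m'), \qquad (5.130)\text{-}(5.131) $$
where the last line defines $H_2(\ell,\ell',m,m')$. The $h_2(\ell,\ell',m,m')$ is
defined similarly to the $h(\ell,\ell',m,m')$ function in chapter (4),
$$ h_2(\ell,\ell',m,m') = \sum_j G_j^2\,(\sigma^P_j)^2\;{}_2Y^{j*}_{\ell m}\;{}_2Y^{j}_{\ell' m'}. \qquad (5.132) $$
Note the following relation which was used to obtain equation (5.130),
$$ \sum_j G_j^2\,(\sigma^P_j)^2\;{}_{-2}Y^{j*}_{\ell m}\;{}_{-2}Y^{j}_{\ell' m'} = (-1)^{m+m'} h^*_2(\ell,\ell',-m,-m'). \qquad (5.133) $$
Again, one can see that when the Gabor window AND the noise have azimuthal
symmetry, only the terms with $m = m'$ survive and this reduces simply to
$$ h_2(\ell,\ell',m,m') = h_2(\ell,\ell',m). \qquad (5.134) $$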
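In a similar manner the other $\tilde a_{\ell m}$ relations can be found,
$$ \langle\tilde a^N_{B,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle = \langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle, \qquad (5.135) $$
$$ \langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle = \frac{1}{2}\,i\left[ h_2(\ell,\ell',m,m') - (-1)^{m+m'} h^*_2(\ell,\ell',-m,-m') \right] \equiv i H_{-2}(\ell,\ell',m,m'), \qquad (5.136)\text{-}(5.137) $$
where the last line again defines $H_{-2}(\ell,\ell',m,m')$. The
$H_{\pm 2}(\ell,\ell',m,m')$ functions which are needed to find the noise
correlation matrices can be quickly calculated using the recursion in appendix (E).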
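Using these relations one can now find the polarisation pseudo spectra
$$ \langle\tilde C^{EN}_\ell\rangle = \langle\tilde C^{BN}_\ell\rangle = \frac{1}{2\ell+1}\sum_m \langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell m}\rangle, \qquad (5.138) $$
$$ \langle\tilde C^{CN}_\ell\rangle = 0. \qquad (5.139) $$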
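One can further use this to find the noise correlation matrices
$M^N_{ZZ',\ell\ell'}$, defined as,
$$ M^N_{ZZ',\ell\ell'} = \langle\tilde C^{ZN}_\ell\tilde C^{Z'N}_{\ell'}\rangle - \langle\tilde C^{ZN}_\ell\rangle\langle\tilde C^{Z'N}_{\ell'}\rangle, \qquad (5.140) $$
where Z = T, E, B, C. Writing $N_{\ell\ell'} = (2\ell+1)(2\ell'+1)$ for the
normalisation (the same one as in the signal matrices (5.101) to (5.110)), I find,
$$ M^N_{TT,\ell\ell'} = \frac{2}{N_{\ell\ell'}}\sum_{mm'} |\langle\tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle|^2, \qquad (5.141) $$
$$ M^N_{EE,\ell\ell'} = \frac{2}{N_{\ell\ell'}}\sum_{mm'} |\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle|^2, \qquad (5.142) $$
$$ M^N_{BB,\ell\ell'} = \frac{2}{N_{\ell\ell'}}\sum_{mm'} |\langle\tilde a^N_{B,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle|^2, \qquad (5.143) $$
$$ M^N_{CC,\ell\ell'} = \frac{1}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle\langle\tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle, \qquad (5.144) $$
$$ M^N_{EB,\ell\ell'} = \frac{2}{N_{\ell\ell'}}\sum_{mm'} |\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle|^2, \qquad (5.145) $$
all other combinations are zero. I then find the total correlation matrix
$M_{ZZ'}$ consisting of both signal and noise. As for temperature, this is not
simply the sum of the correlation matrices for signal and noise; one also gets
cross terms. The final result is
$$ M_{TT,\ell\ell'} = M^S_{TT,\ell\ell'} + M^N_{TT,\ell\ell'} + \frac{4}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{\ell m}\tilde a^{S*}_{\ell' m'}\rangle\langle\tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle, \qquad (5.147)\text{-}(5.148) $$
$$ M_{EE,\ell\ell'} = M^S_{EE,\ell\ell'} + M^N_{EE,\ell\ell'} + \frac{4}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{E,\ell m}\tilde a^{S*}_{E,\ell' m'}\rangle\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle, \qquad (5.149)\text{-}(5.150) $$
$$ M_{BB,\ell\ell'} = M^S_{BB,\ell\ell'} + M^N_{BB,\ell\ell'} + \frac{4}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{B,\ell m}\tilde a^{S*}_{B,\ell' m'}\rangle\langle\tilde a^N_{B,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle, \qquad (5.151)\text{-}(5.152) $$
$$ M_{CC,\ell\ell'} = M^S_{CC,\ell\ell'} + M^N_{CC,\ell\ell'} + \frac{1}{N_{\ell\ell'}}\sum_{mm'} \Big[ \langle\tilde a^S_{E,\ell m}\tilde a^{S*}_{E,\ell' m'}\rangle\langle\tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle + \langle\tilde a^S_{\ell m}\tilde a^{S*}_{\ell' m'}\rangle\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle \Big], \qquad (5.153)\text{-}(5.155) $$
$$ M_{TE,\ell\ell'} = M^S_{TE,\ell\ell'}, \qquad (5.156) $$
$$ M_{TB,\ell\ell'} = M^S_{TB,\ell\ell'}, \qquad (5.157) $$
$$ M_{TC,\ell\ell'} = M^S_{TC,\ell\ell'} + \frac{2}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{\ell m}\tilde a^{S*}_{E,\ell' m'}\rangle\langle\tilde a^N_{\ell m}\tilde a^{N*}_{\ell' m'}\rangle, \qquad (5.158) $$
$$ M_{EB,\ell\ell'} = M^S_{EB,\ell\ell'} + M^N_{EB,\ell\ell'} + \frac{4}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{E,\ell m}\tilde a^{S*}_{B,\ell' m'}\rangle\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{B,\ell' m'}\rangle, \qquad (5.159)\text{-}(5.160) $$
$$ M_{EC,\ell\ell'} = M^S_{EC,\ell\ell'} + \frac{2}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{E,\ell m}\tilde a^{S*}_{\ell' m'}\rangle\langle\tilde a^N_{E,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle, \qquad (5.161) $$
$$ M_{BC,\ell\ell'} = M^S_{BC,\ell\ell'} + \frac{2}{N_{\ell\ell'}}\sum_{mm'} \langle\tilde a^S_{B,\ell m}\tilde a^{S*}_{\ell' m'}\rangle\langle\tilde a^N_{B,\ell m}\tilde a^{N*}_{E,\ell' m'}\rangle. \qquad (5.162) $$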
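The origin of the cross terms can be seen from a small numerical sketch: if the total correlation of the coefficients is the sum of a signal part and a noise part, the quadratic form in the correlations produces the mixed product. The arrays below are random, real-valued toy stand-ins for the $\langle\tilde a\tilde a^*\rangle$ correlations (an assumption made for simplicity), and `quad_sum` is a hypothetical helper, not a function from the thesis.

```python
import numpy as np

def quad_sum(x, y, ells):
    """sum_{m,m'} x * y / ((2l+1)(2l'+1)); x and y have shape (nm, nm, nl, nl)."""
    norm = np.outer(2 * ells + 1, 2 * ells + 1)
    return np.sum(x * y, axis=(0, 1)) / norm

rng = np.random.default_rng(1)
ells = np.arange(2, 6)
shape = (5, 5, ells.size, ells.size)
sig = rng.normal(size=shape)   # toy signal correlations <a^S a^S*> (made up)
noi = rng.normal(size=shape)   # toy noise correlations <a^N a^N*> (made up)

M_S = 2 * quad_sum(sig, sig, ells)                  # signal-only matrix
M_N = 2 * quad_sum(noi, noi, ells)                  # noise-only matrix
cross = 4 * quad_sum(sig, noi, ells)                # signal-noise cross term
M_total = 2 * quad_sum(sig + noi, sig + noi, ells)  # matrix of signal plus noise
print(np.max(np.abs(M_total - (M_S + M_N + cross))))
```

The printed difference is at rounding level, which mirrors the structure of the TT, EE and BB expressions above.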
5.3 Results of Likelihood Estimations
The likelihood estimation was carried out in the same way as for the temperature
power spectrum. The power spectra were estimated in bins defined as
$$ C^T_\ell = \frac{D^T_b}{\ell(\ell+1)}, \qquad \ell_b \le \ell < \ell_{b+1}, \qquad (5.164) $$
$$ C^E_\ell = \frac{D^E_b}{\ell(\ell+1)}, \qquad \ell_b \le \ell < \ell_{b+1}. \qquad (5.165) $$
A similar binning does not work for the temperature-polarisation cross
correlation power spectrum. The reason for this is the Schwarz inequality
$C^C_\ell \le \sqrt{C^T_\ell C^E_\ell}$. During likelihood maximization one must
make sure that the estimated value of $C^C_\ell$ never exceeds
$\sqrt{C^T_\ell C^E_\ell}$. The way I solved this problem was to estimate
$C^C_\ell/\sqrt{C^T_\ell C^E_\ell}$ under the constraint that this value never
exceeds 1. So the binning is then
$$ C^C_\ell = D^C_b\,\sqrt{C^T_\ell C^E_\ell} \qquad (5.167) $$
$$ = D^C_b\,\frac{\sqrt{D^T_b D^E_b}}{\ell(\ell+1)}, \qquad (5.168) $$
where as before $\ell_b \le \ell < \ell_{b+1}$.
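The binning above can be sketched as a small helper that expands the bin parameters into per-multipole spectra. This is an illustrative sketch only: the function name, the bin edges and the parameter values are hypothetical, and clipping $D^C_b$ to the interval [-1, 1] is my assumption for how the constraint could be enforced (the text only states that the ratio must not exceed 1).

```python
import numpy as np

def binned_spectra(D_T, D_E, D_C, bin_edges, lmax):
    """Expand bin parameters into C_l^T, C_l^E and C_l^C for bin_edges[0] <= l <= lmax."""
    ell = np.arange(bin_edges[0], lmax + 1)
    # Bin index b such that l_b <= l < l_{b+1}.
    b = np.searchsorted(bin_edges, ell, side="right") - 1
    ratio = np.clip(D_C[b], -1.0, 1.0)      # assumed way of keeping |C^C| <= sqrt(C^T C^E)
    cl_T = D_T[b] / (ell * (ell + 1))
    cl_E = D_E[b] / (ell * (ell + 1))
    cl_C = ratio * np.sqrt(cl_T * cl_E)
    return ell, cl_T, cl_E, cl_C

bin_edges = np.array([2, 10, 20, 31])        # three bins covering 2 <= l <= 30
D_T = np.array([1000.0, 2000.0, 1500.0])     # made-up bin values
D_E = np.array([0.1, 0.5, 1.0])
D_C = np.array([0.5, 1.7, -0.2])             # 1.7 violates the constraint and is clipped
ell, cl_T, cl_E, cl_C = binned_spectra(D_T, D_E, D_C, bin_edges, lmax=30)
```

Parametrising the cross spectrum by the ratio means the optimiser works with a bounded parameter, so the Schwarz inequality holds for every trial point.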
As an example I simulated a sky using $N_{side} = 512$ resolution in HEALPix and
a 10 arcminute beam. I added non-uniform noise to the map. A reasonable assumption
about the size of the noise deviations for polarisation is to take
$\sigma^P_j = \sqrt{2}\,\sigma^T_j$ (Zaldarriaga and Seljak 1997). This is what I
used. The noise level was set so that the signal to noise ratio for the
temperature power spectrum was always well above 1 below the maximum multipole
$\ell = 1024$, whereas for the E mode polarisation power spectrum it was mostly
below 1 (see figure (5.25)). This is close to the values expected for the Planck
143 GHz channel. For the analysis I used a 15 degree Gaussian Gabor window. The
result of one single estimation is shown in figure (5.24).
To test whether the method is biased or not, I did 60 Monte Carlo simulations.
The average of these simulations is shown in figure (5.25). The method seems to
be unbiased also for the estimates of the polarisation power spectra. Note that
the expected noise variance taken from the approximate analytic formula for
uniform noise (shaded areas on the plot) here fails to predict the size of the
error bars on the estimates. The expected variance taken from the inverse Fisher
matrix (dashed lines) fits better with the error bars from Monte Carlo. The reason
is that, as in chapter (4), I used a noise profile with noise increasing from the
centre of the disc down to the edges, opposite to the Gaussian window.
5.4 Discussion
The power spectrum estimation method developed in chapter (4) has been extended
to estimate polarisation power spectra. The method has been tested here under the
assumption that the B mode polarisation is negligible. In this case the method
appears to give unbiased estimates of the polarisation power spectra also in the
presence of non-uniform noise and a Gabor window.
The kernels connecting the full sky polarisation power spectra with the cut
sky polarisation pseudo power spectra were studied and found to be very similar
to the kernel for the temperature power spectrum. For this reason the effect of
a cut sky and a Gabor window on the polarisation power spectra is similar to
the effect on the temperature power spectrum. This explains why the method of
estimating the power spectrum from the pseudo power spectrum also worked for
polarisation.
One issue which has not been studied here is the inclusion of the B mode
polarisation. I demonstrated that the E and B mode polarisation power spectra
mix on the cut sky, making detections of the much weaker B component
difficult. Further work needs to be done in order to include the B component in
the likelihood analysis.
Figure 5.24: The result of a joint likelihood estimation of the temperature power
spectrum (upper plot) and the E (middle plot) and C (lower plot) polarisation
power spectra. The dotted line shows the full sky average spectrum. The
histogram shows the binned input pseudo spectrum without noise. The shaded
areas around the binned average full sky power spectrum (not shown) show the
expected deviations from the average using the approximate formula for uniform
noise. The bright shaded area shows the cosmic and sample variance only whereas
the dark shaded area also shows expected variance due to noise. The dots show
the estimate with $1\sigma$ error bars taken from the inverse Fisher matrix. In
the analysis a 15 degree FWHM Gaussian Gabor window with a $\theta_C = 3\sigma$
cutoff was used.
Figure 5.25: Same as figure (5.24) but the dots here are the average of 60
estimates from Monte Carlo simulations. The error bars are the average deviations
taken from the simulations. The dotted line shows the average full sky
spectrum. The shaded areas which are plotted around the binned full sky power
spectrum (not shown) show the variance taken from the approximate variance
formula for uniform noise. The dashed lines show the expected variance taken
from the inverse Fisher matrix.
Appendix A
Rotation Matrices
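A spherical function $T(\hat n)$ is rotated by the operator
$\hat D(\alpha\beta\gamma)$ where $\alpha\beta\gamma$ are the three Euler angles
for rotations (see T. Risbo 1996) and the inverse rotation is
$\hat D(-\gamma\,-\beta\,-\alpha)$. For the spherical harmonic functions, this
operator takes the form,
$$ Y_{\ell m}(\hat n') = \sum_{m'=-\ell}^{\ell} D^\ell_{m'm}(\alpha\beta\gamma)\,Y_{\ell m'}(\hat n), \qquad (A.1) $$
where $D^\ell_{m'm}$ has the form
$$ D^\ell_{m'm}(\alpha\beta\gamma) = e^{im\alpha}\,d^\ell_{m'm}(\beta)\,e^{im'\gamma}. \qquad (A.2) $$
Here $d^\ell_{m'm}(\beta)$ is a real coefficient with the following property:
$$ d^\ell_{m'm}(\beta) = d^\ell_{-m\,-m'}(\beta). \qquad (A.3) $$
The D-functions also have the following property:
$$ D^\ell_{m'm}(\alpha\beta\gamma) = \sum_{m''} D^\ell_{m''m}(\alpha_2\beta_2\gamma_2)\,D^\ell_{m'm''}(\alpha_1\beta_1\gamma_1), \qquad (A.4) $$
where $(\alpha\beta\gamma)$ is the result of the two consecutive rotations
$(\alpha_1\beta_1\gamma_1)$ and $(\alpha_2\beta_2\gamma_2)$.
The complex conjugate of the rotation matrices can be written as
$$ D^{\ell *}_{mm'} = (-1)^{m+m'}\,D^\ell_{(-m)(-m')}. \qquad (A.5) $$
See also Appendix (B).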
Appendix B
Spin-s Harmonics
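The spherical harmonic functions $Y_{\ell m}(\hat n)$ can be generalized to spin-s
harmonics using the rotation matrices in Appendix (A). The general definition is
$$ D^\ell_{-sm}(\phi_2,\theta,\phi_1) = \sqrt{\frac{4\pi}{2\ell+1}}\;{}_sY_{\ell m}(\theta,\phi_2)\,e^{-is\phi_1}, \qquad (B.1) $$
or in the form which will be mostly used in this thesis
$$ {}_sY_{\ell m}(\theta,\phi) = \sqrt{\frac{2\ell+1}{4\pi}}\;D^\ell_{-sm}(\phi,\theta,0). \qquad (B.2) $$
The spin-s harmonics have the orthogonality and completeness relations given by
$$ \int d\hat n\;{}_sY^*_{\ell m}(\hat n)\;{}_sY_{\ell' m'}(\hat n) = \delta_{\ell\ell'}\,\delta_{mm'}, \qquad (B.3) $$
$$ \sum_{\ell m} {}_sY^*_{\ell m}(\hat n)\;{}_sY_{\ell m}(\hat n') = \delta(\hat n - \hat n'). \qquad (B.4) $$
The complex conjugate of the spin harmonics can be written
$$ {}_sY^*_{\ell m}(\hat n) = (-1)^{s+m}\;{}_{-s}Y_{\ell(-m)}(\hat n). \qquad (B.5) $$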
Appendix C
Some Wigner Symbol Relations
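Throughout the thesis, the Wigner 3j symbols are used frequently. Here are
some relations for these symbols which are used. The orthogonality relation is,
$$ \sum_{mm'} \begin{pmatrix} \ell & \ell' & \ell'' \\ m & m' & m'' \end{pmatrix} \begin{pmatrix} \ell & \ell' & L \\ m & m' & M \end{pmatrix} = (2\ell''+1)^{-1}\,\delta_{\ell'' L}\,\delta_{m'' M}. \qquad (C.1) $$
The Wigner 3j symbols can be represented as an integral of rotation matrices
(see Appendix (A)),
$$ \frac{1}{8\pi^2}\int d\cos\beta\,d\alpha\,d\gamma\;D^{\ell}_{m_1 m_1'}\,D^{\ell'}_{m_2 m_2'}\,D^{\ell''}_{m_3 m_3'} = \begin{pmatrix} \ell & \ell' & \ell'' \\ m_1 & m_2 & m_3 \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ m_1' & m_2' & m_3' \end{pmatrix}. \qquad (C.2) $$
This expression can be reduced to,
$$ \int d\hat n\;Y_{\ell m}(\hat n)\,Y_{\ell' m'}(\hat n)\,Y_{\ell'' m''}(\hat n) = \sqrt{\frac{(2\ell+1)(2\ell'+1)(2\ell''+1)}{4\pi}} \begin{pmatrix} \ell & \ell' & \ell'' \\ m & m' & m'' \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix}. \qquad (C.3) $$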
Appendix D
Recurrence Relation
It is important for the precalculations to the likelihood analysis that the
calculation of $h(\ell,\ell',m)$ is fast. For this reason a recurrence relation
for $h(\ell,\ell',m)$ would be helpful. To speed up the calculation of the noise
correlation matrix for non-axisymmetric noise, it would also help if one had a
more general recurrence relation for $h(\ell,\ell',m,m')$. I will now show how to
find such a recurrence for these objects, which I now call $A^{m'm}_{\ell'\ell}$
to simplify notation (and for the notation to comply with (Wandelt, Hivon, and
Górski 2000)). The definition is again,
$$ A^{m'm}_{\ell'\ell} = \int_a^b G(\hat n)\,Y^*_{\ell m}(\hat n)\,Y_{\ell' m'}(\hat n)\,d\hat n, \qquad (D.1) $$
where $G(\hat n) = G(\theta,\phi)$ is a general function and $Y_{\ell m}$ are the
spherical harmonics, which can be factorised into one part dependent on $\theta$
and one dependent on $\phi$ in the following way,
$$ Y_{\ell m}(\theta,\phi) = \lambda_{\ell m}(\cos\theta)\,e^{im\phi}. \qquad (D.2) $$
Now writing,
$$ A^{m'm}_{\ell'\ell} = \int d\cos\theta\;\lambda_{\ell m}(\cos\theta)\,\lambda_{\ell' m'}(\cos\theta) \int d\phi\;e^{-i(m-m')\phi}\,G(\theta,\phi) \qquad (D.3) $$
$$ \equiv \int d\cos\theta\;\lambda_{\ell m}(\cos\theta)\,\lambda_{\ell' m'}(\cos\theta)\,F_{m'm}(\theta), \qquad (D.4) $$
where $F_{m'm}(\theta)$ is simply the Fourier transform of the window at each
$\theta$. The quantities $A^{m'm}_{\ell'\ell}$ and $F_{m'm}(\theta)$ are in general
complex quantities obeying,
$$ A^{m'm}_{\ell'\ell} = (A^{mm'}_{\ell\ell'})^*, \qquad F_{m'm}(\theta) = (F_{mm'}(\theta))^*. \qquad (D.5) $$
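The $A^{m'm}_{\ell'\ell}$ can be expressed as
$$ A^{m'm}_{\ell'\ell} = \frac{\sqrt{(2\ell'+1)(2\ell+1)}}{2}\,\sqrt{\frac{(\ell'-m')!\,(\ell-m)!}{(\ell'+m')!\,(\ell+m)!}}\;I^{m'm}_{\ell'\ell}, \qquad (D.6) $$
where $I^{m'm}_{\ell'\ell}$ is defined as:
$$ I^{m'm}_{\ell'\ell} = \int_a^b F_{m'm}(x)\,P^m_\ell(x)\,P^{m'}_{\ell'}(x)\,dx. \qquad (D.7) $$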
The following relation for the Legendre polynomials will be used:
$$ xP^m_\ell = \frac{\ell-m+1}{2\ell+1}\,P^m_{\ell+1} + \frac{\ell+m}{2\ell+1}\,P^m_{\ell-1}. \qquad (D.8) $$
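A quick numerical check of relation (D.8) is easy to write for the special case m = 0, where the associated Legendre functions reduce to ordinary Legendre polynomials. The sketch below uses NumPy's Legendre module; restricting to m = 0 and the helper name `legendre_P` are choices made for this example only.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_P(ell, x):
    """Evaluate the Legendre polynomial P_ell at the points x."""
    coeffs = np.zeros(ell + 1)
    coeffs[ell] = 1.0                     # select the single term P_ell
    return legendre.legval(x, coeffs)

x = np.linspace(-1.0, 1.0, 11)
ell, m = 5, 0                             # m = 0 only in this sketch
lhs = x * legendre_P(ell, x)
rhs = ((ell - m + 1) * legendre_P(ell + 1, x)
       + (ell + m) * legendre_P(ell - 1, x)) / (2 * ell + 1)
print(np.max(np.abs(lhs - rhs)))
```

The printed maximum difference is at rounding level. The same three-term structure is what makes the recurrence for the coupling integrals possible.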
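I now define the object $X^{m'm}_{\ell'\ell}$ as
$$ X^{m'm}_{\ell'\ell} = \int_a^b F_{m'm}(x)\,x\,P^m_\ell\,P^{m'}_{\ell'}\,dx. \qquad (D.9) $$
Using relation (D.8) in this definition, one gets,
$$ X^{m'm}_{\ell'\ell} = \frac{\ell-m+1}{2\ell+1}\,I^{m'm}_{\ell'(\ell+1)} + \frac{\ell+m}{2\ell+1}\,I^{m'm}_{\ell'(\ell-1)}. \qquad (D.10) $$
One can also exchange $(\ell,\ell')$ and $(m,m')$ to get
$$ X^{mm'}_{\ell\ell'} = \frac{\ell'-m'+1}{2\ell'+1}\,I^{mm'}_{\ell(\ell'+1)} + \frac{\ell'+m'}{2\ell'+1}\,I^{mm'}_{\ell(\ell'-1)}. \qquad (D.11) $$
Taking the complex conjugate of the first expression and subtracting the last,
one has
$$ (X^{m'm}_{\ell'\ell})^* - X^{mm'}_{\ell\ell'} = 0 = \frac{\ell-m+1}{2\ell+1}\,(I^{m'm}_{\ell'(\ell+1)})^* + \frac{\ell+m}{2\ell+1}\,(I^{m'm}_{\ell'(\ell-1)})^* \qquad (D.12) $$
$$ \qquad - \frac{\ell'-m'+1}{2\ell'+1}\,I^{mm'}_{\ell(\ell'+1)} - \frac{\ell'+m'}{2\ell'+1}\,I^{mm'}_{\ell(\ell'-1)}. \qquad (D.13) $$
Then setting $\ell' \to \ell'-1$ and using $I^{mm'}_{\ell\ell'} = (I^{m'm}_{\ell'\ell})^*$ one gets:
$$ I^{m'm}_{\ell'\ell} = \frac{2\ell'-1}{\ell'-m'} \left[ \frac{\ell-m+1}{2\ell+1}\,I^{m'm}_{(\ell'-1)(\ell+1)} + \frac{\ell+m}{2\ell+1}\,I^{m'm}_{(\ell'-1)(\ell-1)} - \frac{\ell'+m'-1}{2\ell'-1}\,I^{m'm}_{(\ell'-2)\ell} \right]. \qquad (D.14) $$
Using equation (D.6), one can express this as
$$ A^{m'm}_{\ell'\ell} = \frac{1}{\sqrt{\ell'^2-m'^2}} \Bigg[ \sqrt{\frac{(4\ell'^2-1)\,((\ell+1)^2-m^2)}{(2\ell+1)(2\ell+3)}}\;A^{m'm}_{(\ell'-1)(\ell+1)} $$
$$ \qquad + \sqrt{\frac{(4\ell'^2-1)\,(\ell^2-m^2)}{4\ell^2-1}}\;A^{m'm}_{(\ell'-1)(\ell-1)} - \sqrt{\frac{(2\ell'+1)\,((\ell'-1)^2-m'^2)}{2\ell'-3}}\;A^{m'm}_{(\ell'-2)\ell} \Bigg], \qquad (D.15) $$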
which is the final recurrence relation. The $A^{m'm}_{m'\ell}$ elements must be
provided before the recurrence is started. Then for each $(m,m')$, set
$\ell' = m'+1$ and let $\ell$ go from $\ell'$ and upwards, then set
$\ell' = m'+2$ and again let $\ell$ go from $\ell'$ and upwards. Continue to the
desired size of $\ell'$. Note that, in order to get all objects up to
$A^{m'm}_{\ell_{max}\ell_{max}}$ one needs to go up to $\ell = 2\ell_{max}$ all
the time during the recursion. This is because of the
$A^{m'm}_{(\ell'-1)(\ell+1)}$ term, which all the time demands an object indexed
$(\ell+1)$ in the previous $\ell'$ row.
To start the recurrence, one can precompute the $A^{m'm}_{m'\ell}$ factors fast
and easily using FFT and a sum over rings on the grid. For example, for the
HEALPix grid I did it in the following way,
$$ A^{m'm}_{m'\ell} = \sum_r \lambda^r_{m'm'}\,\lambda^r_{\ell m} \sum_{j=0}^{N_r-1} e^{-2\pi i j (m-m')/N_r}\,G_{rj}, \qquad (D.16) $$
where the last part is the Fourier transform of the Gabor window, calculated by
FFT, r is the ring number on the grid and j is the azimuthal position on each
ring. Ring r has $N_r$ pixels.
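The inner sum over pixels on one ring is a discrete Fourier transform, so all values of $m-m'$ can be obtained from a single FFT per ring. The sketch below checks this for one toy ring. It assumes the negative-exponent sign convention, which matches `numpy.fft.fft`; the ring values are random placeholders and the function names are hypothetical, not HEALPix routines.

```python
import numpy as np

def ring_sum_direct(g_ring, dm):
    """Direct evaluation of sum_j exp(-2 pi i j dm / N_r) * G_rj for one ring."""
    n_r = g_ring.size
    j = np.arange(n_r)
    return np.sum(np.exp(-2j * np.pi * j * dm / n_r) * g_ring)

def ring_sum_fft(g_ring):
    """All frequencies at once; entry k holds the sum for dm = k modulo N_r."""
    return np.fft.fft(g_ring)

rng = np.random.default_rng(2)
g_ring = rng.random(16)            # toy window values G_rj on a ring with N_r = 16
spectrum = ring_sum_fft(g_ring)
dm = -3                            # example value of m - m'
direct = ring_sum_direct(g_ring, dm)
via_fft = spectrum[dm % g_ring.size]
print(abs(direct - via_fft))
```

Negative values of $m-m'$ are read off by wrapping the index modulo $N_r$, as done in the last lines of the example.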
It turns out that the recurrence can be numerically unstable depending on the
window and multipole, and in order to avoid problems I (using double precision
numbers) restart the recurrence with a new set of precomputed
$A^{m'm}_{\ell'\ell}$ for every 50th $\ell'$ row. However, for some windows and
multipoles the recurrence can run for hundreds of $\ell'$ rows without problems.
Appendix E
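Extension of the Recurrence
Relation to Polarisation
For fast calculations of correlation matrices for polarisation, it would be useful
to have a recurrence relation for
$$ h_2(\ell,\ell',m,m') \equiv \int d\hat n\;G(\hat n)\;{}_2Y^*_{\ell m}(\hat n)\;{}_2Y_{\ell' m'}(\hat n), \qquad (E.1) $$
similar to the one for $h(\ell,\ell',m,m')$ in Appendix (D). Again I simplify the
notation by calling the function $A^{\ell' m'}_{\ell m}$. Separating the spin-2
harmonic one can write
$$ {}_2Y_{\ell m}(\theta,\phi) = {}_2\lambda_{\ell m}(\cos\theta)\,e^{im\phi}. \qquad (E.2) $$
In this way one can write $A^{\ell' m'}_{\ell m}$ as
$$ A^{\ell' m'}_{\ell m} = \int dx\;{}_2\lambda_{\ell m}(x)\;{}_2\lambda_{\ell' m'}(x) \underbrace{\int_0^{2\pi} d\phi\;G(\theta,\phi)\,e^{-i(m-m')\phi}}_{F_{mm'}(x)}, \qquad (E.3) $$
where $x = \cos\theta$. As before I define
$$ X^{\ell' m'}_{\ell m} = \int dx\;x\;{}_2\lambda_{\ell m}(x)\;{}_2\lambda_{\ell' m'}(x)\,F_{mm'}(x), \qquad (E.4) $$
where obviously $F_{mm'}(x)^* = F_{m'm}(x)$ and therefore
$X^{\ell' m'}_{\ell m} = (X^{\ell m}_{\ell' m'})^*$. The next step is to use a
recurrence relation for spin-2 harmonics
$$ x\;{}_2\lambda_{\ell m}(x) = p_{\ell m}\;{}_2\lambda_{(\ell+1)m}(x) - q_{\ell m}\;{}_2\lambda_{\ell m}(x) + p_{(\ell-1)m}\;{}_2\lambda_{(\ell-1)m}(x), \qquad (E.5) $$
where
$$ p_{\ell m} = \frac{1}{\ell+1}\sqrt{\frac{((\ell+1)^2-m^2)\,((\ell+1)^2-4)}{4(\ell+1)^2-1}}, \qquad (E.6) $$
$$ q_{\ell m} = \frac{2m}{\ell(\ell+1)}. \qquad (E.7) $$
In this way one has
$$ X^{\ell' m'}_{\ell m} = p_{\ell m}\,A^{\ell' m'}_{(\ell+1)m} - q_{\ell m}\,A^{\ell' m'}_{\ell m} + p_{(\ell-1)m}\,A^{\ell' m'}_{(\ell-1)m}, \qquad (E.8) $$
$$ X^{\ell m}_{\ell' m'} = p_{\ell' m'}\,A^{\ell m}_{(\ell'+1)m'} - q_{\ell' m'}\,A^{\ell m}_{\ell' m'} + p_{(\ell'-1)m'}\,A^{\ell m}_{(\ell'-1)m'}. \qquad (E.9) $$
Subtracting the complex conjugate of equation (E.9) from equation (E.8), the left
side is zero, and after setting $\ell' \to \ell'-1$ one is left with
$$ A^{\ell' m'}_{\ell m} = \frac{1}{p_{(\ell'-1)m'}} \Big[ p_{\ell m}\,A^{(\ell'-1)m'}_{(\ell+1)m} + (q_{(\ell'-1)m'} - q_{\ell m})\,A^{(\ell'-1)m'}_{\ell m} \qquad (E.10) $$
$$ \qquad + p_{(\ell-1)m}\,A^{(\ell'-1)m'}_{(\ell-1)m} - p_{(\ell'-2)m'}\,A^{(\ell'-2)m'}_{\ell m} \Big]. \qquad (E.11) $$
This is the recursion formula. All the $A^{\ell' m'}_{\ell m}$ can be found using
the same scheme as for the scalar harmonics described in appendix (D). This
equation together with equation (D.15) are two of the main results in the thesis.
These formulae allow fast calculations of the couplings between scalar and tensor
harmonics on a cut sphere.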
Bibliography
Baccigalupi, C. et al. (2000). Neural networks and the separation of cosmic
microwave background and astrophysical signals in sky maps. MNRAS 318,
769.
Bardeen, J. M. (1980). Gauge-invariant cosmological perturbations. Phys. Rev.
D 22, 1882.
Bartlett, J. G., M. Douspis, A. Blanchard, and M. L. Dour (2000). An ap-
proximation to the likelihood function for band-power estimates of cmb
anisotropies. A&ASS 146, 507.
Bennett, C. L. et al. (1996). Four-year cobe dmr cosmic microwave background
observations: Maps and basic results. ApJ 464, L1.
Bennett, C. L. et al. (1992). Preliminary separation of galactic and cosmic
microwave emission for the cobe differential microwave radiometer. ApJ 396,
L7.
Bennett, C. L. et al. (1994). Cosmic temperature fluctuations from two years
of cobe differential microwave radiometers observations. ApJ 436, 423.
Benoit, A. et al. (2001). Archeops: A high resolution, large sky coverage balloon
experiment for mapping cmb anisotropies. astro-ph/0106152.
Bernardis, P. D. et al. (2000). A flat universe from high-resolution maps of the
cosmic microwave background radiation. Nature 404, 955.
Bersanelli, M. et al. (1996). Cobras/samba: report on the phase a study.
Bond, J. R., G. Efstathiou, and M. Tegmark (1997). Forecasting cosmic pa-
rameter errors from microwave background anisotropy experiments. MN-
RAS 291, L33.
Bond, J. R., A. H. Jaffe, and L. Knox (1998). Estimating the power spectrum
of the cosmic microwave background. Phys. Rev. D 57, 2117.
Bond, J. R., A. H. Jaffe, and L. Knox (2000). Radical compression of cosmic
microwave background data. ApJ 533, 19.
Bouchet, F. R. and R. Gispert (1999). Foregrounds and cmb experiments i.
semi-analytical estimates of contamination. NewA 4, 443.
Chandrasekhar, S. (1960). Radiative Transfer. Dover, New York.
Chiueh, T. and C. Ma (2001). The annulus-filtered e and b modes in the cmbr
polarisation. astro-ph/0101205.
Coles, P. and F. Lucchin (1995). Cosmology. John Wiley & Sons.
Delabrouille, J. (1998). Analysis of the accuracy of a destriping method for
future cosmic microwave background mapping with the planck surveyor
satellite. A&ASS 127, 555.
Dore, O. et al.
Dore, O., L. Knox, and A. Peel (2001). Cmb power spectrum estimation via
hierarchical decomposition. astro-ph/0104443.
Ferreira, P. G. and A. H. Jaffe (2000). Simultaneous estimation of noise and
signal in the cosmic microwave background experiments. MNRAS 312, 89.
Gabor, D. (1946). J. Inst. Elect. Eng. 93, 429.
Górski et al. Homepage: http://www.eso.org/science/healpix/.
Górski, K. M. (1994). On determining the spectrum of primordial inhomogeneity
from the cobe dmr sky maps: method. ApJ 430, L85.
Górski, K. M. et al. (1996). Power spectrum of primordial inhomogeneity
determined from the four-year cobe dmr sky maps. ApJ 464, L11.
Górski, K. M. and F. K. Hansen (2002). Cmb power spectrum estimation by
combining multiple patches on the sky. in prep.
Hanany, S. et al. (2000). Maxima-1: A measurement of the cosmic microwave
background anisotropy on angular scales of 10 arcminutes to 5 degrees.
ApJ 545, L5.
Haslam, C. G. T. et al. (1981). A 408 mhz all-sky continuum survey. i - obser-
vations at southern declinations and for the north polar region. A&A 100,
209.
Hivon, E. et al. (2001). Master of the cmb anisotropy power spectrum: A
fast method for statistical analysis of large and complex cmb data sets.
astro-ph/0105302.
Hobson, M. P. et al. (1999). The effect of point sources on satellite observations
of the cosmic microwave background. MNRAS 306, 232.
Hobson, M. P., A. W. Jones, A. N. Lasenby, and F. R. Bouchet (1998).
Foreground separation methods for satellite observations of the cosmic mi-
crowave background. MNRAS 300, 1.
Hobson, M. P. and J. Magueijo (1996). MNRAS 283, 1133.
Hu, W. and N. Sugiyama (1995). Anisotropies in the cosmic microwave back-
ground. ApJ 444, 489.
Jaffe, A. H. et al. (2001). Cosmology from maxima-1, boomerang, and cobe
dmr cosmic microwave background observations. Phys. Rev. Lett. 86, 3475.
Jungman, G., M. Kamionkowski, A. Kosowsky, and D. N. Spergel (1996).
Cosmological-parameter determination with microwave background maps.
Phys. Rev. D 54, 1332.
Kamionkowski, M., A. Kosowsky, and A. Stebbins (1997). Statistics of cosmic
microwave background polarization. Phys. Rev. D 55, 7368.
Kogut, A. et al. (1996). High-latitude galactic emission in the cobe differential
microwave radiometer 2 year sky maps. ApJ 460, 1.
Kolb, E. W. and M. S. Turner (1990). The Early Universe. Addison-Wesley
Publishing Company.
Lewis, A., A. Challinor, and N. Turok (2001). Analysis of cmb polarisation on
an incomplete sky. astro-ph/0106536.
Ma, C. P. and E. Bertschinger (1995). Cosmological perturbation theory in the
synchronous and conformal newtonian gauges. ApJ 455, 7.
Maino, D. et al. (1999). The planck-lfi instrument: Analysis of the 1/f noise
and implications for the scanning strategy. A&ASS 140, 383.
Maino, D. et al. (2001). All-sky astrophysical component separation with fast
independent component analysis (fastica). astro-ph/0108362.
Mauskopf, P. D. et al. (2000). Measurement of a peak in the cosmic
microwave background power spectrum from the north american test flight
of boomerang. ApJ 536, L59.
Melchiorri, A. et al. (2000). Measurement of a peak in the cosmic
microwave background power spectrum from the north american test flight
of boomerang. ApJ 536, L63.
Myers, S. T., J. E. Baker, A. C. S. Readhead, and E. M. Leitch (1997).
Measurements of the sunyaev-zeldovich effect in the nearby clusters a478, a2142,
and a2256. ApJ 485, 1.
Narlikar, J. V. and T. Padmanabhan (1991). Inflation for astronomers.
ARAA 29, 325.
Natoli, P. et al. (2001). Non-iterative methods to estimate the in-flight noise
properties of cmb detectors. astro-ph/0110508.
Natoli, P., G. de Gasperis, C. Gheller, and N. Vittorio (2001). A map-making
algorithm for the planck surveyor. A&A 372, 346.
Netterfield, C. B. et al. (2000). A measurement by boomerang of multiple
peaks in the angular power spectrum of the cosmic microwave background.
astro-ph/0104460.
Oh, S. P., D. N. Spergel, and G. Hinshaw (1999). An efficient technique to
determine the power spectrum from cosmic microwave background sky maps.
ApJ 510, 551.
Oliveira-Costa, A. D. et al. (1997). Galactic microwave emission at degree
angular scales. ApJ 482, 17.
Ostriker, J. P. and E. T. Vishniac (1986). Generation of microwave background
fluctuations from nonlinear perturbations at the era of galaxy formation.
ApJ 306, L51.
Padmanabhan, T. (1999). Aspects of gravitational clustering. astro-
ph/9911374.
Parratt, L. G. (1961). Probability and experimental errors in science. John
Wiley and Sons, Inc.
Persi, F., D. N. Spergel, R. Cen, and J. P. Ostriker (1995). Hot gas in super-
clusters and microwave background distortions. ApJ 442, 1.
Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery (1992).
Numerical Recipes. Cambridge University Press.
Prunet, S. et al. (2001). Error estimation for the map experiment. A&A 373, L13.
Prunet, S., C. B. Netterfield, E. Hivon, and B. P. Crill (2000). Iterative
map-making for scanning experiments. astro-ph/0006052.
Rees, M. J. and D. W. Sciama (1968). Nature 217, 511.
Reich, P. and W. Reich (1988). A map of spectral indices of the galactic radio
continuum emission between 408 mhz and 1420 mhz for the entire northern
sky. A&AS 74, 7.
Reynolds, R. J. (1984). A measurement of the hydrogen recombination rate in
the diffuse interstellar medium. ApJ 282, 191.
Reynolds, R. J. (1992). The optical emission-line background and accompany-
ing emissions at ultraviolet, infrared, and millimeter wavelengths. ApJ 392,
L35.
Sachs, R. K. and A. M. Wolfe (1967). Perturbations of a cosmological model and
angular variations of the microwave background. ApJ 147, 73.
Seljak, U. (1996). Gravitational lensing effect on cosmic microwave background
anisotropies: A power spectrum approach. ApJ 463, 1.
Seljak, U. and M. Zaldarriaga (1996). A line of sight approach to cosmic
microwave background anisotropies. ApJ 469, 437.
Silk, J. (1968). Cosmic black-body radiation and galaxy formation. ApJ 151,
459.
Stolyarov, V., M. P. Hobson, M. A. J. Ashdown, and A. N. Lasenby (2001).
All-sky component separation for the planck mission. astro-ph/0105432.
Strohmer, T. (1997). Proc. SampTA - Sampling Theory and Applications,
Aveiro/Portugal , 297.
Sunyaev, R. A. and Y. B. Zeldovich (1980). The velocity of clusters of galaxies
relative to the microwave background. the possibility of its measurement.
MNRAS 190, 413.
Szapudi, I. et al. (2001). Fast cosmic microwave background analyses via cor-
relation functions. ApJ 548, L115.
Szapudi, I., S. Prunet, and S. Colombi (2001). Fast analysis of inhomogenous
megapixel cosmic microwave background maps. ApJ 561, L11.
Taylor, A. C. (2001). The very small array. astro-ph/0109343.
Tegmark, M. (1996). A method for extracting maximum resolution power spec-
tra from microwave sky maps. MNRAS 280, 299.
Tegmark, M. (1997a). How to make maps from the cosmic microwave back-
ground data without losing information. ApJ 480, L87.
Tegmark, M. (1997b). How to measure cmb power spectra without losing in-
formation. Phys.Rev.D 55, 5895.
Tegmark, M. et al. (1997). A high-resolution map of the cosmic microwave
background around the north celestial pole. ApJ 474, 77.
Tegmark, M. and G. Efstathiou (1996). A method for subtracting foregrounds
from multifrequency cmb sky maps. MNRAS 281, 1297.
Tegmark, M., A. N. Taylor, and A. F. Heavens (1997). Karhunen-Loève
eigenvalue problems in cosmology: How should we tackle large data sets?
ApJ 480, 22.
Tenorio, L. (1999). Applications of wavelets to the analysis of cosmic microwave
background maps. MNRAS 310, 823.
Toffolatti, L. et al. (1998). Extragalactic source counts and contributions to
the anisotropies of the cosmic microwave background: predictions for the
planck surveyor mission. MNRAS 297, 117.
Turner, M. S., M. White, and J. E. Lidsey (1993). Tensor perturbations in
inflationary models as a probe of cosmology. Phys. Rev. D 48, 4613.
Vishniac, E. T. (1987). Reionization and small-scale fluctuations in the
microwave background. ApJ 322, 597.
Wandelt, B. D. (2000). Advanced methods for cosmic microwave background
data analysis: the big N^3 and how to beat it. astro-ph/0012416.
Wandelt, B. D. and K. M. Górski (2000). Fast convolution on the sphere.
astro-ph/0008227.
Wandelt, B. D. and F. K. Hansen (2000). Fast, exact cmb power spectrum
estimation for a certain class of observational strategies. astro-ph/0106515.
Wandelt, B. D., E. Hivon, and K. M. Górski (2000). The pseudo-C_l method:
cosmic microwave background anisotropy power spectrum statistics for high
precision cosmology. astro-ph/0008111.
Watson, G. S. (2000). An exposition on inflationary cosmology.
astro-ph/0005003.
Wright, E. L. (1996). Scanning and mapping strategies for cmb experiments.
Proceedings of the IAS CMB Workshop, astro-ph/9612006.
Zaldarriaga, M. and D. D. Harari (1995). Analytic approach to the polarization
of the cosmic microwave background in flat and open universes. Phys. Rev.
D 52, 3276.
Zaldarriaga, M. and U. Seljak (1997). An all-sky analysis of the polarisation
in the microwave background. Phys. Rev. D 55, 1830.
Zeldovich, Y. B. and R. A. Sunyaev (1969). Ap&SS 4, 301.
Acknowledgements
A big thanks to Kris Górski for excellent supervision of my Ph.D. thesis, for
always motivating me and giving suggestions to overcome problems, and for always
being available for questions and discussions during these 3 years.
A big thanks also goes to Ben Wandelt for supervising me in a part of the thesis,
for being a great teacher in CMB analysis and all other parts of physics, for giving
me a lot of inspiration and for all the long nights of fruitful discussions.
Thanks to Simon White for making it possible for me to do my Ph.D. work at
the Max Planck Institute for Astrophysics.
Thanks to Anthony Banday and Eric Hivon for discussions and helpful sugges-
tions.
Cristina: Thanks a lot for your patience during the last weeks of writing my
Ph.D. thesis. I had far too little time for you... And thanks for all your
support, energy and motivation.
To my parents: Thanks for always supporting me and for always accepting my
decisions.
To all my friends in Munich: Thanks a lot for making Munich such a nice place
to live.
To all my colleagues at MPA: Thanks for making MPA a great place to work.
And finally I want to thank the Norwegian Research Council (Norges
Forskningsråd) for awarding me a Ph.D. grant without which this work would have
been impossible.
Curriculum Vitae
Personal Data
Name: Frode Kristian Hansen
Born: 18.04.1974 in Tønsberg, Norway.
Nationality: Norwegian.
Marital status: single
Current address: Christoph-Probst-Strasse 16-727, 80805 München.
Email:[email protected]
University Education
since Oct. 1999: Ph.D. student at the Max Planck Institute for Astrophysics, Garching near Munich.
Jan. 1999-Sep. 1999: Ph.D. student at the Theoretical Astrophysics Center (TAC), Copenhagen.
Jan. 1999-Dec. 2001: Ph.D. scholarship from the Norwegian Research Council. Doctoral thesis supervised by Prof. Kris Górski.
Jan. 1997-Nov. 1998: Diploma thesis in astrophysics, supervised by Prof. Per Lilje.
Aug. 1993-Dec. 1996: Mathematics, physics and astrophysics courses with examinations every semester.
Aug. 1993-Nov. 1998: Astrophysics studies at the University of Oslo.
School Education
Aug. 1990-Jun. 1993: Tønsberg Gymnasium (Norway), natural science track.
Aug. 1981-Jun. 1990: Primary school in Tønsberg (Norway)
