Course: Theory of Probability I
Term: Fall 2013
Instructor: Gordan Zitkovic

Lecture 8
Characteristic Functions
First properties
A characteristic function is simply the Fourier transform, in probabilistic language. Since we will be integrating complex-valued functions, we define (both integrals on the right need to exist)
$$\int f\, d\mu = \int \Re f\, d\mu + i \int \Im f\, d\mu,$$
where $\Re f$ and $\Im f$ denote the real and the imaginary part of a function $f : \mathbb{R} \to \mathbb{C}$. The reader will easily figure out which properties of the integral transfer from the real case.
Definition 8.1. The characteristic function of a probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$ is the function $\varphi = \varphi_\mu : \mathbb{R} \to \mathbb{C}$ given by
$$\varphi(t) = \int_{\mathbb{R}} e^{itx}\, \mu(dx).$$
When we speak of the characteristic function $\varphi_X$ of a random variable $X$, we have the characteristic function $\varphi_{\mu_X}$ of its distribution $\mu_X$ in mind. Note, moreover, that
$$\varphi_X(t) = E[e^{itX}].$$
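As a concrete numerical illustration (a Python sketch, not part of the original notes; the distribution, sample size, and grid of $t$-values are arbitrary choices), one can estimate $\varphi_X(t) = E[e^{itX}]$ by Monte Carlo and compare with the known closed form for $X \sim N(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)          # samples of X ~ N(0, 1)

for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    emp = np.mean(np.exp(1j * t * x))     # Monte Carlo estimate of E[e^{itX}]
    exact = np.exp(-t**2 / 2)             # characteristic function of N(0, 1)
    print(f"t={t:+.1f}  empirical={complex(emp):.4f}  exact={exact:.4f}")
```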
While difficult to visualize, characteristic functions can be used to
learn a lot about the random variables they correspond to. We start
with some properties which follow directly from the definition:
Proposition 8.2. Let $X$, $Y$ and $\{X_n\}_{n\in\mathbb{N}}$ be random variables.

1. $\varphi_X(0) = 1$ and $|\varphi_X(t)| \le 1$, for all $t$.

2. $\varphi_X(-t) = \overline{\varphi_X(t)}$, where the bar denotes complex conjugation.

3. $\varphi_X$ is uniformly continuous.

4. If $X$ and $Y$ are independent, then $\varphi_{X+Y} = \varphi_X \varphi_Y$.
5. For all $t_1 < t_2 < \dots < t_n$, the matrix $A = (a_{jk})_{1\le j,k\le n}$ given by
$$a_{jk} = \varphi_X(t_j - t_k)$$
is Hermitian and positive semi-definite, i.e., $A = A^*$ and $\bar{\xi}^T A \xi \ge 0$, for any $\xi \in \mathbb{C}^n$.
Proof.

1. Immediate.

2. $e^{-itx} = \overline{e^{itx}}$.

3. We have $|\varphi_X(t) - \varphi_X(s)| = \big| \int (e^{itx} - e^{isx})\, \mu_X(dx) \big| \le h(t-s)$, where $h(u) = \int |e^{iux} - 1|\, \mu_X(dx)$. Since $|e^{iux} - 1| \le 2$, the dominated convergence theorem implies that $\lim_{u\to 0} h(u) = 0$, and, so, $\varphi_X$ is uniformly continuous.

4. Independence of $X$ and $Y$ implies the independence of $\exp(itX)$ and $\exp(itY)$. Therefore,
$$\varphi_{X+Y}(t) = E[e^{it(X+Y)}] = E[e^{itX} e^{itY}] = E[e^{itX}]\, E[e^{itY}] = \varphi_X(t)\, \varphi_Y(t).$$

5. The matrix $A$ is Hermitian by 2. above. To see that it is positive semi-definite, note that $a_{jk} = E[e^{it_j X} e^{-it_k X}]$, and so
$$\sum_{j=1}^{n} \sum_{k=1}^{n} \xi_j \bar{\xi}_k a_{jk} = E\Big[ \Big( \sum_{j=1}^{n} \xi_j e^{it_j X} \Big) \overline{\Big( \sum_{k=1}^{n} \xi_k e^{it_k X} \Big)} \Big] = E\Big[ \Big| \sum_{j=1}^{n} \xi_j e^{it_j X} \Big|^2 \Big] \ge 0.$$
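Property 4. is easy to test numerically. The following sketch (again not part of the notes; the choice of $\mathrm{Exp}(1)$ and $U[-1,1]$, and the sample size, are arbitrary) compares the empirical characteristic function of $X+Y$ with the product of the individual ones:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.exponential(scale=1.0, size=n)    # X ~ Exp(1)
y = rng.uniform(-1.0, 1.0, size=n)        # Y ~ U[-1, 1], independent of X

def ecf(samples, t):
    """Empirical characteristic function at t."""
    return np.mean(np.exp(1j * t * samples))

for t in (0.5, 1.0, 2.0):
    lhs = ecf(x + y, t)                   # phi_{X+Y}(t)
    rhs = ecf(x, t) * ecf(y, t)           # phi_X(t) * phi_Y(t)
    print(t, abs(lhs - rhs))              # small, up to Monte Carlo error
```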
Theorem 8.3 (Inversion theorem). Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$, and let $\varphi = \varphi_\mu$ be its characteristic function. Then, for $a < b$,
$$\mu((a,b)) + \tfrac{1}{2}\mu(\{a,b\}) = \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\, \varphi(t)\, dt. \tag{8.1}$$

Proof. Note first that
$$\frac{e^{-ita} - e^{-itb}}{it} = \int_a^b e^{-ity}\, dy,$$
so the integrand in (8.1) is bounded by $b - a$. Set
$$F(a,b,T) = \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\, \varphi(t)\, dt.$$
By Fubini's theorem, applied on $[-T,T] \times \mathbb{R}$ (justified by the boundedness of the integrand and the finiteness of both measures),
$$F(a,b,T) = \int_{\mathbb{R}} f(a,b,T;x)\, \mu(dx), \quad\text{where}\quad f(a,b,T;x) = \int_{-T}^{T} \frac{1}{it}\big( e^{it(x-a)} - e^{it(x-b)} \big)\, dt.$$
Set
$$K(T;c) = \int_0^T \frac{\sin(ct)}{t}\, dt,$$
and note that, since $\cos$ is an even and $\sin$ an odd function, we have
$$f(a,b,T;x) = 2\int_0^T \frac{\sin((x-a)t)}{t}\, dt - 2\int_0^T \frac{\sin((x-b)t)}{t}\, dt = 2K(T;\, x-a) - 2K(T;\, x-b).$$
Since
$$K(T;c) = \int_0^T \frac{\sin(ct)}{ct}\, d(ct) = \int_0^{cT} \frac{\sin(s)}{s}\, ds = \begin{cases} K(cT;1), & c > 0, \\ 0, & c = 0, \\ -K(|c|T;1), & c < 0, \end{cases}$$
and $\int_0^\infty \frac{\sin(s)}{s}\, ds = \frac{\pi}{2}$, we have
$$\lim_{T\to\infty} K(T;c) = \begin{cases} \frac{\pi}{2}, & c > 0, \\ 0, & c = 0, \\ -\frac{\pi}{2}, & c < 0, \end{cases}$$
and so
$$\lim_{T\to\infty} f(a,b,T;x) = \begin{cases} 0, & x \in [a,b]^c, \\ \pi, & x = a \text{ or } x = b, \\ 2\pi, & a < x < b. \end{cases}$$
Since $f$ is bounded, uniformly in $T$ (the sine-integral function $K(\cdot\,;1)$ is bounded), the dominated convergence theorem yields
$$\lim_{T\to\infty} \frac{1}{2\pi} \int_{\mathbb{R}} f(a,b,T;x)\, \mu(dx) = \frac{1}{2\pi} \int_{\mathbb{R}} \lim_{T\to\infty} f(a,b,T;x)\, \mu(dx) = \mu((a,b)) + \tfrac{1}{2}\mu(\{a,b\}).$$
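The limit in (8.1) can be checked numerically. The sketch below (not part of the notes) assumes the $\mathrm{Exp}(1)$ distribution, whose characteristic function $1/(1-it)$ appears in Example 8.6 below; the truncation level $T$ and the integration grid are arbitrary choices, and the kernel is assigned its limiting value $b-a$ near $t=0$:

```python
import numpy as np

def phi(t):
    return 1.0 / (1.0 - 1j * t)           # characteristic function of Exp(1)

a, b, T = 0.5, 2.0, 200.0
t = np.linspace(-T, T, 400_001)
dt = t[1] - t[0]

# kernel (e^{-ita} - e^{-itb}) / (it); its limit at t = 0 equals b - a
with np.errstate(divide="ignore", invalid="ignore"):
    kernel = (np.exp(-1j * t * a) - np.exp(-1j * t * b)) / (1j * t)
kernel[np.abs(t) < dt / 2] = b - a

approx = (kernel * phi(t)).sum().real * dt / (2 * np.pi)
exact = np.exp(-a) - np.exp(-b)           # P(a < X < b); Exp(1) has no atoms
print(approx, exact)                      # agreement improves as T grows
```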
Theorem 8.5. Suppose that the characteristic function $\varphi = \varphi_\mu$ of a probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$ is integrable, i.e., $\int_{\mathbb{R}} |\varphi(t)|\, dt < \infty$. Then $\mu$ is absolutely continuous with respect to the Lebesgue measure, and $\frac{d\mu}{d\lambda} = f$ is a bounded and continuous function given by
$$f(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-itx} \varphi(t)\, dt.$$
Proof. Since $\varphi$ is integrable and $|e^{-itx}| = 1$, $f$ is well defined. For $a < b$ we have
$$\int_a^b f(x)\, dx = \frac{1}{2\pi} \int_a^b \int_{\mathbb{R}} e^{-itx} \varphi(t)\, dt\, dx = \frac{1}{2\pi} \int_{\mathbb{R}} \varphi(t) \int_a^b e^{-itx}\, dx\, dt$$
$$= \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-ita} - e^{-itb}}{it}\, \varphi(t)\, dt = \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\, \varphi(t)\, dt$$
$$= \mu((a,b)) + \tfrac{1}{2}\mu(\{a,b\}), \tag{8.2}$$
by Theorem 8.3, where the use of Fubini's theorem above is justified by the fact that the function $(t,x) \mapsto e^{-itx} \varphi(t)$ is integrable on $[a,b] \times \mathbb{R}$, for all $a < b$. For $a$, $b$ such that $\mu(\{a\}) = \mu(\{b\}) = 0$, equation (8.2) implies that $\mu((a,b)) = \int_a^b f(x)\, dx$. The claim now follows by the $\pi$-$\lambda$ theorem.
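As a sketch of Theorem 8.5 in action (not part of the notes; the grid is an arbitrary choice, justified by the rapid decay of $\varphi$), one can recover the density of $N(0,1)$ from its characteristic function $e^{-t^2/2}$ by discretizing the inversion integral:

```python
import numpy as np

def phi(t):
    return np.exp(-t**2 / 2)              # characteristic function of N(0, 1)

t = np.linspace(-40.0, 40.0, 80_001)      # phi is negligible outside this range
dt = t[1] - t[0]

for x in (0.0, 0.5, 1.0, 2.0):
    f = (np.exp(-1j * t * x) * phi(t)).sum().real * dt / (2 * np.pi)
    exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    print(x, f, exact)                    # the two columns agree closely
```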
Example 8.6. Here is a list of some common distributions and the corresponding characteristic functions:

1. Continuous distributions (density $f_X(x)$, characteristic function $\varphi_X(t)$):

   - Uniform, parameters $a < b$: $f_X(x) = \frac{1}{b-a} \mathbf{1}_{[a,b]}(x)$, $\varphi_X(t) = \frac{e^{itb} - e^{ita}}{it(b-a)}$.
   - Symmetric uniform, parameter $a > 0$: $f_X(x) = \frac{1}{2a} \mathbf{1}_{[-a,a]}(x)$, $\varphi_X(t) = \frac{\sin(at)}{at}$.
   - Normal, parameters $\mu \in \mathbb{R}$, $\sigma > 0$: $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\big( -\frac{(x-\mu)^2}{2\sigma^2} \big)$, $\varphi_X(t) = \exp(i\mu t - \tfrac{1}{2}\sigma^2 t^2)$.
   - Exponential, parameter $\lambda > 0$: $f_X(x) = \lambda \exp(-\lambda x) \mathbf{1}_{[0,\infty)}(x)$, $\varphi_X(t) = \frac{\lambda}{\lambda - it}$.
   - Double exponential, parameter $\lambda > 0$: $f_X(x) = \frac{\lambda}{2} \exp(-\lambda |x|)$, $\varphi_X(t) = \frac{\lambda^2}{\lambda^2 + t^2}$.
   - Cauchy, parameters $\mu \in \mathbb{R}$, $\gamma > 0$: $f_X(x) = \frac{\gamma}{\pi(\gamma^2 + (x-\mu)^2)}$, $\varphi_X(t) = \exp(i\mu t - \gamma|t|)$.

2. Discrete distributions (distribution $\mu_X$, characteristic function $\varphi_X(t)$):

   - Dirac, parameter $c \in \mathbb{R}$: $\mu_X = \delta_c$, $\varphi_X(t) = \exp(itc)$.
   - Biased coin-toss, parameter $p \in (0,1)$: $\mu_X = p\delta_1 + (1-p)\delta_{-1}$, $\varphi_X(t) = p e^{it} + (1-p) e^{-it}$.
   - Geometric, parameter $p \in (0,1)$: $\mu_X = \sum_{n\in\mathbb{N}_0} p^n (1-p)\, \delta_n$, $\varphi_X(t) = \frac{1-p}{1 - p e^{it}}$.
   - Poisson, parameter $\lambda > 0$: $\mu_X = \sum_{n\in\mathbb{N}_0} e^{-\lambda} \frac{\lambda^n}{n!}\, \delta_n$, $\varphi_X(t) = \exp(\lambda(e^{it} - 1))$.

3. A singular distribution:

   - Cantor: $\varphi_X(t) = e^{it/2} \prod_{k=1}^{\infty} \cos\big( \frac{t}{3^k} \big)$.
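Two of the table entries, checked by Monte Carlo (a sketch with arbitrarily chosen parameters, not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

c = 1.0 + 2.0 * rng.standard_cauchy(n)    # Cauchy with mu = 1, gamma = 2
p = rng.poisson(3.0, n)                   # Poisson with lambda = 3

for t in (0.5, 1.5):
    print("Cauchy :", np.mean(np.exp(1j * t * c)),
          np.exp(1j * 1.0 * t - 2.0 * abs(t)))
    print("Poisson:", np.mean(np.exp(1j * t * p)),
          np.exp(3.0 * (np.exp(1j * t) - 1)))
```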
Tail behavior
We continue by describing several methods one can use to extract useful information about the tails of the underlying probability distribution from a characteristic function.
Proposition 8.7. Let $X$ be a random variable. If $E[|X|^n] < \infty$, then $\frac{d^n}{dt^n} \varphi_X(t)$ exists for all $t$ and
$$\frac{d^n}{dt^n} \varphi_X(t) = E[(iX)^n e^{itX}].$$
In particular,
$$E[X^n] = (-i)^n \frac{d^n}{dt^n} \varphi_X(0).$$
Proof. We give the proof in the case $n = 1$ and leave the general case to the reader:
$$\lim_{h\to 0} \frac{\varphi(h) - \varphi(0)}{h} = \lim_{h\to 0} \int_{\mathbb{R}} \frac{e^{ihx} - 1}{h}\, \mu(dx) = \int_{\mathbb{R}} \lim_{h\to 0} \frac{e^{ihx} - 1}{h}\, \mu(dx) = \int_{\mathbb{R}} ix\, \mu(dx),$$
where the passage of the limit under the integral sign is justified by the dominated convergence theorem which, in turn, can be used since
$$\Big| \frac{e^{ihx} - 1}{h} \Big| \le |x|, \quad\text{and}\quad \int_{\mathbb{R}} |x|\, \mu(dx) = E[|X|] < \infty.$$
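A quick numerical check of Proposition 8.7 (a sketch, not part of the notes), assuming the $\mathrm{Exp}(\lambda)$ characteristic function from Example 8.6 and approximating the derivatives at $0$ by finite differences with an arbitrarily chosen step $h$:

```python
import numpy as np

lam = 2.0

def phi(t):
    return lam / (lam - 1j * t)           # characteristic function of Exp(lam)

h = 1e-4
d1 = (phi(h) - phi(-h)) / (2 * h)                 # approximates phi'(0)
d2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2     # approximates phi''(0)

print("E[X]  :", ((-1j) * d1).real, "expected", 1 / lam)            # 0.5
print("E[X^2]:", (((-1j) ** 2) * d2).real, "expected", 2 / lam**2)  # 0.5
```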
Remark 8.8.

1. It can be shown that, for $n$ even, the existence of $\frac{d^n}{dt^n} \varphi_X(0)$ implies $E[|X|^n] < \infty$. For $n$ odd, this implication can fail (Problem 8.6 below provides an example with $n = 1$).
Finer estimates of the tails of a probability distribution can be obtained from a finer analysis of the behavior of $\varphi$ around $0$:

Proposition 8.9. Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$ and let $\varphi = \varphi_\mu$ be its characteristic function. Then, for $\varepsilon > 0$ we have
$$\mu\big( [-\tfrac{2}{\varepsilon}, \tfrac{2}{\varepsilon}]^c \big) \le \frac{1}{\varepsilon} \int_{-\varepsilon}^{\varepsilon} \big( 1 - \varphi(t) \big)\, dt.$$

Proof. By Fubini's theorem (the integrand is bounded and the measures are finite),
$$\frac{1}{\varepsilon} \int_{-\varepsilon}^{\varepsilon} \big( 1 - \varphi(t) \big)\, dt = \frac{1}{\varepsilon}\, E\Big[ \int_{-\varepsilon}^{\varepsilon} \big( 1 - e^{itX} \big)\, dt \Big] = 2\, E\Big[ 1 - \frac{\sin(\varepsilon X)}{\varepsilon X} \Big],$$
with the convention $\frac{\sin(0)}{0} = 1$. Since $\frac{\sin(x)}{x} \le 1$ for all $x$, and $\frac{\sin(x)}{x} \le \frac{1}{|x|} \le \frac{1}{2}$ for $|x| \ge 2$, we have
$$2\, E\Big[ 1 - \frac{\sin(\varepsilon X)}{\varepsilon X} \Big] \ge 2\, E\big[ \tfrac{1}{2} \mathbf{1}_{\{|\varepsilon X| \ge 2\}} \big] = P[|\varepsilon X| \ge 2] \ge \mu\big( [-\tfrac{2}{\varepsilon}, \tfrac{2}{\varepsilon}]^c \big).$$
The estimate of Proposition 8.9 is particularly useful when applied along a sequence $\{\mu_n\}_{n\in\mathbb{N}}$ of probability measures. If the characteristic functions $\varphi_n = \varphi_{\mu_n}$ converge pointwise to a function $\varphi$ which is continuous at $0$, then, by the dominated convergence theorem (note that $|1 - \varphi_n(t)| \le 2$),
$$\lim_n \int_{-\varepsilon}^{\varepsilon} \big( 1 - \varphi_n(t) \big)\, dt = \int_{-\varepsilon}^{\varepsilon} \big( 1 - \varphi(t) \big)\, dt, \quad \text{for each } \varepsilon > 0.$$
Since $\varphi(0) = 1$ and $\varphi$ is continuous at $0$, the right-hand side is $o(\varepsilon)$ as $\varepsilon \to 0$; combined with Proposition 8.9, this shows that the family $\{\mu_n\}_{n\in\mathbb{N}}$ is tight.
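A numerical sketch of the bound in Proposition 8.9 (not part of the notes; the choice $\mu = N(0,1)$ and $\varepsilon = 0.5$ is arbitrary), with the exact tail computed via the error function:

```python
import numpy as np
from math import erf, sqrt

def phi(t):
    return np.exp(-t**2 / 2)              # characteristic function of N(0, 1)

eps = 0.5
t = np.linspace(-eps, eps, 100_001)
dt = t[1] - t[0]
bound = (1 - phi(t)).sum() * dt / eps     # (1/eps) * integral of 1 - phi

tail = 1 - erf((2 / eps) / sqrt(2))       # P(|X| >= 2/eps) for X ~ N(0, 1)
print(tail, "<=", bound)                  # ~6e-05 <= ~0.08
```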
Additional Problems
Problem 8.5 (Atoms from the characteristic function). Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$, and let $\varphi = \varphi_\mu$ be its characteristic function.

1. Show that $\mu(\{a\}) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} e^{-ita} \varphi(t)\, dt$.

2. Show that if $\lim_{t\to\infty} |\varphi(t)| = \lim_{t\to-\infty} |\varphi(t)| = 0$, then $\mu$ has no atoms.
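Part 1. can be illustrated numerically (a sketch, not part of the notes). It uses the biased coin-toss distribution of Example 8.6, whose atoms at $\pm 1$ carry masses $p$ and $1-p$; the truncation $T$ and the grid are arbitrary:

```python
import numpy as np

p = 0.3

def phi(t):   # biased coin-toss: p*delta_1 + (1-p)*delta_{-1}
    return p * np.exp(1j * t) + (1 - p) * np.exp(-1j * t)

T = 2_000.0
t = np.linspace(-T, T, 800_001)
dt = t[1] - t[0]

for a in (1.0, -1.0, 0.5):
    atom = (np.exp(-1j * t * a) * phi(t)).sum() * dt / (2 * T)
    print(a, atom.real)                   # approximately 0.3, 0.7 and 0.0
```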
Problem 8.6 (Moments and derivatives of the characteristic function). Show by example that $\frac{d}{dt}\varphi_X(0)$ may exist even when $E[|X|] = \infty$. Hint: Consider the probability measure
$$\mu = C \sum_{k=3}^{\infty} \frac{1}{k^2 \log(k)} \cdot \tfrac{1}{2}\big( \delta_{-k} + \delta_k \big),$$
where $C$ is the appropriate normalizing constant, and show that
$$\lim_{h\to 0} \frac{1}{h} \sum_{k=3}^{\infty} \frac{\cos(hk) - 1}{k^2 \log(k)} = 0.$$
Then split the sum at $k$ close to $2/h$ and use (and prove) the inequality $|\cos(x) - 1| \le \min(x^2/2, |x|)$. Bounding sums by integrals may help, too.
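The limit in the hint can be observed numerically with a truncated sum (a sketch, not part of the notes; the cut-off and the values of $h$ are arbitrary). The decay is logarithmically slow, consistent with the splitting argument suggested above:

```python
import numpy as np

def s_over_h(h, kmax=10**6):
    """(1/h) * sum_{k=3}^{kmax} (cos(hk) - 1) / (k^2 log k), truncated at kmax."""
    k = np.arange(3, kmax, dtype=np.float64)
    return np.sum((np.cos(h * k) - 1.0) / (k**2 * np.log(k))) / h

for h in (1e-1, 1e-2, 1e-3):
    print(h, s_over_h(h))                 # magnitudes shrink as h -> 0
```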
Problem 8.7 (Characteristic functions of random vectors). The characteristic function of a random vector $X = (X_1, \dots, X_n)$ is the function $\varphi_X : \mathbb{R}^n \to \mathbb{C}$ given by
$$\varphi_X(t_1, \dots, t_n) = E\Big[ \exp\Big( i \sum_{k=1}^{n} t_k X_k \Big) \Big].$$
We will also use the shortcut $t$ for $(t_1, \dots, t_n)$ and $t \cdot X$ for the random variable $\sum_{k=1}^{n} t_k X_k$. Prove the following statements:

1. Random variables $X$ and $Y$ are independent if and only if $\varphi_{(X,Y)}(t_1, t_2) = \varphi_X(t_1)\, \varphi_Y(t_2)$ for all $t_1, t_2 \in \mathbb{R}$.

2. Random vectors $X^1$ and $X^2$ have the same distribution if and only if the random variables $t \cdot X^1$ and $t \cdot X^2$ have the same distribution for all $t \in \mathbb{R}^n$. (This fact is known as Wald's device.)
5. Construct a random vector $(X, Y)$ such that both $X$ and $Y$ are normally distributed, but the vector $(X, Y)$ is not Gaussian.

6. Let $X = (X_1, X_2, \dots, X_n)$ be a random vector consisting of $n$ independent random variables with $X_i \sim N(0,1)$. Let $\Sigma \in \mathbb{R}^{n\times n}$ be a given positive semi-definite symmetric matrix, and $\mu \in \mathbb{R}^n$ a given vector. Show that there exists an affine transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that the random vector $T(X)$ is Gaussian with $T(X) \sim N(\mu, \Sigma)$. (A numerical sketch of one such construction follows this problem.)

7. Find a necessary and sufficient condition on $\mu$ and $\Sigma$ such that the converse of the previous part holds true: for a Gaussian random vector $X \sim N(\mu, \Sigma)$, there exists an affine transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T(X)$ has independent components with the $N(0,1)$-distribution (i.e., $T(X) \sim N(0, I)$, where $I$ is the identity matrix).
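For part 6., one concrete choice is the affine map $T(x) = \mu + Ax$ with $A A^T = \Sigma$. The sketch below (not part of the notes; the particular $\mu$, $\Sigma$ and sample size are arbitrary) builds $A$ from the symmetric square root of $\Sigma$, which exists for every positive semi-definite $\Sigma$, and verifies the mean and covariance of $T(X)$ empirically:

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.5],
                  [0.0, 0.5, 1.0]])      # symmetric positive semi-definite

# A with A @ A.T = sigma, via the symmetric square root; this works even
# when sigma is only positive SEMI-definite (Cholesky may then fail)
w, v = np.linalg.eigh(sigma)
a = v @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ v.T

x = rng.standard_normal((100_000, 3))    # i.i.d. N(0, 1) components
tx = mu + x @ a.T                        # T(X) = mu + A X, row by row

print(np.round(tx.mean(axis=0), 2))      # ~ mu
print(np.round(np.cov(tx.T), 2))         # ~ sigma
```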
Problem 8.8 (Slutsky's Theorem). Let $X$, $Y$, $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ be random variables defined on the same probability space, such that
$$X_n \xrightarrow{D} X \quad\text{and}\quad Y_n \xrightarrow{D} Y. \tag{8.3}$$
Show that $X_n + Y_n \xrightarrow{D} X + Y$, provided that $Y$ is almost surely constant.