2021 - Week - 3 - Ch.2 Random Process
$F(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n)$
Marginal Probability
$F(x_1, x_2, \ldots, x_k) = F(x_1, \ldots, x_k, \infty, \ldots, \infty) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_k \le x_k)$
Given a group of people, consider two experiments: measuring body temperature (T) and pulse rate (P). Let $\omega$ denote one person, with each measurement classified as high (H) or low (L).
$P(T_H \cap P_H) = 0.4, \quad P(T_L \cap P_H) = 0.2$
$P(T_H \cap P_L) = 0.3, \quad P(T_L \cap P_L) = 0.1$
Then the marginal probabilities are
$P(T_H) = P(T_H \cap P_H) + P(T_H \cap P_L) = 0.4 + 0.3 = 0.7$
and
$P(T_L) = P(T_L \cap P_H) + P(T_L \cap P_L) = 0.2 + 0.1 = 0.3$
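%%% A quick numerical version of this marginalization (a minimal Python sketch; the tuple keys are just labels for the four joint outcomes):

joint = {
    ("TH", "PH"): 0.4, ("TL", "PH"): 0.2,
    ("TH", "PL"): 0.3, ("TL", "PL"): 0.1,
}

# Marginalize over pulse by summing joint probabilities with the same T label.
p_TH = sum(p for (t, _), p in joint.items() if t == "TH")
p_TL = sum(p for (t, _), p in joint.items() if t == "TL")
print(p_TH, p_TL)  # 0.7 and 0.3 (up to float rounding)
%%%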
$$f(x, y) = \begin{cases} 2, & 0 < x,\ 0 < y,\ x + y < 1 \\ 0, & \text{otherwise} \end{cases}$$
1) Is it a PDF (or CDF)? Yes, it is a PDF:
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = \int_0^1 \int_0^{1-x} 2\,dy\,dx = \int_0^1 2(1-x)\,dx = 1$$
[Figure: the triangular support region $0 < x$, $0 < y$, $x + y < 1$]
2) The marginal density:
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{1-x} 2\,dy = 2(1-x), \quad 0 < x < 1$$
and by symmetry $f_Y(y) = 2(1-y)$, $0 < y < 1$.
3) Is $(X, Y)$ independent? No, since
$$2 = f(x, y) \ne f_X(x)\,f_Y(y) = 4(1-x)(1-y)$$
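%%% A numerical check of parts 1) and 2) via a Riemann sum (a rough sketch):

import numpy as np

n = 2000
xs = (np.arange(n) + 0.5) / n            # cell midpoints in (0, 1)
X, Y = np.meshgrid(xs, xs)               # X varies along columns, Y along rows
f = np.where(X + Y < 1.0, 2.0, 0.0)      # the density on the unit square

print(f.sum() / n**2)                    # ~1, up to O(1/n) discretization error
fx = f.sum(axis=0) / n                   # integrate out y for each x column
print(np.abs(fx - 2 * (1 - xs)).max())   # small O(1/n) error vs 2(1 - x)
%%%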
HA_2_1:
$$f(x, y) = \begin{cases} 2x + y, & 0 < x < 2,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$$
[Figure: surface plot of $f(x, y)$ over $0 < x < 2$, $0 < y < 1$]
1) Is this a PDF?
Def 2.16. Two random variables X and Y are called independent if any event of the form $X(\omega) \in A$ is independent of any event of the form $Y(\omega) \in B$, where $A, B$ are sets in $\mathbb{R}^n$.
Fact
$$P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$$
The joint probability distribution
$$F(x, y) = P(X \le x, Y \le y) = P(X \le x)\,P(Y \le y) = F_X(x)\,F_Y(y)$$
The joint probability density function
$$f_{XY}(x, y) = \frac{\partial^2 F}{\partial x\,\partial y}\bigg|_{X=x,\,Y=y} = \frac{\partial F_X}{\partial x}\bigg|_{X=x} \cdot \frac{\partial F_Y}{\partial y}\bigg|_{Y=y} = f_X(x)\,f_Y(y)$$
This Jacobian appears in the change-of-variables formula for a transformation $Y = g(X)$, namely $f_Y(y) = f_X(g^{-1}(y))\,|J(y)|$, where $|J(y)|$ stands for the absolute value of the determinant of the matrix
$$J(y) = \begin{bmatrix} \dfrac{\partial g_1^{-1}}{\partial y_1} & \cdots & \dfrac{\partial g_n^{-1}}{\partial y_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial g_1^{-1}}{\partial y_n} & \cdots & \dfrac{\partial g_n^{-1}}{\partial y_n} \end{bmatrix}_{Y=y}$$
Def.
The mean
$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx \qquad \text{(a)}$$
Suppose $E(X_k) = m\ \forall k$. Then for the sample mean $\hat m = \frac{1}{n}\sum_{k=1}^{n} X_k$,
$$E(\hat m) = E\left(\frac{1}{n}\sum_{k=1}^{n} X_k\right) = \frac{1}{n}\sum_{k=1}^{n} E(X_k) = \frac{1}{n}(nm) = m \qquad \text{(b)}$$
%%% Kim's Comment
What is the difference between (a) and (b)? To use (a), one needs to know the probability density function, whereas (b) does not require it.
%%%
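%%% A small illustration of this difference (the exponential distribution here is an arbitrary stand-in):

import numpy as np

rng = np.random.default_rng(0)

# (b) in action: the sample mean estimates E[X] from data alone, with no
# access to the underlying density (here secretly exponential with mean 2.0).
samples = rng.exponential(scale=2.0, size=100_000)
print(samples.mean())  # close to 2.0 although f(x) was never used
%%%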
$$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Then $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2}$.
Example 2.22. What is the expected value of one roll of one die?
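%%% Worked out, assuming a fair die (each face has probability 1/6):
$$E[X] = \sum_{k=1}^{6} k \cdot \frac{1}{6} = \frac{21}{6} = 3.5$$
%%%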
Properties
4) The variance
$$\sigma_X = \sqrt{\operatorname{var}(X)}, \qquad \operatorname{var}(X) = E[(X - E[X])^2]$$
The sample variance $s^2 = \frac{1}{n-1}\sum_{k=1}^{n}(X_k - \hat m)^2$ is itself a random variable, and it is the unbiased estimator of $\sigma_X^2$.
What is an estimator? Let X be a RV. We want to find a constant C that represents the RV in some sense; we may call C an estimator of the RV X. There can be as many estimators as you like, for example
$$C = \arg\min_a E(X - a)^2 \qquad \text{(c)}$$
Fact: the mean of X is the minimum-variance estimator / the least-squares-error estimator.
Proof:
$$\frac{d}{da}\left(E(X - a)^2\right) = \frac{d}{da}\left(E[X^2] + a^2 - 2aE(X)\right) = 2a - 2E(X) = 0$$
so $a = E(X)$, which minimizes (c).
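%%% A numeric illustration that the sample mean minimizes the empirical squared error (the test distribution is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=1.5, scale=1.0, size=50_000)

# Empirical E[(X - a)^2] over a grid of candidate constants a: the minimizer
# should sit at the sample mean.
a_grid = np.linspace(0.0, 3.0, 301)
mse = [np.mean((x - a) ** 2) for a in a_grid]
print(a_grid[np.argmin(mse)], x.mean())  # both ~1.5
%%%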
For the uniform example, $E(X^2) = \int_0^1 x^2\,dx = \frac{1}{3}$, so the variance is
$$\operatorname{var}(X) = E(X^2) - E(X)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$
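%%% A quick Monte Carlo sanity check of both moments:

import numpy as np

rng = np.random.default_rng(2)
u = rng.uniform(0.0, 1.0, size=1_000_000)
print(u.mean(), u.var())  # ~0.5 and ~0.0833 (= 1/12)
%%%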
2.7 Characteristic Functions -skip
Lemma 2.27
$$E[X^n] = \frac{1}{j^n}\,\frac{d^n \phi_X(\upsilon)}{d\upsilon^n}\bigg|_{\upsilon = 0}$$
Prop.2.28 If X is a Gaussian random vector with mean, m, and covariance matrix P,
then its characteristic function is
$$\phi_X(\upsilon) = \exp\left(j\upsilon^T m - \frac{1}{2}\upsilon^T P \upsilon\right).$$
%%% Kim’s comment: correlation
%%%
Fact: Two jointly Gaussian random vectors X and Y are uncorrelated if $\operatorname{Cov}(X, Y) = 0$; equivalently, the components of a Gaussian random vector are uncorrelated if its covariance matrix is diagonal.
Theorem 2.30. If X is a Gaussian random vector with mean $m_X$ and covariance $P_X$, and if $Y = CX + V$, where V is a Gaussian random vector with zero mean and covariance $P_V$, then Y is a Gaussian random vector with mean $C m_X$ and covariance $C P_X C^T + P_V$.
Theorem 2.30 (restated). A R.V. $X \sim N(m_X, P_X)$, another R.V. $V \sim N(0, P_V)$, and they are independent. Find the mean and covariance of $Y = CX + V$.
%%% Kim's comment: The characteristic function is difficult to remember. The textbook proves this using the characteristic-function method; here we may instead apply basic theory. %%%
Sol: Let's apply the basic definitions.
$$m_Y = E[Y] = E[CX + V] = C\,E[X] + E[V] = C m_X$$
Hence
$$P_Y = E\left[(Y - m_Y)(Y - m_Y)^T\right] = E\left[(C(X - m_X) + V)(C(X - m_X) + V)^T\right]$$
Since X and V are independent, the cross terms vanish, and
$$P_Y = C\,E\left[(X - m_X)(X - m_X)^T\right] C^T + E[V V^T] = C P_X C^T + P_V$$
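%%% A Monte Carlo check of this result (the matrices below are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(3)

m_X = np.array([1.0, -2.0])
P_X = np.array([[2.0, 0.5],
                [0.5, 1.0]])
C = np.array([[1.0, 2.0],
              [0.0, 1.0]])
P_V = np.diag([0.3, 0.2])

n = 200_000
X = rng.multivariate_normal(m_X, P_X, size=n)
V = rng.multivariate_normal(np.zeros(2), P_V, size=n)
Y = X @ C.T + V

print(Y.mean(axis=0), C @ m_X)   # sample mean vs C m_X
print(np.cov(Y.T))               # sample covariance vs
print(C @ P_X @ C.T + P_V)       #   C P_X C^T + P_V
%%%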
- In general, independence implies uncorrelatedness, but not vice versa.
In most cases in this course, we deal with a random vector whose components are random variables, e.g. $X = (x, y, z)^T$ with
$$\operatorname{Cov}(X) = \begin{bmatrix} \operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\ \operatorname{cov}(x,y) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\ \operatorname{cov}(x,z) & \operatorname{cov}(y,z) & \operatorname{cov}(z,z) \end{bmatrix}$$
where $\operatorname{cov}(x, y) = E[(x - E[x])(y - E[y])]$; hence, by definition, $\operatorname{cov}(x, y) = \operatorname{cov}(y, x)$.
Therefore, the matrix $\operatorname{Cov}(X)$ is a symmetric matrix, i.e.,
$$\operatorname{Cov}(X) = [\operatorname{Cov}(X)]^T$$
The diagonal terms of the covariance matrix are the variances of each random variable.
%%%
If the covariance matrix is diagonal,
$$P_X = \begin{bmatrix} \sigma_x^2 & 0 & 0 \\ 0 & \sigma_y^2 & 0 \\ 0 & 0 & \sigma_z^2 \end{bmatrix}$$
the components are uncorrelated. For any Gaussian random vector, we can find a transformation S that diagonalizes the covariance, $\Lambda_X = S P_X S^T$, so that the transformed random vector has uncorrelated (and hence, being Gaussian, independent) components.
%%%
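%%% A small NumPy sketch of this diagonalization (the particular $P_X$ is an arbitrary example):

import numpy as np

# Since P_X is symmetric, an orthogonal eigenvector matrix U diagonalizes it;
# take S = U^T.
P_X = np.array([[2.0, 0.8, 0.0],
                [0.8, 1.0, 0.3],
                [0.0, 0.3, 0.5]])

eigvals, U = np.linalg.eigh(P_X)   # P_X = U @ diag(eigvals) @ U.T
S = U.T                            # transformed vector Z = S X
Lambda_X = S @ P_X @ S.T           # covariance of Z: diagonal, so the
print(np.round(Lambda_X, 12))      #   components of Z are uncorrelated
%%%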
Theorem 2.31. Let $X_1, \ldots, X_n$ be i.i.d. random variables with finite mean and variance, $E[X_k] = m < \infty$, $E[(X_k - m)^2] = \sigma^2 < \infty$, and denote their sum as $Y_n := \sum_{k=1}^{n} X_k$. Then the distribution of the normalized sum
$$Z_n := \frac{Y_n - E[Y_n]}{\sqrt{\operatorname{var}(Y_n)}} = \frac{Y_n - nm}{\sigma\sqrt{n}}$$
converges to the standard Gaussian distribution $N(0, 1)$ as $n \to \infty$.
- Remarks:
1) Note the condition $E[X_k] = m < \infty$, $E[(X_k - m)^2] = \sigma^2 < \infty$: the mean and the variance are constant, while the experiment is repeated many times. For example:
a) A die, whether fair or not: you roll the same die many times. Then the mean of the sum, $\frac{Y_n}{n} = \frac{1}{n}\sum_{k=1}^{n} X_k$, is approximately Gaussian as $n \to \infty$.
2) Some RVs have no mean (e.g., the Cauchy distribution); then the theorem is not applicable.
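%%% A simulation of remark 1a (the bias probabilities are an arbitrary choice):

import numpy as np

rng = np.random.default_rng(4)

# CLT with an unfair (but fixed) die: Z_n is approximately N(0, 1).
faces = np.arange(1, 7)
probs = np.array([0.3, 0.1, 0.1, 0.1, 0.1, 0.3])  # biased, same die each roll
m = faces @ probs
sigma = np.sqrt(((faces - m) ** 2) @ probs)

n, trials = 500, 20_000
rolls = rng.choice(faces, size=(trials, n), p=probs)
Z = (rolls.sum(axis=1) - n * m) / (sigma * np.sqrt(n))
print(Z.mean(), Z.std())  # ~0 and ~1; a histogram of Z looks Gaussian
%%%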
The conditional density
$$f(x \mid y) = \frac{f(x, y)}{f(y)}$$
Remarks
$$E[X] = E\left[E[X \mid Y]\right]$$
%%% Kim's comment
$E[X] = E_X[X]$ requires knowing the density $f_X(x)$, whereas $E[E[X \mid Y]]$ does not.
I should say, this formula cannot be emphasized too much! This very simple fact has diverse applications: big data, machine learning, and dynamic-system analysis. We should remember it.
%%%
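%%% Checking $E[X] = E[E[X \mid Y]]$ on simulated data (the toy model, $Y$ uniform on $\{0,1,2\}$ and $X \mid Y = k \sim N(k, 1)$, is an assumption, so $E[X] = E[Y] = 1$):

import numpy as np

rng = np.random.default_rng(5)

y = rng.integers(0, 3, size=500_000)
x = rng.normal(loc=y, scale=1.0)

cond_means = np.array([x[y == k].mean() for k in range(3)])  # E[X | Y = k]
p_y = np.array([(y == k).mean() for k in range(3)])          # P(Y = k)
print(cond_means @ p_y, x.mean())                            # both ~1.0
%%%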
Lemma 2.34.
Def. 2.36. A stochastic process is a family of random variables, X ( ω , t), indexed by a real
parameter t ∈ T and defined on a common probability space ( Ω, A , P).
A stochastic process (or random process) is a time-varying random variable, i.e., for any fixed $t$, the process is a random variable.
%%%
Ex. 2.37
$$X(\omega, t) = A(\omega)\sin t, \qquad A(\omega) \sim U[-1, 1]$$
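%%% A short simulation of this example (five sample paths):

import numpy as np

rng = np.random.default_rng(6)

# One draw of A fixes a whole sample path A*sin(t); for a fixed t,
# X(., t) = A sin(t) is a (scaled uniform) random variable.
t = np.linspace(0.0, 2 * np.pi, 200)
A = rng.uniform(-1.0, 1.0, size=5)        # five realizations of A(omega)
paths = A[:, None] * np.sin(t)[None, :]   # five sample paths, shape (5, 200)
print(paths[:, 50])                       # five samples of the RV at t_50
%%%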
Def. 2.38.
1) A stochastic process $X(\omega, t)$ is said to be continuous in probability at $t$ if, for every $\varepsilon > 0$, $\lim_{s \to t} P(|X(\omega, s) - X(\omega, t)| > \varepsilon) = 0$.
$$A_1 = \{\omega : X(\omega, t) \in K\ \forall t \in T\}, \qquad A_2 = \{\omega : X(\omega, t) \in K\ \forall t \in S\},$$
Def. 2.42. Let X be a random process defined on the time interval, T. Let
Def. 2.43. We say that a random process X is a Gaussian process if for every finite collection, $X_{t_1}, X_{t_2}, \ldots, X_{t_n}$, the corresponding density function, $f(x_1, x_2, \ldots, x_n)$, is a Gaussian density function.
Def. 2.44. We say that a random process X is a Gaussian process if every finite linear combination of the form
$$Y = \sum_{j=1}^{N} \alpha_j X(t_j)$$
is a Gaussian random variable.
or, equivalently
$$F_{X_{t_n} \mid X_{t_1}, \ldots, X_{t_{n-1}}}(x_n \mid x_1, \ldots, x_{n-1}) = F_{X_{t_n} \mid X_{t_{n-1}}}(x_n \mid x_{n-1}).$$
1) Dynamics
$$x_{k+1} = \Phi_k x_k + w_k \qquad (2.36)$$
a) Noise
$$E[w_k] = \bar w_k, \qquad E\left[(w_k - \bar w_k)(w_l - \bar w_l)^T\right] = W_k \delta_{kl}$$
where
$$\delta_{kl} = \begin{cases} 1, & k = l \\ 0, & k \ne l \end{cases}$$
b) The states
$$E[x_0] = \bar x_0$$
c) The correlation: the state $x_k$ depends only on noise samples before time $k$, which implies
$$E\left[(x_k - \bar x_k)(w_j - \bar w_j)^T\right] = 0 \quad \forall j \ge k$$
The mean propagates as
$$\bar x_{k+1} = \Phi_k \bar x_k + \bar w_k$$
and the covariance as
$$P_{k+1} = \Phi_k P_k \Phi_k^T + W_k$$
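%%% A minimal sketch of these recursions ($\Phi$, $W$, and the initial values are illustrative assumptions, not from the text):

import numpy as np

# Propagate the mean and covariance of (2.36) for a constant-velocity model.
dt = 0.1
Phi = np.array([[1.0, dt],
                [0.0, 1.0]])
W = np.diag([1e-4, 1e-3])
w_bar = np.zeros(2)            # zero-mean process noise

x_bar = np.array([0.0, 1.0])   # initial mean x_0
P = np.eye(2)                  # initial covariance P_0
for _ in range(100):
    x_bar = Phi @ x_bar + w_bar    # mean recursion
    P = Phi @ P @ Phi.T + W        # covariance recursion
print(x_bar, P)
%%%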