Statistics 131: Parametric Statistical Inference
2nd Exam Reviewer Summary
Some Notations
Definitions
Consistency
Let $T_1, T_2, \dots, T_n, \dots$ be a sequence of estimators of $\tau(\theta)$, where $T_n = T_n(X)$ is an estimator based on a r.s. of size $n$. The sequence $\{T_n\}$ is:
• simple (weakly) consistent for $\tau(\theta)$ iff $\lim_{n \to \infty} P(|T_n - \tau(\theta)| < \epsilon) = 1$, $\forall \epsilon > 0$, i.e., $T_n \xrightarrow{P} \tau(\theta)$;
• mean-squared-error (MSE) consistent for $\tau(\theta)$ iff $\lim_{n \to \infty} E\{[T_n - \tau(\theta)]^2\} = 0$.
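These definitions can be checked empirically. Below is a minimal simulation sketch (Python with NumPy assumed; the example and all names are illustrative, not from the reviewer) showing that $\bar{X}_n$ is MSE consistent for $\mu$ under $N(\mu, \sigma^2)$: the simulated MSE tracks $\sigma^2/n$ and shrinks to 0.

```python
import numpy as np

# Sketch: empirical check that the sample mean is MSE consistent for mu
# when X ~ N(mu, sigma^2); MSE should track sigma^2 / n and vanish.
rng = np.random.default_rng(131)
mu, sigma, reps = 2.0, 3.0, 5_000

for n in (10, 100, 1_000, 10_000):
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)  # T_n
    mse = np.mean((xbar - mu) ** 2)    # estimate of E[(T_n - tau)^2]
    print(f"n={n:6d}  MSE(xbar) ~ {mse:.5f}  (theory: {sigma**2 / n:.5f})")
```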
Sufficiency
A statistic $S = S(X)$ is sufficient for $\theta$ iff the conditional distribution of $X_1, X_2, \dots, X_n$ given $S(X) = s$ is independent of $\theta$, $\forall s$.
For $X \overset{\text{r.s.}}{\sim} Be(p)$ of size $n$ and $p \in [0, 1]$, $\sum_{i=1}^n X_i$ is sufficient for $p$.
For $X \overset{\text{r.s.}}{\sim} Po(\lambda)$ of size $n$ and $\lambda > 0$, $\sum_{i=1}^n X_i$ is sufficient for $\lambda$.
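Sufficiency can be checked by simulation: given $S_n = s$, every arrangement of the sample with $s$ ones should be equally likely, whatever $p$ is. A minimal sketch (Python with NumPy assumed; illustrative only):

```python
import numpy as np

# Sketch: for a r.s. from Be(p), conditional on sum(X) = 1, each of the
# n arrangements is equally likely regardless of p -- i.e., sufficiency.
rng = np.random.default_rng(0)
n, reps = 3, 200_000

for p in (0.2, 0.7):
    x = (rng.random((reps, n)) < p).astype(int)
    keep = x[x.sum(axis=1) == 1]          # condition on S_n = 1
    freqs = keep.mean(axis=0)             # P(X_i = 1 | S_n = 1) per position
    print(f"p={p}: {np.round(freqs, 3)}  (theory: 1/3 each)")
```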
Joint Sufficiency
The statistics $S_1(X), S_2(X), \dots, S_r(X)$ are jointly sufficient for $\theta$ iff $f_X(x_1, x_2, \dots, x_n \mid S_1(X) = s_1, S_2(X) = s_2, \dots, S_r(X) = s_r)$ is independent of $\theta$, $\forall s_1, s_2, \dots, s_r$.

Factorization criterion: $S(X)$ is sufficient for $\theta$ iff
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = g[S(X); \theta] \cdot h(x)$$
and $S_1(X), S_2(X), \dots, S_r(X)$ are jointly sufficient for $\theta$ iff
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = g[S_1(X), S_2(X), \dots, S_r(X); \theta] \cdot h(x)$$
Fisher-Neyman Factorizations
1. $X \overset{\text{r.s.}}{\sim} Be(p)$, $p \in [0, 1]$
$$f_X(x) = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i} I_{\{0,1\}}(x_i) = \underbrace{p^{\sum x_i}(1-p)^{n-\sum x_i}}_{g\left[\sum x_i;\, p\right]} \cdot \underbrace{\prod_{i=1}^n I_{\{0,1\}}(x_i)}_{h(x)}$$
$S_n = \sum_{i=1}^n X_i$ is sufficient for $p$.
2. $X \overset{\text{r.s.}}{\sim} Po(\lambda)$, $\lambda > 0$
$$f_X(x) = \prod_{i=1}^n p_X(x_i) = \prod_{i=1}^n \frac{e^{-\lambda} \lambda^{x_i}}{x_i!} I_{\{0,1,\dots\}}(x_i) = \underbrace{e^{-n\lambda} \lambda^{\sum x_i}}_{g\left[\sum x_i;\, \lambda\right]} \cdot \underbrace{\prod_{i=1}^n \frac{1}{x_i!} I_{\{0,1,\dots\}}(x_i)}_{h(x)}$$
$S_n = \sum_{i=1}^n X_i$ is sufficient for $\lambda$.
3. $X \overset{\text{r.s.}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$
$$f_X(x) = \prod_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x_i - \mu)^2\right\} I_{(-\infty,\infty)}(x_i) = (2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2\right\} \prod_{i=1}^n I_{(-\infty,\infty)}(x_i)$$

• $\sigma^2$ known, $\mu$ unknown: writing $\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$,
$$f_X(x) = \underbrace{\exp\left\{-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right\}}_{g[\bar{x};\,\mu]} \cdot \underbrace{(2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \bar{x})^2\right\} \prod_{i=1}^n I_{(-\infty,\infty)}(x_i)}_{h(x)}$$
$\bar{X}$ is sufficient for $\mu$.
• $\mu$ known, $\sigma^2$ unknown:
$$f_X(x) = \underbrace{(2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2\right\}}_{g\left[\sum (x_i - \mu)^2;\,\sigma^2\right]} \cdot \underbrace{\prod_{i=1}^n I_{(-\infty,\infty)}(x_i)}_{h(x)}$$
$\sum_{i=1}^n (X_i - \mu)^2$ is sufficient for $\sigma^2$.
• both $\mu$ and $\sigma^2$ unknown:
$$f_X(x) = \underbrace{(2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\left[\sum_{i=1}^n (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2\right]\right\}}_{g\left[\sum (x_i - \bar{x})^2,\,\bar{x};\,\mu,\sigma^2\right]} \cdot \underbrace{\prod_{i=1}^n I_{(-\infty,\infty)}(x_i)}_{h(x)}$$
$\left(\bar{X}, \sum_{i=1}^n (X_i - \bar{X})^2\right)$ are jointly sufficient for $\theta = (\mu, \sigma^2)'$.
Theorem
If $\{S_1, S_2, \dots, S_r\}$ is a set of jointly sufficient statistics, then any set of $r$ 1-1 transformations of $\{S_1, S_2, \dots, S_r\}$ is also jointly sufficient. In the case of a single sufficient statistic, if $S = S(X)$ is sufficient, then any 1-1 transformation of $S$ is also sufficient.
Examples:
1. $X \overset{\text{r.s.}}{\sim} Be(p)$, $p \in [0, 1]$: $S_n = \sum_{i=1}^n X_i$ is sufficient for $p$, so $\bar{X} = \frac{1}{n} S_n$ is also sufficient for $p$.
2. $X \overset{\text{r.s.}}{\sim} Po(\lambda)$, $\lambda > 0$: $\bar{X}$ is also sufficient for $\lambda$.
3. $X \overset{\text{r.s.}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$
• $\sigma^2$ known, $\mu$ unknown: $S_n = \sum_{i=1}^n X_i$ is sufficient for $\mu$.
• both $\sigma^2$ and $\mu$ unknown: $(\bar{X}, S^2)$ are jointly sufficient for $(\mu, \sigma^2)'$.
Completeness
A statistic $T = T(X)$ is complete iff $E[g(T)] = 0$, $\forall \theta \in \Omega_\theta$, implies $P[g(T) = 0] = 1$, $\forall \theta \in \Omega_\theta$.
Examples:
1. Let $X \overset{\text{r.s.}}{\sim} Be(p)$, $p \in [0, 1]$. $T = T(X) = \sum_{i=1}^n X_i$ is complete.
2. Let $X_1, X_2 \overset{\text{r.s.}}{\sim} Be(p)$, $p \in [0, 1]$. $T = T(X) = X_2 - X_1$ is NOT complete.
3. Let $X_1 \overset{\text{r.s.}}{\sim} N(0, \theta)$. $T = T(X) = X_1$ is NOT complete.
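Example 2 fails completeness because $g(t) = t$ already gives $E[g(T)] = p - p = 0$ for every $p$, yet $P[T = 0] < 1$. A small numeric sketch (Python with NumPy assumed; illustrative only):

```python
import numpy as np

# Sketch: T = X2 - X1 with X1, X2 ~ Be(p) has E(T) = 0 for every p,
# but T is not degenerate at 0, so completeness fails with g(t) = t.
rng = np.random.default_rng(1)
reps = 400_000

for p in (0.1, 0.5, 0.9):
    x1, x2 = (rng.random((2, reps)) < p).astype(int)
    t = x2 - x1
    print(f"p={p}: E(T) ~ {t.mean():+.4f},  P(T=0) ~ {(t == 0).mean():.3f}")
```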
1. 1-parameter exponential family of densities
$$f_X(\cdot; \theta) = a(\theta) \cdot b(x) \cdot \exp\{c(\theta) \cdot d(x)\}, \quad \forall \theta \in \Omega_\theta$$
Members include:
a) $Be(p)$ and $Geo(p)$
b) $Po(\lambda)$
c) $Exp(\lambda)$
d) $N(\mu, \sigma^2)$, $\sigma^2$ known
e) $Bi(n, p)$
2. k-parameter exponential family of densities
$$f_X(\cdot; \theta) = a(\theta) \cdot b(x) \cdot \exp\left\{\sum_{j=1}^k c_j(\theta) \cdot d_j(x)\right\}, \quad \forall \theta \in \Omega_\theta$$
Members include:
a) $N(\mu, \sigma^2)$
b) $Beta(a, b)$
c) $Ga(r, \lambda)$
Theorem
Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$, where $f_X(\cdot; \theta)$ belongs to the 1-parameter exponential family of densities. Then, a complete (minimal) sufficient statistic for $\theta$ is given by:
$$S = S(X) = \sum_{i=1}^n d(X_i)$$
Theorem
Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$, where $f_X(\cdot; \theta)$ belongs to the k-parameter exponential family of densities. Then, a joint complete (minimal) sufficient set of statistics for $\theta$ is given by:
$$\left(\sum_{i=1}^n d_1(X_i), \sum_{i=1}^n d_2(X_i), \dots, \sum_{i=1}^n d_k(X_i)\right)$$
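For instance, $N(\mu, \sigma^2)$ has $d_1(x) = x^2$ and $d_2(x) = x$ (see example 7 below), so the joint CSS is $(\sum x_i^2, \sum x_i)$. A minimal sketch of computing it, and a 1-1 transformation of it, from data (Python with NumPy assumed; illustrative only):

```python
import numpy as np

# Sketch: joint CSS for N(mu, sigma^2) per the k-parameter theorem,
# with d1(x) = x^2 and d2(x) = x.
rng = np.random.default_rng(2)
x = rng.normal(5.0, 2.0, size=50)
n = x.size

css = (np.sum(x**2), np.sum(x))           # (sum d1(x_i), sum d2(x_i))
print("joint CSS:", css)

# A 1-1 transformation: (xbar, sum (x_i - xbar)^2) is also jointly CSS.
xbar = css[1] / n
ss = css[0] - n * xbar**2                 # sum x_i^2 - n * xbar^2
print("transformed:", (xbar, ss))
```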
2. $X \overset{\text{r.s.}}{\sim} Geo(p)$, $p \in [0, 1]$
$$p_X(x) = p(1-p)^x I_{\{0,1,\dots\}}(x) = p\, I_{\{0,1,\dots\}}(x) \exp\{\ln[(1-p)^x]\} = p\, I_{\{0,1,\dots\}}(x) \exp\{x \ln(1-p)\}$$
CSS: $\sum_{i=1}^n d(x_i) = \sum_{i=1}^n x_i = S_n$
3. $X \overset{\text{r.s.}}{\sim} Po(\lambda)$, $\lambda > 0$
$$p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!} I_{\{0,1,\dots\}}(x) = e^{-\lambda} \frac{1}{x!} I_{\{0,1,\dots\}}(x) \exp\{\ln(\lambda^x)\} = e^{-\lambda} \frac{1}{x!} I_{\{0,1,\dots\}}(x) \exp\{x \ln(\lambda)\}$$
CSS: $\sum_{i=1}^n d(x_i) = \sum_{i=1}^n x_i = S_n$
4. $X \overset{\text{r.s.}}{\sim} Exp(\lambda)$, $\lambda > 0$
$$f_X(x) = \lambda e^{-\lambda x} I_{(0,\infty)}(x) = \lambda\, I_{(0,\infty)}(x) \exp\{-\lambda x\}$$
CSS: $\sum_{i=1}^n d(x_i) = \sum_{i=1}^n x_i = S_n$
5. $X \overset{\text{r.s.}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$, $\sigma^2$ known
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\} I_{(-\infty,\infty)}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x^2 - 2\mu x + \mu^2)\right\} I_{(-\infty,\infty)}(x)$$
$$= \underbrace{\exp\left\{-\frac{x^2}{2\sigma^2}\right\} \frac{1}{\sigma\sqrt{2\pi}} I_{(-\infty,\infty)}(x)}_{b(x)} \cdot \underbrace{\exp\left\{-\frac{\mu^2}{2\sigma^2}\right\}}_{a(\mu)} \cdot \exp\Big\{\underbrace{\frac{\mu}{\sigma^2}}_{c(\mu)} \cdot \underbrace{x}_{d(x)}\Big\}$$
CSS: $\sum_{i=1}^n d(x_i) = \sum_{i=1}^n x_i = S_n$
6. $X \overset{\text{r.s.}}{\sim} Bi(n, p)$, $n \in \mathbb{Z}^+$, $p \in [0, 1]$
$$p_X(x) = \binom{n}{x} p^x (1-p)^{n-x} I_{\{0,1,\dots,n\}}(x) = (1-p)^n \binom{n}{x} \left(\frac{p}{1-p}\right)^x I_{\{0,1,\dots,n\}}(x)$$
$$= (1-p)^n \binom{n}{x} I_{\{0,1,\dots,n\}}(x) \exp\left\{\ln\left[\left(\frac{p}{1-p}\right)^x\right]\right\} = (1-p)^n \binom{n}{x} I_{\{0,1,\dots,n\}}(x) \exp\left\{x \ln\left(\frac{p}{1-p}\right)\right\}$$
CSS: $\sum_{i=1}^n d(x_i) = \sum_{i=1}^n x_i = S_n$
7. $X \overset{\text{r.s.}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$, both parameters unknown
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2} x^2 + \frac{\mu}{\sigma^2} x - \frac{\mu^2}{2\sigma^2}\right\} I_{(-\infty,\infty)}(x)$$
$$= \underbrace{\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{\mu^2}{2\sigma^2}\right\}}_{a(\mu,\sigma^2)} \cdot \underbrace{I_{(-\infty,\infty)}(x)}_{b(x)} \cdot \exp\Big\{\underbrace{-\frac{1}{2\sigma^2}}_{c_1(\sigma^2)} \underbrace{x^2}_{d_1(x)} + \underbrace{\frac{\mu}{\sigma^2}}_{c_2(\mu,\sigma^2)} \underbrace{x}_{d_2(x)}\Big\}$$
Joint CSS: $\sum_{i=1}^n d_1(x_i) = \sum_{i=1}^n x_i^2$, $\sum_{i=1}^n d_2(x_i) = \sum_{i=1}^n x_i = S_n$
8. $X \overset{\text{r.s.}}{\sim} Beta(a, b)$, $a > 0$, $b > 0$
$$f_X(x) = \frac{1}{B(a,b)} x^{a-1} (1-x)^{b-1} I_{(0,1)}(x) = \frac{1}{B(a,b)} I_{(0,1)}(x) \exp\{\ln[x^{a-1}(1-x)^{b-1}]\}$$
$$= \frac{1}{B(a,b)} I_{(0,1)}(x) \exp\{(a-1)\ln(x) + (b-1)\ln(1-x)\}$$
Joint CSS: $\sum_{i=1}^n d_1(x_i) = \sum_{i=1}^n \ln(x_i)$, $\sum_{i=1}^n d_2(x_i) = \sum_{i=1}^n \ln(1-x_i)$
9. $X \overset{\text{r.s.}}{\sim} Ga(r, \lambda)$, $r > 0$, $\lambda > 0$
$$f_X(x) = \frac{\lambda^r}{\Gamma(r)} x^{r-1} e^{-\lambda x} I_{(0,\infty)}(x) = \frac{\lambda^r}{\Gamma(r)} I_{(0,\infty)}(x) \exp\{(r-1)\ln(x) - \lambda x\}$$
Joint CSS: $\sum_{i=1}^n d_1(x_i) = \sum_{i=1}^n \ln(x_i)$, $\sum_{i=1}^n d_2(x_i) = \sum_{i=1}^n x_i = S_n$
Unbiasedness
Definition: If $T = T(X)$ is an estimator of $\tau(\theta)$, the bias of $T$ with respect to $\tau(\theta)$ is defined as
$$b_{\tau(\theta)}(T) = E(T) - \tau(\theta)$$
$T$ is unbiased for $\tau(\theta)$ iff $E(T) = \tau(\theta)$, i.e., iff $b_{\tau(\theta)}(T) = 0$.
Minimum MSE
The mean squared error of $T$ as an estimator of $\tau(\theta)$ is $MSE_{\tau(\theta)}(T) = E\{[T - \tau(\theta)]^2\} = V(T) + [b_{\tau(\theta)}(T)]^2$.
UMVUE (Uniformly Minimum Variance Unbiased Estimator)
An estimator $T = T(X)$ is the UMVUE of $\tau(\theta)$ iff:
i. $E(T) = \tau(\theta)$; and
ii. $V(T) \le V(T^*)$, for any other unbiased estimator $T^*$ of $\tau(\theta)$.
Some Results:
2. If $T_1 = T_1(X)$ is the UMVUE for $\tau_1(\theta)$ and $T_2 = T_2(X)$ is the UMVUE for $\tau_2(\theta)$, with $T_1$ and $T_2$ independent, then $(a_1 T_1 + a_2 T_2)$ is the UMVUE for $[a_1 \tau_1(\theta) + a_2 \tau_2(\theta)]$.
Cramér-Rao Lower Bound (CRLB)
$$CRLB = \frac{[\tau'(\theta)]^2}{n \cdot I(\theta)}, \quad n = \text{sample size}$$
where the Fisher information per observation is
$$I(\theta) = E\left[\left(\frac{\partial}{\partial \theta} \ln f_X(X_i; \theta)\right)^2\right] = -E\left[\frac{\partial^2}{\partial \theta^2} \ln f_X(X_i; \theta)\right]$$
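As a worked instance: for $Po(\lambda)$ and $\tau(\lambda) = \lambda$, $\frac{\partial}{\partial \lambda} \ln p_X(x; \lambda) = \frac{x}{\lambda} - 1$, so $I(\lambda) = \frac{1}{\lambda}$ and $CRLB = \frac{\lambda}{n}$, which $Var(\bar{X}) = \frac{\lambda}{n}$ attains. A quick simulation sketch (Python with NumPy assumed; illustrative only):

```python
import numpy as np

# Sketch: for Po(lambda), I(lambda) = 1/lambda, so CRLB = lambda/n;
# Var(xbar) attains it, making xbar efficient for lambda.
rng = np.random.default_rng(3)
lam, n, reps = 4.0, 25, 100_000

xbar = rng.poisson(lam, size=(reps, n)).mean(axis=1)
print(f"Var(xbar) ~ {xbar.var():.5f}   CRLB = {lam / n:.5f}")
```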
Rao-Blackwell Theorem
Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$, $\theta \in \Omega_\theta \subset \mathbb{R}$ (i.e., $\theta$ is a scalar). Further, let $S = S(X)$ be a sufficient statistic for $\theta$ and let $T = T(X)$ be an unbiased estimator for $\tau(\theta)$, with finite variance $V(T) < \infty$. Define $T' = E(T \mid S)$. Then:
i. $T'$ is a statistic, i.e., it does not depend on $\theta$;
ii. $E(T') = \tau(\theta)$, i.e., $T'$ is unbiased for $\tau(\theta)$; and
iii. $V(T') \le V(T)$, $\forall \theta \in \Omega_\theta$.
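A sketch of the variance reduction (Python with NumPy assumed; illustrative only): for $Be(p)$, start from the crude unbiased estimator $T = X_1$ and condition on $S = \sum X_i$; here $T' = E(T \mid S) = S/n = \bar{X}$, and the simulation shows $V(T') \le V(T)$ while both stay unbiased.

```python
import numpy as np

# Sketch: Rao-Blackwellizing T = X_1 (unbiased for p) with the
# sufficient statistic S = sum(X_i) gives T' = E(T | S) = S/n = xbar.
rng = np.random.default_rng(4)
p, n, reps = 0.3, 20, 200_000

x = (rng.random((reps, n)) < p).astype(int)
t = x[:, 0]                    # crude unbiased estimator T = X_1
t_prime = x.mean(axis=1)       # T' = E(T | S) = S / n

print(f"E(T)  ~ {t.mean():.4f}   V(T)  ~ {t.var():.5f}")              # ~ p(1-p)
print(f"E(T') ~ {t_prime.mean():.4f}   V(T') ~ {t_prime.var():.5f}")  # ~ p(1-p)/n
```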
Lehmann-Scheffé Theorem
Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$, $\theta \in \Omega_\theta \subset \mathbb{R}$ (i.e., $\theta$ is a scalar). If the statistic $S = S(X)$ is complete (minimal) and sufficient for $\theta$, and there exists an unbiased estimator $T = T(X)$ for $\tau(\theta)$, then there exists a unique UMVUE for $\tau(\theta)$, given by $T^* = E(T \mid S)$.
Efficiency
Definitions:
1. An unbiased estimator $T = T(X)$ of $\tau(\theta)$ is defined to be efficient for $\tau(\theta)$ iff the variance of $T$ attains the CRLB of the variances of unbiased estimators of $\tau(\theta)$.
2. The efficiency of an unbiased estimator $T$ of $\tau(\theta)$ is
$$Eff(T) = \frac{CRLB}{Var(T)} \times 100\%$$
3. If $T_1$ and $T_2$ are both unbiased for $\tau(\theta)$, with $Var(T_1) \le Var(T_2)$, then the relative efficiency (REff) of $T_2$ with respect to $T_1$ is defined as
$$REff = \frac{Var(T_1)}{Var(T_2)} \times 100\%$$
ii. $Eff(T) \to 1$ as $n \to \infty$
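A classic illustration of relative efficiency (Python with NumPy assumed; illustrative only): under normality both $\bar{X}$ and the sample median are unbiased for $\mu$, and the REff of the median with respect to the mean approaches $2/\pi \approx 64\%$.

```python
import numpy as np

# Sketch: REff of the sample median (T2) w.r.t. the sample mean (T1)
# under N(0, 1); theory gives 2/pi ~ 63.7% for large n.
rng = np.random.default_rng(5)
n, reps = 101, 100_000

x = rng.normal(0.0, 1.0, size=(reps, n))
var_mean = x.mean(axis=1).var()           # Var(T1)
var_median = np.median(x, axis=1).var()   # Var(T2)

print(f"REff = Var(T1)/Var(T2) ~ {100 * var_mean / var_median:.1f}%")
```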
(f) Location-scale equivariant iff $T(c_1 x + c_2 \mathbf{1}_n) = c_1 T(x) + c_2$, $\forall x \in \mathcal{S}$, $\forall c_1 \in \mathbb{R}^+$, and $\forall c_2 \in \mathbb{R}$.
Robustness: An estimator that performs well under modifications of the underlying assumptions is
said to be robust.
Ancillarity
Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$. Let the statistic $T = T(X)$ be sufficient for $\theta$, and suppose $\dim(T) > \dim(\theta)$. If $T$
can be written as T = (T 1 , T 2) where T 2 = T 2 ( X ) has a marginal distribution which is independent of
θ , then the statistic T 2 is defined to be an ancillary statistic. Moreover, the statistic T 1 = T 1( X ) is
called a conditionally sufficient statistic.
BLUE (Best Linear Unbiased Estimator): $T^* = T^*(X)$ is the BLUE for $\tau(\theta)$ iff:
i. $T^*$ is a linear function of $X_1, X_2, \dots, X_n$;
ii. $E(T^*) = \tau(\theta)$, i.e., $T^*$ is unbiased for $\tau(\theta)$; and
iii. $V(T^*) \le V(T)$, $\forall \theta \in \Omega_\theta$, for any other linear unbiased estimator $T$ of $\tau(\theta)$.
Remark: The BLUE is the counterpart of the UMVUE when we restrict ourselves to linear estimators.
Result: Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \mu, \sigma^2)$, with $\sigma^2 < \infty$. Then $\bar{X}$ is the BLUE/UMVULE for $\mu$, regardless of the form of $f_X$.
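The result can be seen directly: a linear unbiased estimator of $\mu$ has the form $\sum a_i X_i$ with $\sum a_i = 1$, and its variance $\sigma^2 \sum a_i^2$ is minimized at $a_i = 1/n$, i.e., at $\bar{X}$. A small comparison sketch (Python with NumPy assumed; illustrative only):

```python
import numpy as np

# Sketch: among linear unbiased estimators sum(a_i X_i) with sum(a_i) = 1,
# the variance sigma^2 * sum(a_i^2) is smallest for equal weights (xbar).
rng = np.random.default_rng(6)
n, sigma = 10, 2.0

for _ in range(3):
    a = rng.random(n)
    a /= a.sum()                           # enforce unbiasedness
    print(f"random weights: Var = {sigma**2 * np.sum(a**2):.4f}")
print(f"equal weights:  Var = {sigma**2 / n:.4f}  (the BLUE, xbar)")
```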
Method of Moments
➢ Equate the first $k$ sample raw moments (if $\theta$ has $k$ components) to their corresponding population raw moments, i.e.,
$$M_r' = \mu_r'(\theta), \quad r = 1, 2, \dots, k$$
➢ The solutions, denoted by $\tilde{\theta}_1, \tilde{\theta}_2, \dots, \tilde{\theta}_k$, are the MMEs of the parameters $\theta_1, \theta_2, \dots, \theta_k$, respectively.
Recall: $\mu_r' = E(X^r)$ and $M_r' = \dfrac{\sum_{i=1}^n X_i^r}{n}$.
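A worked sketch (Python with NumPy assumed; illustrative only): for $Ga(r, \lambda)$, $\mu_1' = r/\lambda$ and $\mu_2' = r(r+1)/\lambda^2$, so solving $M_1' = r/\lambda$ and $M_2' = r(r+1)/\lambda^2$ gives $\tilde{\lambda} = M_1'/(M_2' - M_1'^2)$ and $\tilde{r} = M_1'^2/(M_2' - M_1'^2)$.

```python
import numpy as np

# Sketch: method-of-moments estimators for Ga(r, lambda), from equating
# M1', M2' to mu1' = r/lambda and mu2' = r(r+1)/lambda^2.
rng = np.random.default_rng(7)
r_true, lam_true = 3.0, 2.0
x = rng.gamma(shape=r_true, scale=1.0 / lam_true, size=5_000)

m1 = np.mean(x)        # M1': first sample raw moment
m2 = np.mean(x**2)     # M2': second sample raw moment

lam_mme = m1 / (m2 - m1**2)
r_mme = m1**2 / (m2 - m1**2)
print(f"r~ = {r_mme:.3f} (true {r_true}),  lambda~ = {lam_mme:.3f} (true {lam_true})")
```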
Maximum Likelihood Estimation
$$L(\theta \mid X = x) = L(\theta \mid X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) = \prod_{i=1}^n f_X(x_i; \theta)$$
➢ Let $X \overset{\text{r.s.}}{\sim} f_X(\cdot; \theta)$, $\theta \in \Omega_\theta$, and let $L(\theta \mid X)$ be the likelihood function of the random sample. The estimator $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_k)'$ that maximizes $L(\theta \mid X)$ over $\Omega_\theta$ is the maximum likelihood estimator (MLE) of $\theta$.
i. Use $\frac{\partial}{\partial \theta} L(\theta \mid X = x) = 0$.
ii. It is usually easier to maximize $\ln L(\theta \mid X = x)$ instead; since $\ln$ is monotone increasing, both yield the same maximizer. (See the sketch after this list.)
iii. Sometimes differentiation does not work (e.g., the uniform distribution, where the likelihood is maximized at a boundary of the support).
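A worked sketch of steps i-ii (Python with NumPy assumed; illustrative only): for $Exp(\lambda)$, $\ln L = n \ln(\lambda) - \lambda \sum x_i$, and setting $\frac{\partial}{\partial \lambda} \ln L = \frac{n}{\lambda} - \sum x_i = 0$ gives $\hat{\lambda} = 1/\bar{X}$; the code checks this against a grid search over the log-likelihood.

```python
import numpy as np

# Sketch: MLE of lambda for Exp(lambda).
# ln L(lambda) = n*ln(lambda) - lambda*sum(x); the score equation
# gives the closed form lambda_hat = 1 / xbar.
rng = np.random.default_rng(8)
lam_true = 1.5
x = rng.exponential(scale=1.0 / lam_true, size=2_000)

lam_hat = 1.0 / x.mean()                  # closed-form MLE

# brute-force check: maximize the log-likelihood over a grid
grid = np.linspace(0.1, 5.0, 10_000)
loglik = x.size * np.log(grid) - grid * x.sum()
print(f"closed form: {lam_hat:.4f}   grid argmax: {grid[np.argmax(loglik)]:.4f}")
```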
Theorem:
Let $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_k)'$ be the MLE of $\theta$. Suppose $\tau_1(\theta), \tau_2(\theta), \dots, \tau_r(\theta)$, $1 \le r \le k$, are $r$ functions of the $k$ unknown parameters. Then, the MLEs of these $r$ functions are the same functions evaluated at the MLEs of the $k$ unknown parameters, i.e., the MLEs of $\tau_1(\theta), \tau_2(\theta), \dots, \tau_r(\theta)$ are $\tau_1(\hat{\theta}), \tau_2(\hat{\theta}), \dots, \tau_r(\hat{\theta})$.
Remark: The MLE chooses as "best" the modal value of $\theta$, i.e., the value at which the likelihood function attains its mode.
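A sketch of the invariance property (Python with NumPy assumed; illustrative only): for $Po(\lambda)$, the MLE of $\lambda$ is $\bar{X}$, so by the theorem the MLE of $\tau(\lambda) = P(X = 0) = e^{-\lambda}$ is $e^{-\bar{X}}$.

```python
import numpy as np

# Sketch: invariance of the MLE. For Po(lambda), lambda_hat = xbar,
# so the MLE of tau(lambda) = P(X = 0) = exp(-lambda) is exp(-xbar).
rng = np.random.default_rng(9)
lam_true = 2.0
x = rng.poisson(lam_true, size=5_000)

tau_hat = np.exp(-x.mean())               # MLE of e^{-lambda} by invariance
print(f"tau_hat = {tau_hat:.4f}   empirical P(X=0) = {(x == 0).mean():.4f}"
      f"   true = {np.exp(-lam_true):.4f}")
```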
Theorem: If the parent PDF/PMF satisfies certain regularity conditions, the MLE of $\tau(\theta)$, say $\hat{\tau}(\theta) = \tau(\hat{\theta})$, is asymptotically normally distributed.
Some Results:
2. MLEs are asymptotically unbiased and asymptotically efficient for the target parameter, and are
also asymptotically normally distributed.
3. If an MLE is unbiased for a target parameter and its variance attains the corresponding CRLB,
then it is the UMVUE.
4. If an MLE is unbiased for a target parameter and if it is a function of the CSS, then it is the
UMVUE.
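Results 3-4 can be illustrated with $Be(p)$ (Python with NumPy assumed; illustrative only): the MLE $\hat{p} = \bar{X}$ is unbiased, is a function of the CSS $\sum X_i$, and $Var(\bar{X}) = p(1-p)/n$ equals the CRLB (since $I(p) = \frac{1}{p(1-p)}$), so $\bar{X}$ is the UMVUE of $p$.

```python
import numpy as np

# Sketch: for Be(p), the MLE xbar is unbiased, depends on the data only
# through the CSS sum(X_i), and attains the CRLB p(1-p)/n -> UMVUE.
rng = np.random.default_rng(10)
p, n, reps = 0.4, 30, 200_000

xbar = (rng.random((reps, n)) < p).mean(axis=1)
crlb = p * (1 - p) / n          # [tau'(p)]^2 / (n*I(p)) with I(p) = 1/(p(1-p))
print(f"E(xbar) ~ {xbar.mean():.4f}  (p = {p})")
print(f"Var(xbar) ~ {xbar.var():.6f}   CRLB = {crlb:.6f}")
```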