Sample Mean R.V. X̄: Bern/Bin

This document defines statistical concepts such as standard deviation, variance, the normal distribution, the binomial distribution, the central limit theorem, correlation, and hypothesis testing. It provides formulas for measures of central tendency (mean, median), dispersion (range, interquartile range, standard deviation), probability (binomial probability), correlation (sample correlation coefficient), and hypothesis testing (z-scores, p-values, critical regions). It also summarizes properties of random variables, such as how the mean and variance of a sum of independent random variables relate to those of the individual random variables.

Uploaded by

Amelie
Copyright: © All Rights Reserved

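Deviation: $\sum (x_i - \bar{x}) = 0$
Sample mean: $\bar{x} = \frac{\sum x_i}{n}$
Percentile: (perc%) × n, then $\frac{n + (n+1)}{2}$ (average that ordered value with the next one). The $i$-th ordered value is the $100 \times \frac{i}{n+1}$-th percentile.
Unimodal: one peak
Bimodal/multimodal: 2+ peaks
Uniform: many values equally frequent, no distinct peak
Right skewed: bulk on the left, long tail on the right
Left skewed: bulk on the right, long tail on the left
Symmetric
Mode: most frequent value
Median: central value
Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$
Standard deviation: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$

Sample mean r.v. $\bar{X}$
$E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

Distribution of $\bar{X}$
Normal data, any n: if $X_i \sim N(\mu, \sigma^2)$, then $X_1 + \dots + X_n \sim N(n\mu, n\sigma^2)$ and $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
CLT: for large n, whatever the distribution of $X_i$, $X_1 + \dots + X_n \overset{a}{\sim} N(n\mu, n\sigma^2)$ and $\bar{X} \overset{a}{\sim} N\left(\mu, \frac{\sigma^2}{n}\right)$.
If $\bar{X} \overset{a}{\sim} N\left(\mu, \frac{\sigma^2}{n}\right)$, then $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \overset{a}{\sim} N(0,1)$.
Conditions: INDEPENDENT observations; SAMPLE SIZE / NON-NORMALITY.

Normal data, σ known:
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
Margin of error $= z_{1-\alpha/2} \, \frac{\sigma}{\sqrt{n}}$

Almost normal data, σ known:
$\bar{X} \overset{a}{\sim} N\left(\mu, \frac{\sigma^2}{n}\right) \Rightarrow Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \overset{a}{\sim} N(0,1)$

Normal data, unknown σ:
$T = \frac{\bar{X} - \mu}{\sqrt{S^2/n}} \sim T_{n-1}$

Confidence interval (Bern/Bin):
$\bar{x} = \hat{p}_n \overset{a}{\sim} N\left(p, \frac{p(1-p)}{n}\right) \Rightarrow Z = \frac{\hat{P}_n - p}{\sqrt{\frac{p(1-p)}{n}}} \overset{a}{\sim} N(0,1)$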
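The margin-of-error formula for a mean with known σ and the normal approximation for a proportion can be turned into intervals with a few lines of Python. This is a minimal sketch, not part of the original sheet: the function names `mean_interval` and `proportion_interval` are hypothetical, the standard-library `statistics.NormalDist` supplies the quantile $z_{1-\alpha/2}$, and the proportion interval assumes the common plug-in of $\hat{p}$ for the unknown $p$ in the standard error.

```python
from math import sqrt
from statistics import NormalDist


def mean_interval(xbar, sigma, n, conf=0.95):
    """Interval xbar +/- z_{1-alpha/2} * sigma / sqrt(n), sigma assumed known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin


def proportion_interval(p_hat, n, conf=0.95):
    """Approximate interval for a proportion.

    Assumption: p_hat replaces the unknown p inside sqrt(p(1-p)/n).
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Illustrative inputs only.
print(mean_interval(142.23, 2.4, 81))
print(proportion_interval(0.5, 100))
```

Both functions share the same structure: a point estimate plus or minus a normal quantile times a standard error, so only the standard error changes between the two cases.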
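IQR (interquartile range) = Q3 (75th percentile) − Q1 (25th percentile)
Whiskers: maximum upper reach = Q3 + 1.5 × IQR (lower: Q1 − 1.5 × IQR)
$(\bar{x} - 2s,\ \bar{x} + 2s)$ contains ≈ 95% of the data
Population cov: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
Sample cov: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$
Sample corr coeff: $r = \frac{s_{xy}}{s_x s_y} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
Pop corr coeff: $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$

Sample Var: $S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$
$\frac{(n-1) S^2}{\sigma^2} = \frac{\sum (x_i - \bar{x})^2}{\sigma^2} = \sum \left(\frac{x_i - \bar{x}}{\sigma}\right)^2 \sim \chi^2_{n-1}$

Independent r.v.:
$\Pr(X_2 = 1 \mid X_1 = 1) = \frac{Np - 1}{N - 1}$
$\Pr(X_2 = 1 \mid X_1 = 0) = \frac{Np}{N - 1}$

1 SAMPLE TESTS:
H1: μ > μ0: p-value $= \Pr(Z > z_{obs} \mid H_0)$, CR $= (z_{1-\alpha}, +\infty)$
H1: μ < μ0: p-value $= \Pr(Z < z_{obs} \mid H_0)$, CR $= (-\infty, -z_{1-\alpha})$
H1: μ ≠ μ0: p-value $= 2 \min\{\Pr(Z < z_{obs} \mid H_0),\ \Pr(Z > z_{obs} \mid H_0)\}$, CR $= (-\infty, -z_{1-\alpha/2}) \cup (z_{1-\alpha/2}, +\infty)$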
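The three one-sample p-value rules can be sketched in Python as follows. This is an illustration under stated assumptions rather than anything from the original sheet: the function name `one_sample_z_test` and its `alternative` argument are hypothetical, σ is taken as known, and `statistics.NormalDist` from the standard library provides the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z_obs, p_value) for H0: mu = mu0 with known sigma."""
    z_obs = (xbar - mu0) / (sigma / sqrt(n))
    lower = NormalDist().cdf(z_obs)  # Pr(Z < z_obs | H0)
    upper = 1 - lower                # Pr(Z > z_obs | H0)
    if alternative == "greater":     # H1: mu > mu0
        p_value = upper
    elif alternative == "less":      # H1: mu < mu0
        p_value = lower
    elif alternative == "two-sided":  # H1: mu != mu0
        p_value = 2 * min(lower, upper)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z_obs, p_value


# Illustrative inputs only.
z, p = one_sample_z_test(142.23, 146, 2.4, 81, alternative="less")
print(z, p)
```

Comparing the returned p-value with α gives the same decision as checking whether `z_obs` falls in the critical region for the chosen alternative.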
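Decision rule (for H1: μ > μ0):
$z_{obs} \in$ CR $(z_{1-\alpha}; +\infty)$ ⟺ p-value < α: reject H0
$z_{obs} \notin$ CR $(z_{1-\alpha}; +\infty)$ ⟺ p-value > α: fail to reject H0

Var: $\sigma^2 = \mathrm{Var}(X) = E\left([X - E(X)]^2\right) = E(X^2) - E(X)^2$
SD: $\sigma = SD(X) = \sqrt{\mathrm{Var}(X)}$
Properties:
$E(aX + b) = aE(X) + b$
$\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$
$E(X \pm Y) = E(X) \pm E(Y)$
$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ (X, Y independent)

Normal r.v.: $X \sim N(\mu, \sigma^2)$
$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
$E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$, $SD(X) = \sigma$
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X - \mu}{\sigma} \sim N(0,1)$

Binomial/Bern [$X_i \sim Bern(p)$, $p \in (0,1)$]
$T = X_1 + \dots + X_n \sim Bin(n, p)$
$\mu = E(X_i) = p$, $\sigma^2 = \mathrm{Var}(X_i) = p(1-p)$
$E(T) = np = n \times E(X_i) = n\mu$
$\mathrm{Var}(T) = np(1-p) = n \times \mathrm{Var}(X_i) = n\sigma^2$
By the CLT, if n is large enough, $T \overset{a}{\sim} N(n\mu, n\sigma^2) = N(np, np(1-p))$
Rule of thumb: $np \ge 10$ and $n(1-p) \ge 10$

Unbiased estimator: $E(f(X)) = \theta$
Standard Error: $SE(\bar{X}) = SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}$
$SE(\hat{P}_n) = SE(\bar{X}) = \sqrt{\frac{p(1-p)}{n}}$, largest at p = 0.5

Known variances, normal data (any size):
$Z = \frac{(\bar{X} - \bar{Y}) - d_0}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} \sim N(0,1)$

Known variances, nearly normal data (large samples):
$Z = \frac{(\bar{X} - \bar{Y}) - d_0}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} \overset{a}{\sim} N(0,1)$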
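To see how well the CLT approximation $T \overset{a}{\sim} N(np, np(1-p))$ tracks the exact binomial distribution, the two cumulative probabilities can be compared directly. The sketch below is illustrative only: the function names `binom_cdf` and `normal_approx_cdf` are hypothetical, and no continuity correction is applied because the sheet does not mention one.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact Pr(T <= k) for T ~ Bin(n, p), summing the binomial pmf."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate Pr(T <= k) using T ~ N(np, np(1-p)), no continuity correction."""
    mu = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mu, sd).cdf(k)


# Illustrative inputs: n*p = 50 and n*(1-p) = 50, so the rule of thumb holds.
n, p, k = 100, 0.5, 55
print(binom_cdf(k, n, p), normal_approx_cdf(k, n, p))
```

Trying a small n or a p close to 0 or 1 in the same comparison shows the gap widening, which is the reason a sample-size rule of thumb accompanies the approximation.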
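Symmetry: $\Pr(Z \le -z) = 1 - \Pr(Z \le z) = \Pr(Z \ge z)$
X, Y independent normal, $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$:
$X \pm Y \sim N(\mu_x \pm \mu_y,\ \sigma_x^2 + \sigma_y^2)$
$\Pr(-2 < Z < 2) = \Pr(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$

Chi-square: $X \sim \chi^2_n$ (n = degrees of freedom)
$X = \sum Z_i^2 \sim \chi^2_n$
$E(X) = n$, $\mathrm{Var}(X) = 2n$

Binomial r.v.: $X \sim Bin(n, p)$
$p_X(x) = \Pr(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
$E(X) = np$, $\mathrm{Var}(X) = np(1-p)$
$n = 1$: $X \sim Bin(1, p) \to X \sim Bern(p)$

Upper bound: $SE(\hat{P}_n) \le \frac{1}{2\sqrt{n}}$

Let $X_1, \dots, X_n$ with each $X_i \sim N(\mu, \sigma^2)$:
$S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$

Confidence interval: $0.95 = \Pr(-1.96 < Z < 1.96)$
Ex: population ~ N(146, 2.4); sample n = 81, $\bar{X} = 142.23$, SD(X) = 2.3
1. Expected value → $\bar{X} \sim (146,\ 2.4/\sqrt{81})$
2. Sample size such that SD < 0.61: $\sigma/\sqrt{n} < 0.61$
How large a sample is needed if the sample mean is to differ from the actual mean by no more than 0.0005? ME < 0.0005

Unknown unequal variances, large sample:
$Z = \frac{(\bar{X} - \bar{Y}) - d_0}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}} \overset{a}{\sim} N(0,1)$

Unknown equal variances, any sample size:
$T = \frac{(\bar{X} - \bar{Y}) - d_0}{\sqrt{s_p^2 \left(\frac{1}{n} + \frac{1}{m}\right)}} \sim T_{n+m-2}$
$S_p^2 = \frac{(n-1) S_x^2 + (m-1) S_y^2}{n + m - 2}$
CR $(-\infty,\ -t_{n+m-2})$

Paired, normal data, any size:
$T = \frac{\bar{D} - d_0}{S_D / \sqrt{n}} \sim T_{n-1}$, where $D = X - Y$
$D_i \sim N(\mu_1 - \mu_2,\ \sigma_D^2)$
$\sigma_D^2 = \sigma_1^2 + \sigma_2^2$ (when X and Y are uncorrelated)

Bern/Bin:
$\hat{p} = \frac{n\hat{p}_x + m\hat{p}_y}{n + m}$
$z_{obs} = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\hat{p}(1 - \hat{p}) \left(\frac{1}{n} + \frac{1}{m}\right)}}$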
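The pooled-variance statistic and the two-proportion statistic are mostly arithmetic, so a short Python sketch may help when checking hand calculations. The function names `pooled_t_statistic` and `two_proportion_z` are hypothetical and the inputs are made up for illustration. The code returns only the test statistic (and the degrees of freedom for the t case); looking up the t quantile or p-value is left to a table or a statistics library.

```python
from math import sqrt


def pooled_t_statistic(xbar, ybar, sx2, sy2, n, m, d0=0.0):
    """T statistic for two samples with unknown but equal variances.

    sx2 and sy2 are the sample variances; returns (t_obs, degrees_of_freedom).
    """
    sp2 = ((n - 1) * sx2 + (m - 1) * sy2) / (n + m - 2)  # pooled variance
    t_obs = ((xbar - ybar) - d0) / sqrt(sp2 * (1 / n + 1 / m))
    return t_obs, n + m - 2


def two_proportion_z(p_hat_x, p_hat_y, n, m):
    """z statistic for comparing two proportions using the pooled estimate."""
    p_hat = (n * p_hat_x + m * p_hat_y) / (n + m)  # pooled proportion
    return (p_hat_x - p_hat_y) / sqrt(p_hat * (1 - p_hat) * (1 / n + 1 / m))


# Illustrative inputs only.
print(pooled_t_statistic(5.0, 3.0, 4.0, 4.0, 10, 10))
print(two_proportion_z(0.6, 0.4, 100, 100))
```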
