Stat A4


STAT1103 Finals — Ygnacio 4745379

SUMMARISING DATA

Categorical, uni/bivariate
  Create frequency table        tab1 var1  OR  tabulate var1 var2
  Create bar graph              graph bar (count), over(var1)  OR  graph bar (percent), over(var1) over(var2)
  Create pie chart              graph pie, over(var1)

Numeric, uni/bivariate
  Descriptive stats summary     summarize var1, detail  OR  tabstat var1, statistics(n mean sd median iqr range)
  Create histogram              histogram var1  OR  histogram var1, by(category) freq   (ex. category = gender)
  Create boxplot, one variable  graph box var1
  Comparative boxplot           graph box var2, over(var1)   [var2 is numerical, var1 categorical]
  Create scatterplot            scatter var1 var2  OR  graph matrix var1 var2 var3
  Scatter with trend line       twoway (scatter y x) (lfit y x)   ex. (lfit score age); lfit draws the trend line
  Jitter                        jitter(number), ex. jitter(7)   (un-overlays points in a scatterplot)

ONE-SAMPLE TESTS

  Z-test                        ztest var1 == popmean, sd(given standard dev)
  z-score                       (observed score - mean) / sd
  T-test                        ttest var1 == popmean
  χ² goodness of fit            csgof var1, expperc(exp1 exp2 exp3)   [gives the chi2 test statistic; expected percentages, space-separated]
                                H0: observed distribution is the same as the expected distribution
                                display chi2tail(df, test statistic)   [displays the p-value; df = categories - 1]
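The z-score formula above can be checked by hand. Below is a minimal Python sketch (standard library only) that standardises one score and converts it to a two-sided p-value under the standard normal. The function names and the numbers (score 130, mean 100, sd 15) are illustrative assumptions, not values from the course.

```python
from statistics import NormalDist


def z_score(observed, mean, sd):
    """Standardise a score: (observed score - mean) / sd."""
    return (observed - mean) / sd


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers, chosen only for illustration
z = z_score(130, 100, 15)
p = two_sided_p(z)
print(round(z, 2), round(p, 4))  # prints: 2.0 0.0455
```

With these made-up numbers p is just under 0.05, so the score would count as significant at the 0.05 level.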

TWO-SAMPLE TESTS

  Independent two-sample t-test     ttest var1, by(category)
  Correlation test                  pwcorr var1 var2, sig
  Chi-squared test of independence  tabulate var1 var2, chi2 row expected
                                    Testing association: do the proportions of this variable differ among the values of the other variable?
                                    p near 0: association, p near 1: no association
                                    chi2tail(df, test statistic)   df = (rows - 1) x (cols - 1)
                                    expected frequency = (row total x column total) / grand total
  McNemar’s test                    mcci a b c d   (the four cell counts of the 2x2 table); tests related groups, categorical
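The expected-frequency and df formulas for the test of independence can be worked through by hand. The Python sketch below computes the expected counts, the chi-squared statistic (sum of (observed - expected)² / expected) and the degrees of freedom for a table of counts. The function names and the 2x2 counts are hypothetical, chosen so the arithmetic is easy to follow.

```python
def expected_counts(table):
    """Expected frequency per cell: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi2_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """df = (rows - 1) x (cols - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2x2 table of observed counts
observed = [[20, 30], [30, 20]]
print(expected_counts(observed))     # [[25.0, 25.0], [25.0, 25.0]]
print(chi2_statistic(observed))      # 4.0
print(degrees_of_freedom(observed))  # 1
```

The statistic and df are the two numbers that go into chi2tail(df, test statistic) in Stata.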

ASSUMPTION TESTS

  Shapiro-Wilk normality test       swilk var1  OR  by categoricalvar, sort: swilk numericvar1
                                    NORMALLY distributed if p > 0.05
  Levene’s test                     robvar var1, by(category)   (Levene’s statistic is W0 in the output)
                                    EQUAL variances if p > 0.05

EFFECT SIZES

Cohen’s d (tests related to means)
  1-sample t      (M - μ) / sd       (sample mean - hypothesised population mean) / sample standard dev
  2-sample t      (x̄1 - x̄2) / sp     (mean 1 - mean 2) / pooled standard dev
  Paired t        Md / sd            sample mean difference / standard dev of differences
Cohen’s W         √(χ²/N)            √(chi-squared test statistic / sample size)
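The effect-size formulas above translate directly into a few lines of arithmetic. The Python sketch below is illustrative only: the function names are made up, and the pooled standard deviation uses the usual formula sqrt(((n1-1)·s1² + (n2-1)·s2²) / (n1+n2-2)), which is an assumption because the sheet does not spell it out.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d_one_sample(sample, mu):
    """(sample mean - hypothesised population mean) / sample standard dev."""
    return (mean(sample) - mu) / stdev(sample)


def pooled_sd(a, b):
    """Pooled standard dev of two independent samples (assumed formula)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return sqrt(pooled_var)


def cohens_d_two_sample(a, b):
    """(mean 1 - mean 2) / pooled standard dev."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def cohens_d_paired(first, second):
    """Mean of the paired differences / standard dev of the differences."""
    diffs = [x - y for x, y in zip(first, second)]
    return mean(diffs) / stdev(diffs)


def cohens_w(chi2, n):
    """sqrt(chi-squared test statistic / sample size)."""
    return sqrt(chi2 / n)


# Hypothetical data, chosen only so the arithmetic is easy to follow
print(cohens_d_one_sample([2, 4, 6], 3))       # mean 4, sd 2
print(cohens_d_two_sample([2, 4, 6], [1, 3, 5]))
print(cohens_d_paired([5, 7, 9], [4, 5, 6]))   # differences 1, 2, 3
print(cohens_w(4.0, 100))
```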

CONFIDENCE INTERVALS

Critical value: the value of z/t that cuts off 5% in total across both tails (2.5% in each tail) of the normal (z) or t distribution

P-value sig: interval EXcludes the hypothesised μ || P-value not sig: interval INcludes the hypothesised μ

General form for confidence interval: sample estimate +/- critical value x standard error

M +/- critical value x (sd/√n)
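The general form (sample estimate +/- critical value x standard error) can be sketched in Python as below. It assumes a z critical value and a standard error of sd/√n; a t-based interval would use a t critical value instead. The function name and the numbers (mean 100, sd 15, n 36) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Return (low, high) for sample_mean +/- z critical value x sd/sqrt(n)."""
    alpha = 1 - confidence
    # Critical value cutting off alpha/2 in each tail (about 1.96 for 95%)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = critical * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical numbers, chosen only for illustration
low, high = z_confidence_interval(100, 15, 36)
print(round(low, 2), round(high, 2))
```

If the hypothesised mean falls outside (low, high), the matching two-sided test is significant at the 0.05 level.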

If p LESS THAN OR EQUAL TO (<=) 0.05 (95% confidence level):

• REJECT null, YES difference, SIGNIFICANT
• swilk: NOT normally distributed
• robvar: NOT EQUAL variances

If p GREATER THAN (>) 0.05 (95% confidence level):

• FAIL to reject null, NO difference, NOT significant
• swilk: NORMALLY distributed
• robvar: EQUAL variances

Steps

1. How many variables?

2. What kind of variables? categorical/numerical?

3. What test appropriate?

4. What hypos in this test?

5. Normally distributed? swilk

6. Equal variance? Levene’s (robvar)

7. Reject/fail H0?

8. Significant diff?

Summary

Looking at differences:

                            T-tests (numerical)       Chi-square tests (categorical)
  Diff b/w known value      One-sample                Goodness of fit
  Independent groups        Independent two-sample    Test of independence
  Related groups*           Paired t-test             McNemar’s
  * ex. siblings, same people measured twice; usually specified

Looking at similarities: correlation
