2022 Lecture4 Part1
groups
• Different entities
- Participants who received actual medication vs. those who
received a placebo
• Same or related entities
- Students' knowledge before and after this lecture
How to compare two means?
• Different entities
- Independent t-test (independent-measures t-test;
independent-means t-test)
• Same or related entities
- Paired-samples t-test (dependent t-test)
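The choice between the two tests changes what goes into the calculation. As a minimal sketch (the scores are hypothetical and the function names are illustrative, not part of the lecture), the independent version compares two group means using a pooled variance, while the paired version works on the within-person differences:

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t for two independent groups, assuming equal variances (pooled)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled / n_a + pooled / n_b)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


def paired_t(before, after):
    """t for the same entities measured twice: based on the differences."""
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Hypothetical knowledge scores of five students before and after a lecture
before = [4, 5, 6, 5, 7]
after = [6, 6, 8, 6, 9]

t_paired = paired_t(before, after)
# Treating the same numbers as two unrelated groups ignores the pairing
t_independent = independent_t(after, before)
```

With these made-up scores the paired statistic is considerably larger than the independent one, because differences between students are removed from the error term when each student serves as their own control.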
How to compare two means?
• Comparing the means of two groups amounts to
predicting an outcome from membership of one of
two groups
• We can use the linear model with a dichotomous
predictor (also known as dummy variable)
- Yes or no
- Treatment or no treatment
- Lecture or not
- Cloak or no cloak
- 0 or 1
How to compare two means?
• The t-test tells us whether the difference between
means is significantly different from zero (= something is going on!)
• Best predicted value of the outcome is the group
mean (summary statistic with the least squared error)
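The claim that the mean has the least squared error can be checked numerically. A small sketch, assuming hypothetical carrot counts and a handful of candidate predictions:

```python
import statistics


def sum_squared_error(values, guess):
    """Total squared distance between each score and a single predicted value."""
    return sum((v - guess) ** 2 for v in values)


# Hypothetical number of carrots eaten by five animals in one group
carrots = [2, 4, 4, 5, 10]
group_mean = statistics.mean(carrots)

# Try several candidate predictions, including the group mean itself
candidates = [3, 4, group_mean, 6, 7]
errors = {c: sum_squared_error(carrots, c) for c in candidates}
best_guess = min(errors, key=errors.get)
```

Among the candidates tried here, the group mean produces the smallest sum of squared errors, which is why the linear model predicts the group mean for every member of a group.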
How to compare two means?
• You're asking yourself: do rabbits eat more carrots than
other animals?
Categorical predictors in the linear model
$Y_i = b_0 + b_1 X_{1i} + \varepsilon_i$
$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i + \varepsilon_i$
Categorical predictors in the linear model
• Group variable = 0 (no rabbit)
• b0 = mean of baseline (no rabbit) group = intercept
$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i$
$\bar{X}_{\text{NoRabbit}} = b_0 + b_1 \times 0$
$b_0 = \bar{X}_{\text{NoRabbit}}$
Categorical predictors in the linear model
• Group variable = 1 (Rabbit)
• 𝑏1 = difference between group means
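$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i$
$\bar{X}_{\text{Rabbit}} = b_0 + b_1 \times 1$
$\bar{X}_{\text{Rabbit}} = b_0 + b_1$
$\bar{X}_{\text{Rabbit}} = \bar{X}_{\text{NoRabbit}} + b_1$
$b_1 = \bar{X}_{\text{Rabbit}} - \bar{X}_{\text{NoRabbit}}$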
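The derivation can be verified by fitting the linear model to a 0/1 predictor. The sketch below uses hypothetical carrot counts and the ordinary least-squares formulas for a single predictor; the variable names are illustrative only:

```python
import statistics

# Hypothetical data: dummy variable (0 = no rabbit, 1 = rabbit) and carrots eaten
rabbit = [0, 0, 0, 0, 1, 1, 1, 1]
carrots = [1, 2, 3, 2, 5, 6, 4, 5]

# Ordinary least squares for one predictor:
# b1 = sum of cross-products / sum of squares of X, b0 = mean(Y) - b1 * mean(X)
mean_x = statistics.mean(rabbit)
mean_y = statistics.mean(carrots)
cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(rabbit, carrots))
squares_x = sum((x - mean_x) ** 2 for x in rabbit)
b1 = cross_products / squares_x
b0 = mean_y - b1 * mean_x

# Group means computed directly, for comparison
mean_no_rabbit = statistics.mean([y for x, y in zip(rabbit, carrots) if x == 0])
mean_rabbit = statistics.mean([y for x, y in zip(rabbit, carrots) if x == 1])
```

Here `b0` equals the mean of the no-rabbit group and `b1` equals the difference between the two group means, as the algebra predicts.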
The logic behind the t-test
• If samples come from the same population, we expect large differences
between sample means to occur very infrequently
• Under H0, we expect means from two random samples to be very similar
• We compare the difference between the sample means that we collected to
the difference between the sample means that we would expect to obtain
(in the long run) if there was no effect
• If the difference between the samples we have collected is larger than we
would expect (based on the standard error), then one of two things has
happened
- There is no effect, but sample means from our population fluctuate a lot and we
happen to have collected two samples that produce very different means
- The two samples come from different populations, which is why they have
different means; the difference reflects a genuine difference between the
populations, which counts as evidence against H0
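This reasoning can be illustrated with a simulation. The sketch below assumes a hypothetical normal population (mean 5, standard deviation 2) and repeatedly draws two samples of 12 from it, so H0 is true by construction:

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def mean_difference_under_h0(n=12):
    """Draw two samples from the SAME population and return the difference in means."""
    sample_a = [random.gauss(5, 2) for _ in range(n)]
    sample_b = [random.gauss(5, 2) for _ in range(n)]
    return statistics.mean(sample_a) - statistics.mean(sample_b)


differences = [mean_difference_under_h0() for _ in range(5000)]

# Centre and spread of the simulated differences; the spread approximates
# the standard error of the difference between two sample means
typical_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences)

# Share of simulated differences more than two standard errors away from zero
share_large = sum(abs(d) > 2 * standard_error for d in differences) / len(differences)
```

Most simulated differences cluster around zero and only a small share fall more than two standard errors away, which is what "large differences occur very infrequently" means in practice.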
The logic behind the t-test
• Two samples with two means, which differ by a little or a lot
• Compare the difference between the sample means we obtained to
the difference we would expect if there was no effect ( = other animals
eat as many carrots as rabbits)
• Signal-to-noise ratio: (systematic) variance explained by the
model divided by (unsystematic) variance the model cannot
explain
• How large is the observed difference between the sample
means (relative to the standard error)?
• The larger it is (relative to the standard error), the more likely it
is that the two means differ due to different conditions
The logic behind the t-test
Test statistic = Model (Signal) / Error (Noise)
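For two independent groups, the signal-to-noise ratio is commonly written in the following textbook form. This assumes equal population variances (a pooled variance estimate); the symbols $n_1$, $n_2$, $s_1^2$, $s_2^2$ and $s_p^2$ are introduced here for illustration and do not appear on the slides:

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

The numerator is the observed difference between the sample means (the difference expected under H0 is zero), the denominator is the estimated standard error of that difference, and the statistic is evaluated against a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.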
Independent t-test: Example
• Are invisible people mischievous?
• Experiment
- Participants placed in enclosed community full of hidden
cameras
- 12 participants with invisibility cloak
- 12 participants without invisibility cloak
• How many mischievous acts did participants perform
in a week?
What does a suitable dataset look like?
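For an independent design, a suitable layout typically has one row per participant, one column coding group membership and one column holding the outcome. A minimal sketch with hypothetical values (not the lecture's data) and illustrative column names:

```python
import statistics

# Each row: (participant id, cloak: 0 = no cloak, 1 = cloak, mischievous acts)
# All values are hypothetical and only illustrate the layout
rows = [
    (1, 0, 3),
    (2, 0, 1),
    (3, 0, 5),
    (4, 1, 4),
    (5, 1, 6),
    (6, 1, 5),
]

# Split the outcome by the grouping variable
acts_by_group = {0: [], 1: []}
for _participant, cloak, acts in rows:
    acts_by_group[cloak].append(acts)

group_means = {group: statistics.mean(acts) for group, acts in acts_by_group.items()}
```

Each participant appears exactly once, because in this design nobody contributes a score to both conditions.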
The independent t-test in SPSS