May 7 2024

Reliability
Outline
• Reliability – what it represents conceptually
• Reliability – three main approaches to estimate it
  1. Test-retest (see also the “hand notes” ppt)
  2. Equivalent-forms
  3. Internal consistency
    3.a. Split-half
    3.b. Cronbach's alpha
      alpha function in R
• Two important uses of reliability
  1. Correction for attenuation
  2. Standard error of measurement, and confidence intervals
• How to increase Cronbach's alpha?
• Do we always want a high Cronbach's alpha?
• What alpha can and cannot tell us, and how to approach it
Two important uses of reliability
1. Correction for attenuation.
2. Calculating SEMs and confidence intervals around observed scores.
Correction for attenuation
• Correcting observed correlations for measurement error (the amount of unreliability).
This is one of the most important uses of reliability theory: it allows researchers to estimate the relationship between two constructs as if they were measured perfectly reliably (free from random error).
• Reliability theory was originally developed to estimate the true correlation between latent variables.
  • What is the correlation between neuroticism and depression?
• Reliability theory was developed so that correlations between observed scores (observed correlations) can be corrected for their unreliability.
  • Correlation between observed neuroticism scores and depression scores, corrected for measurement error.
$$ r_{\text{corrected}} = \frac{r_{XY}}{\sqrt{r_{XX} \, r_{YY}}} $$
where $r_{XY}$ is the observed correlation, $r_{XX}$ is the reliability of Scale X, and $r_{YY}$ is the reliability of Scale Y.
Example: The observed correlation between Neuroticism and Depression scores is 0.3. Neuroticism scale scores have a reliability of 0.7. Depression scale scores have a reliability of 0.88. Estimate the true correlation between neuroticism and depression.
• Corrected correlations are always higher in magnitude than the observed ones (whenever reliability is below 1).
• By correcting for unreliability in this way, we are able to estimate the underlying latent relationships without the distraction of measurement error.
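The Neuroticism and Depression example can be worked through with a few lines of Python. This is only a minimal sketch: the function name `correct_for_attenuation` and its argument names are illustrative choices, not something taken from the slides.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Divide the observed correlation by the square root of the
    product of the two scale reliabilities."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed r = 0.3, reliability of Neuroticism = 0.7, of Depression = 0.88
corrected = correct_for_attenuation(0.3, 0.7, 0.88)
print(round(corrected, 2))  # 0.38
```

The corrected estimate (about 0.38) is larger than the observed 0.3, as expected when both reliabilities are below 1.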
Standard error of observed score
(a.k.a. standard error of measurement; SEM)
• The observed scale score is the best estimate of a person's true score. But how confident are we in this estimate?
• SEM estimates how spread out a person's observed scores would be around his/her “true” score if we were to measure the same person repeatedly using the same instrument (and erasing his/her memory between the testing sessions).
• Conceptually, it represents the within-person SD of observed scores across infinitely repeated measurements.
• Practically, $SEM = SD \sqrt{1 - \text{reliability}}$, where SD is the between-subject standard deviation of observed scores.
• SEM is lower when reliability is higher.
Confidence interval (CI)
We can use SEM to draw boundaries (a confidence interval) around our point estimate for a desired level of confidence.
General formula for CIs: Observed Score ± z*SEM
• Observed Score: our point estimate of the “true score”.
• z: its value depends on the confidence level we want. Typically,
  z = 1.96 for 95% confidence
  z = 2.58 for 99% confidence
Calculating CI
Example: A person scores 100 on a test with an SEM of 2. What is the 95% confidence interval for the spread of scores?
Solution:
95% CI = Score ± 1.96*SEM
95% CI = 100 ± 1.96*2 = 100 ± 3.92 → [96.08, 103.92]
Interpretation: If the person took this test an infinite number of times (memory erased), 95% of the time his/her score would fall between 96.08 and 103.92.
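The z-values 1.96 and 2.58 are the two-sided critical values of the standard normal distribution. The sketch below derives them with Python's standard-library `statistics.NormalDist` and reproduces the interval for a score of 100 with an SEM of 2. The helper names `z_for_confidence` and `confidence_interval` are hypothetical, chosen for illustration.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Two-sided standard-normal critical value for a level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def confidence_interval(score: float, sem: float, level: float = 0.95):
    """Return (lower, upper) = score -/+ z * SEM."""
    z = z_for_confidence(level)
    return score - z * sem, score + z * sem


print(round(z_for_confidence(0.95), 2), round(z_for_confidence(0.99), 2))  # 1.96 2.58

low, high = confidence_interval(100, 2, 0.95)
print(round(low, 2), round(high, 2))  # 96.08 103.92
```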
Calculate SEM
$SEM = SD \sqrt{1 - \text{reliability}}$
• An IQ test has an SD of 15 and a reliability of .7. What is the SEM for the test scores?
• $SEM = 15 \times \sqrt{1 - 0.7} = 8.22$
A person scores 110 on this IQ test. What is the 99% confidence interval of this score?
Solution: 110 ± 2.58*8.22 → 99% CI = [88.8, 131.2]
(2.58 is the z-value for 99% confidence; 8.22 is the SEM calculated above.)
What if the reliability were 0.8 instead of 0.7?
• $SEM = 15 \times \sqrt{1 - 0.8} = 6.71$
Solution: 110 ± 2.58*6.71 → 99% CI = [92.7, 127.3]
Now we are more precise. This is a smaller interval.
Now with a reliability of .92.
• $SEM = 15 \times \sqrt{1 - 0.92} = 4.24$
Solution: 110 ± 2.58*4.24 → 99% CI = [99.1, 120.9]
Precision improves with increased reliability.
• When reliability = 0, SEM will equal the SD of the observed IQ scores.
• When reliability = 1.00, SEM becomes zero. No need to draw CIs: measurement is perfect, and we are 100% confident in the point estimate of 110.
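To see how the interval narrows as reliability rises, the same calculation can be repeated over several reliabilities. This is a rough sketch under the slide's assumptions (SD = 15, observed score = 110, z = 2.58); the function names `sem` and `ci` are illustrative.

```python
import math


def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


def ci(score: float, sd: float, reliability: float, z: float):
    """Confidence interval: score -/+ z * SEM."""
    half_width = z * sem(sd, reliability)
    return score - half_width, score + half_width


# Reliabilities from the example, plus the two limiting cases 0 and 1
for rel in (0.0, 0.7, 0.8, 0.92, 1.0):
    low, high = ci(110, 15, rel, 2.58)
    print(f"reliability={rel:.2f}  SEM={sem(15, rel):5.2f}  99% CI=[{low:.1f}, {high:.1f}]")
```

At reliability 0 the SEM equals the SD (15), and at reliability 1 the interval collapses to the observed score itself.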

How to increase Cronbach's alpha?
$$ \alpha = \frac{(\text{number of items})^2 \times \text{average inter-item correlation}}{\text{sum of all correlations in the correlation matrix (including the 1s)}} $$
Inter-item correlation matrix example, where the diagonals are all 1 and the off-diagonals are pairwise correlations (e.g., r1,2 is the correlation between item 1 and item 2; r1,3 is the correlation between item 1 and item 3, and so on):
        Item 1  Item 2  Item 3  Item 4
Item 1  1       r2,1    r3,1    r4,1
Item 2  r1,2    1       r3,2    r4,2
Item 3  r1,3    r2,3    1       r4,3
Item 4  r1,4    r2,4    r3,4    1
Example alpha calculation
        Item 1  Item 2  Item 3  Item 4
Item 1  1       0.30    0.40    0.32
Item 2  0.30    1       0.35    0.42
Item 3  0.40    0.35    1       0.28
Item 4  0.32    0.42    0.28    1
Average inter-item correlation = (0.30 + 0.40 + 0.32 + 0.35 + 0.42 + 0.28) / 6 = 0.345
Sum of all correlations in the matrix (including the 1s) = 4 + 2 × 2.07 = 8.14
$$ \alpha = \frac{4^2 \times 0.345}{8.14} \approx 0.68 $$
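The alpha formula can be applied directly to the 4 × 4 example matrix. The sketch below is an illustration only, and the function name `alpha_from_correlations` is an assumption. For this matrix the off-diagonal correlations average 0.345 and all entries sum to 8.14, which gives an alpha of about 0.68.

```python
def alpha_from_correlations(R):
    """Alpha from a correlation matrix:
    k^2 * mean inter-item correlation / sum of all matrix entries."""
    k = len(R)
    total = sum(sum(row) for row in R)           # includes the 1s on the diagonal
    off_diagonal = total - sum(R[i][i] for i in range(k))
    mean_r = off_diagonal / (k * (k - 1))        # average inter-item correlation
    return k ** 2 * mean_r / total


R = [
    [1.00, 0.30, 0.40, 0.32],
    [0.30, 1.00, 0.35, 0.42],
    [0.40, 0.35, 1.00, 0.28],
    [0.32, 0.42, 0.28, 1.00],
]

alpha = alpha_from_correlations(R)
print(round(alpha, 2))  # 0.68
```

Because it works on correlations rather than covariances, this form is often called standardized alpha.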
Alpha depends on two factors:
• Number of items in the scale
• Average inter-item correlation
Therefore, to increase alpha, we need to:
• Add more items of similar quality (make a longer scale) AND/OR
• Increase average inter-item correlations
  • Delete items with low correlations with the rest of the items
  • Replace low-correlated items with higher-correlated ones
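The two levers can be explored numerically. The sum of a k × k correlation matrix equals k + k(k − 1) × (average correlation), so the alpha formula rearranges to k·r̄ / (1 + (k − 1)·r̄). The sketch below uses that rearranged form; the function name and the grid of values are illustrative assumptions, not taken from the slides.

```python
def alpha_from_k_and_mean_r(k: int, mean_r: float) -> float:
    """Alpha as a function of the number of items (k) and the
    average inter-item correlation (mean_r)."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Longer scales and higher average correlations both push alpha up.
for k in (4, 10, 20):
    row = "  ".join(
        f"r={mean_r:.1f}: {alpha_from_k_and_mean_r(k, mean_r):.2f}"
        for mean_r in (0.1, 0.3, 0.5)
    )
    print(f"k={k:2d}  {row}")
```

Even a modest average correlation of 0.1 yields an alpha near 0.69 once the scale has 20 items, which is one reason a high alpha needs careful interpretation.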
Do we always want to increase alpha?
• First, what does a large alpha mean?
  • Average inter-item correlations are high AND/OR
  • You have a long scale.
• What does a large alpha not mean?
  • That items measure what they are supposed to measure well/completely.
  • That items measure what they are supposed to measure.
  • That items are homogeneous/unidimensional (“hang together well”).
[Figure: Three highly correlated items measuring a small aspect of Extraversion. A very substantial coefficient alpha.]
A large alpha does not necessarily mean that items measure what they are supposed to measure well/completely.
[Figure: Three highly correlated items that do not measure Extraversion. A very substantial coefficient alpha.]
A large alpha does not necessarily mean that items measure what they are supposed to measure.
[Figure: Three uncorrelated items, each measuring a different aspect of Extraversion. Coefficient alpha is zero.]
Likewise, a small alpha does not necessarily mean measurement is poor.

Alpha and scale homogeneity/unidimensionality
Question: Can we say that an alpha of .8 indicates that items measure the same thing (that they are homogeneous)?
Answer: No. Alpha is not a measure of whether a scale is homogeneous or not. Alpha assumes that it is homogeneous.
• Medium correlations among all items will give the same alpha as a mix of low and high correlations.
• We need factor analysis to determine whether a scale is homogeneous or not (before computing alpha). → Next topic!
A large alpha does not necessarily mean that items are homogeneous/unidimensional (“hang together well”). (See next few slides.)
A homogeneous scale
Cronbach's alpha = 0.81
        Item 1  Item 2  Item 3  Item 4  Item 5  Item 6  Item 7  Item 8  Item 9  Item 10
Item 1  1       0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3
Item 2  0.3     1       0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3
Item 3  0.3     0.3     1       0.3     0.3     0.3     0.3     0.3     0.3     0.3
Item 4  0.3     0.3     0.3     1       0.3     0.3     0.3     0.3     0.3     0.3
Item 5  0.3     0.3     0.3     0.3     1       0.3     0.3     0.3     0.3     0.3
Item 6  0.3     0.3     0.3     0.3     0.3     1       0.3     0.3     0.3     0.3
Item 7  0.3     0.3     0.3     0.3     0.3     0.3     1       0.3     0.3     0.3
Item 8  0.3     0.3     0.3     0.3     0.3     0.3     0.3     1       0.3     0.3
Item 9  0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3     1       0.3
Item 10 0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3     0.3     1
A heterogeneous (two-dimensional) scale
Cronbach's alpha = 0.81
        Item 1  Item 2  Item 3  Item 4  Item 5  Item 6  Item 7  Item 8  Item 9  Item 10
Item 1  1       0.6     0.6     0.7     0.7     0       0       0       0       0
Item 2  0.6     1       0.6     0.7     0.7     0       0       0       0       0
Item 3  0.6     0.6     1       0.6     0.7     0       0       0       0       0
Item 4  0.7     0.7     0.6     1       0.6     0       0       0       0       0
Item 5  0.7     0.7     0.7     0.6     1       0       0       0       0       0
Item 6  0       0       0       0       0       1       0.7     0.7     0.7     0.7
Item 7  0       0       0       0       0       0.7     1       0.7     0.7     0.7
Item 8  0       0       0       0       0       0.7     0.7     1       0.7     0.7
Item 9  0       0       0       0       0       0.7     0.7     0.7     1       0.7
Item 10 0       0       0       0       0       0.7     0.7     0.7     0.7     1
A mix of small and large inter-item correlations can give a large alpha estimate.
→ A large alpha does not guarantee that the scale is homogeneous.
→ When estimating alpha, we assume that the scale is homogeneous.
We need to conduct factor analysis to see if the scale is homogeneous/unidimensional, and more broadly, how many factors (constructs) the items in a scale are measuring.
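The point that alpha cannot distinguish the two scales can be checked by computing it for both 10-item matrices. This is a sketch under the assumption that alpha is computed from the correlation matrix as k² × average inter-item correlation / sum of all entries; the names `alpha_from_correlations`, `homogeneous`, and `two_dimensional` are illustrative.

```python
def alpha_from_correlations(R):
    """k^2 * mean inter-item correlation / sum of all matrix entries."""
    k = len(R)
    total = sum(sum(row) for row in R)
    mean_r = (total - sum(R[i][i] for i in range(k))) / (k * (k - 1))
    return k ** 2 * mean_r / total


# Homogeneous scale: every pair of items correlates 0.3
homogeneous = [[1.0 if i == j else 0.3 for j in range(10)] for i in range(10)]

# Two-dimensional scale: two clusters of five items, zero correlation across clusters
first = [
    [1.0, 0.6, 0.6, 0.7, 0.7],
    [0.6, 1.0, 0.6, 0.7, 0.7],
    [0.6, 0.6, 1.0, 0.6, 0.7],
    [0.7, 0.7, 0.6, 1.0, 0.6],
    [0.7, 0.7, 0.7, 0.6, 1.0],
]
second = [[1.0 if i == j else 0.7 for j in range(5)] for i in range(5)]
two_dimensional = [row + [0.0] * 5 for row in first] + [[0.0] * 5 + row for row in second]

print(round(alpha_from_correlations(homogeneous), 2))      # 0.81
print(round(alpha_from_correlations(two_dimensional), 2))  # 0.81
```

Both matrices have the same average inter-item correlation (0.3) and the same number of items, so alpha is identical even though the second scale clearly measures two separate things.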
