Olympics Project
Figure 1: Scatterplot
Gabriel Falsini AP Statistics 1st 09/2/2022
1. Describe the relationship between the variables in terms of direction, form, and strength.
The scatterplot of winning distance against Olympic year shows a positive, nonlinear,
moderate relationship with one possible outlier at (1968, 8.900).
2. Is a linear model appropriate for the relationship between these two variables? Discuss both the
scatterplot (Fig. 1) and the residual plot (Fig. 2) to answer this question.
Although there is a moderate correlation between the Olympic year and the winning distance,
both the scatterplot and the residual plot show a slightly curved pattern, indicating that a
linear model may not be the best fit for this data set. The r value is also only moderate in
strength (r = 0.564), which suggests that a line captures only part of the relationship.
On the other hand, roughly half of the points in the residual plot lie above the zero line and
half below it, which is consistent with a linear model.
3. Write the equation for the least-squares regression line. Define variables in context!
Ŷ = -8.39 + 0.00843x, where Ŷ is the predicted winning distance (m) and x is the Olympic year.
4. Write a sentence interpreting the slope of the regression line in context.
The slope equals 0.00843, so for each additional year the winning distance is predicted
to increase by 0.00843 m.
5. Write a sentence interpreting the y-intercept of the regression line in context.
The y-intercept equals -8.39, which means that at year 0 the winning distance is predicted to
be -8.39 m. A negative distance is impossible in real life, so the intercept has no statistical relevance.
6. Use the least-squares regression line to predict the winning distance for the Summer 2024
Olympics.
Ŷ = -8.39 + 0.00843(2024) = 8.672 m
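The prediction can be reproduced with a short Python sketch. It assumes only the intercept and slope reported in this write-up; the function name `predict_distance` is a hypothetical helper introduced for illustration.

```python
# Coefficients of the least-squares line as reported in this write-up.
INTERCEPT = -8.39   # metres
SLOPE = 0.00843     # metres per year


def predict_distance(year):
    """Return the predicted winning distance (m) for a given Olympic year."""
    return INTERCEPT + SLOPE * year


prediction_2024 = predict_distance(2024)
print(f"Predicted 2024 winning distance: {prediction_2024:.3f} m")  # 8.672 m
```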
7. Find the residual for the 1996 winning distance. Please show your work and interpret.
Ŷ = -8.39 + 0.00843(1996) = 8.43628 ≈ 8.436 m
Residual = y - Ŷ = 8.500 - 8.43628 = 0.06372 m
The actual winning distance in 1996 (8.500 m) was 0.06372 m greater than the predicted winning
distance, so the regression line underestimates the 1996 result.
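As a check on the arithmetic, the residual (actual minus predicted) can be computed in a few lines of Python. This is a minimal sketch that uses only the regression coefficients and the 1996 winning distance quoted in this write-up; the variable names are illustrative.

```python
# Regression coefficients and the 1996 observation as reported in this write-up.
INTERCEPT = -8.39
SLOPE = 0.00843
actual_1996 = 8.500  # recorded winning distance in metres

# residual = observed value - predicted value
predicted_1996 = INTERCEPT + SLOPE * 1996
residual_1996 = actual_1996 - predicted_1996

print(f"Predicted: {predicted_1996:.3f} m")  # 8.436 m
print(f"Residual:  {residual_1996:.5f} m")   # 0.06372 m
```

A positive residual means the observed distance lies above the regression line.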
8. Find and interpret the r value in the context of the problem.
r = 0.564. This value shows that there is a positive, moderate linear relationship between the
Olympic year and the winning distance.
9. Find and interpret the r^2 value in the context of the problem.
r² = 0.3181, which means that 31.81% of the variation in winning distance can be explained by
the linear regression model on Olympic year.
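The coefficient of determination is the square of the correlation coefficient, which can be confirmed with a one-line calculation. The sketch below assumes the rounded value r = 0.564 reported in this write-up.

```python
r = 0.564            # correlation coefficient as reported in this write-up
r_squared = r ** 2   # coefficient of determination for a simple linear regression

print(f"r^2 = {r_squared:.4f}")                   # r^2 = 0.3181
print(f"Explained variation: {r_squared:.2%}")    # Explained variation: 31.81%
```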
10. Find the means and standard deviations of the explanatory and response variables.
Explanatory Variable:
-Mean Olympic Year X̅: 1984
-Standard Deviation Olympic Year: 23 years
Response Variable:
-Mean Winning Distance Y̅: 8.34 m
-Standard Deviation Winning Distance: 0.34 m
11. Confirm that the slope of the regression line is given by the formula b = r * sy / sx. You must
show your work to receive credit.
b = r(sy / sx) = 0.564(0.34 / 23) ≈ 0.00834
Calculated b value = 0.00834, actual b value = 0.00843. The two values agree up to the rounding
of r, sy, and sx.
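The check b = r * sy / sx can also be run numerically. The sketch below assumes the rounded summary statistics reported in this write-up (r = 0.564, sy = 0.34, sx = 23), so a small gap from the regression slope is expected.

```python
# Rounded summary statistics as reported in this write-up.
r = 0.564          # correlation coefficient
s_x = 23           # standard deviation of Olympic year
s_y = 0.34         # standard deviation of winning distance (m)
b_reported = 0.00843  # slope of the least-squares line

# Slope formula for simple linear regression: b = r * s_y / s_x
b_from_formula = r * s_y / s_x
difference = abs(b_from_formula - b_reported)

print(f"b from formula: {b_from_formula:.5f}")  # 0.00834
print(f"difference:     {difference:.5f}")      # 0.00009
```

The difference is on the order of 0.0001, which is consistent with rounding in the inputs.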
12. Confirm that the LSRL goes through the (x̅, y̅) point. You must show your work to receive credit.
Ŷ = -8.39 + 0.00843(1984) = 8.335. The resulting value is extremely close to the mean
y̅ = 8.34; the small difference is due to rounding, so it can be confirmed that the
LSRL goes through the point (1984, 8.34).
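The same confirmation can be sketched in Python by plugging the mean year into the regression equation and comparing the result with the mean distance. The values are the rounded ones reported in this write-up, so the comparison is made to two decimal places rather than exactly.

```python
# Rounded means and regression coefficients as reported in this write-up.
x_bar = 1984   # mean Olympic year
y_bar = 8.34   # mean winning distance (m)

y_hat_at_mean = -8.39 + 0.00843 * x_bar
gap = abs(y_hat_at_mean - y_bar)

print(f"Predicted at mean year: {y_hat_at_mean:.3f} m")  # 8.335 m
print(f"Gap from mean distance: {gap:.3f} m")            # 0.005 m
print(round(y_hat_at_mean, 2) == y_bar)                  # True
```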