Olympics Project

The document summarizes the results of a linear regression analysis examining the relationship between the Olympic year and the winning distance in the men's long jump event. A moderate positive correlation was found, though the scatterplot and residual plot suggest a nonlinear relationship. The least squares regression line predicts that for every year increase, the winning distance increases by 0.00843 meters. However, the y-intercept of -8.39 meters is not meaningful. The regression model explains about 32% of the variation in winning distances.

Uploaded by

api-484189309

Gabriel Falsini AP Statistics 1st 09/2/2022

Bivariate Data Analysis Olympics Project


Men’s Long Jump
Year Winning Distance (m)
1948 7.825
1952 7.570
1956 7.830
1960 8.120
1964 8.070
1968 8.900
1972 8.240
1976 8.350
1980 8.540
1984 8.540
1988 8.720
1992 8.670
1996 8.500
2000 8.550
2004 8.590
2008 8.340
2012 8.310
2016 8.380
2020 8.410

Figure 1: Scatterplot

Figure 2: Residual Plot

1. Describe the relationship between the variables in terms of direction, form, and strength.
The relationship between the Olympic year and the winning distance is positive, slightly
nonlinear, and moderate in strength, with one possible outlier (1968, 8.900).
2. Is a linear model appropriate for the relationship between these two variables? Discuss both the
scatterplot (Fig. 1) and the residual plot (Fig. 2) to answer this question.
Although there is a moderate correlation between the Olympic year and the winning distance,
both the scatterplot and the residual plot show a slightly curved pattern, indicating that a
linear model may not be the best fit for this data set. The r value is also only moderate in
strength (r = 0.564), which suggests that a straight line leaves much of the variation
unexplained. On the other hand, the residuals are split roughly evenly above and below the
zero line, which is consistent with a linear model.
3. Write the equation for the least-squares regression line. Define variables in context!
Ŷ = -8.39 + 0.00843x, where Ŷ: predicted winning distance (m), x: Olympic year
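As a rough check on the equation above, the following Python sketch refits the line from the table of winning distances using the standard least-squares formulas. The function name `fit_lsrl` and the variable names are illustrative assumptions, not part of the original project.

```python
# Men's long jump winning distances (m), 1948-2020, copied from the table above.
years = list(range(1948, 2021, 4))
distances = [7.825, 7.570, 7.830, 8.120, 8.070, 8.900, 8.240, 8.350, 8.540, 8.540,
             8.720, 8.670, 8.500, 8.550, 8.590, 8.340, 8.310, 8.380, 8.410]


def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_lsrl(years, distances)
print(f"y-hat = {intercept:.2f} + {slope:.5f}x")
```

Rounded to the same precision, the fitted coefficients should agree with the intercept of -8.39 and slope of 0.00843 reported in the answer.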
4. Write a sentence interpreting the slope of the regression line in context.
The slope equals 0.00843, so for each additional year the winning distance is predicted
to increase by 0.00843 m, on average (about 0.034 m per four-year Olympic cycle).
5. Write a sentence interpreting the y-intercept of the regression line in context.
The y-intercept equals -8.39, which means that at year 0 the winning distance is predicted to
be -8.39 m. A negative distance is impossible, and year 0 is far outside the range of the data,
so the y-intercept has no meaningful interpretation in context.
6. Use the least-squares regression line to predict the winning distance for the Summer 2024
Olympics.
Ŷ = -8.39 + 0.00843(2024) = 8.672 m
7. Find the residual for the 1996 winning distance. Please show your work and interpret.
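Ŷ = -8.39 + 0.00843(1996) = 8.43628 ≈ 8.436 m
Residual = actual - predicted = 8.500 - 8.43628 = 0.06372 m
The residual is positive, so the model underpredicts: the predicted winning distance, 8.436 m,
is 0.06372 m lower than the actual recorded winning distance of 8.500 m.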
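The prediction in question 6 and the residual in question 7 can be reproduced with a short Python sketch. It assumes the rounded coefficients from the regression equation (-8.39 and 0.00843); the helper names `predict` and `residual` are illustrative, not from the original.

```python
# Rounded LSRL coefficients taken from the answer to question 3.
INTERCEPT = -8.39
SLOPE = 0.00843


def predict(year):
    """Predicted winning distance (m) for a given Olympic year."""
    return INTERCEPT + SLOPE * year


def residual(year, actual):
    """Residual = actual value minus predicted value."""
    return actual - predict(year)


print(f"2024 prediction: {predict(2024):.3f} m")
print(f"1996 residual:   {residual(1996, 8.500):.5f} m")
```

A positive residual means the actual distance was above the line, so the model underpredicted that year.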
8. Find and interpret the r value in the context of the problem.
r = 0.564. This value shows that there is a positive, moderate linear relationship between the
Olympic year and the winning distance.
9. Find and interpret the r^2 value in the context of the problem.
r^2 = 0.3181. This means that 31.81% of the variation in winning distance can be explained by
the linear regression model with Olympic year as the explanatory variable.
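For readers who want to recompute r and r^2 from the table, here is a minimal Python sketch using the usual sums-of-deviations formula. The function name `correlation` is an illustrative assumption.

```python
from math import sqrt

# Men's long jump winning distances (m), 1948-2020, copied from the table above.
years = list(range(1948, 2021, 4))
distances = [7.825, 7.570, 7.830, 8.120, 8.070, 8.900, 8.240, 8.350, 8.540, 8.540,
             8.720, 8.670, 8.500, 8.550, 8.590, 8.340, 8.310, 8.380, 8.410]


def correlation(xs, ys):
    """Pearson correlation coefficient r (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(years, distances)
r_squared = r ** 2  # proportion of variation in distance explained by the line
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Because r is not rounded before squaring here, the last digit of r^2 may differ slightly from the 0.3181 obtained by squaring 0.564.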
10. Find the means and standard deviations of the explanatory and response variables.
Explanatory Variable:
-Mean Olympic Year X̅: 1984
-Standard Deviation Olympic Year: 23 years
Response Variable:
-Mean Winning Distance Y̅: 8.34 m
-Standard Deviation Winning Distance: 0.34 m
11. Confirm that the slope of the regression line is given by the formula b = r * sy / sx. You must
show your work to receive credit.
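b = r(sy / sx) = 0.564(0.34 / 23) ≈ 0.00834
Calculated b value = 0.00834, actual b value = 0.00843. The small difference comes from
rounding sy and sx, so the formula is confirmed.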
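The formula b = r * sy / sx can also be checked against the unrounded statistics. The sketch below assumes sample standard deviations (dividing by n - 1) and computes r as the average product of z-scores; the variable names are illustrative.

```python
from statistics import mean, stdev

# Men's long jump winning distances (m), 1948-2020, copied from the table above.
years = list(range(1948, 2021, 4))
distances = [7.825, 7.570, 7.830, 8.120, 8.070, 8.900, 8.240, 8.350, 8.540, 8.540,
             8.720, 8.670, 8.500, 8.550, 8.590, 8.340, 8.310, 8.380, 8.410]

n = len(years)
x_bar, y_bar = mean(years), mean(distances)
s_x, s_y = stdev(years), stdev(distances)  # sample standard deviations (n - 1)

# Correlation as the sum of z-score products divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(years, distances)) / (n - 1)

slope_from_formula = r * s_y / s_x
print(f"s_x = {s_x:.2f}, s_y = {s_y:.3f}, r = {r:.3f}")
print(f"b = r * s_y / s_x = {slope_from_formula:.5f}")
```

With unrounded inputs the formula should reproduce the regression slope of 0.00843; plugging in the rounded values 0.564, 0.34, and 23 gives roughly 0.00834 instead.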
12. Confirm that the LSRL goes through the (x̄, ȳ) point. You must show your work to receive credit.
Ŷ = -8.39 + 0.00843(1984) = 8.335 m
The resulting value differs from the mean ȳ = 8.34 m only because of rounding in the
coefficients. Since the values are so close, it can be confirmed that the LSRL goes through
the point (x̄, ȳ) = (1984, 8.34).
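The numerical check agrees with a short algebraic argument. As a sketch, assume the intercept is written as a (a symbol not used in the original) and is defined in the usual least-squares way as a = ȳ - b·x̄. Substituting x = x̄ into the line then gives:

```latex
\hat{y}(\bar{x}) = a + b\bar{x} = (\bar{y} - b\bar{x}) + b\bar{x} = \bar{y}
```

So any least-squares line passes through (x̄, ȳ) exactly, and the small gap between 8.335 and 8.34 comes only from rounding.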
