I produced this plot and regression line in R, and the results looked odd to me. Is the strength of the correlation determined by how steep the regression line is? The line here isn't very steep, so is it fair to assume the relationship is weak? I also have doubts about the regression line itself: a lot of the data lies well below it, so could the line be incorrect?
2 Answers
Yes, something is off here. A regression line always passes through the middle of your data (the average of your x and the average of your y form a point on the line), so your line seems too high compared with the rest of your data.
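To check this property on your own fit, something like the following sketch may help. The data here are simulated purely for illustration (the variable names and numbers are assumptions, not your data); the point is that the fitted value at mean(x) should equal mean(y) for a least-squares line with an intercept.

```r
# Simulated data; the values are illustrative only
set.seed(1)
x <- runif(50, 0, 10)
y <- 2 + 0.5 * x + rnorm(50)

fit <- lm(y ~ x)

# Fitted value of the line at the mean of x
y_at_mean_x <- predict(fit, newdata = data.frame(x = mean(x)))

# For least squares with an intercept, this matches mean(y)
all.equal(unname(y_at_mean_x), mean(y))
```

If the same comparison fails for the line on your plot, the line was probably not fitted to the data being shown.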
The slope of your regression line is proportional to the correlation (m = r*sy/sx, where sy and sx are the standard deviations of y and x, respectively), but you can't tell the correlation just by looking at the line, because the slope also depends on the scales of x and y. Consider the data (1, 0.01), (2, 0.02), (3, 0.03), ... The best-fit line is y = 0.01x, which is nearly horizontal (the slope is 0.01), yet the correlation is perfect (r = 1). (If y were exactly constant, sy would be 0 and the correlation would be undefined.)
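A small sketch of the identity m = r*sy/sx, again on simulated data (all names and values are made up for illustration). It also shows that changing the units of y changes the slope but leaves the correlation alone, which is why steepness by itself says little about strength.

```r
set.seed(2)
x <- rnorm(100)
y <- 3 * x + rnorm(100)

slope <- unname(coef(lm(y ~ x))["x"])
r <- cor(x, y)

# The least-squares slope equals r * sd(y) / sd(x)
all.equal(slope, r * sd(y) / sd(x))

# Rescale y (e.g. a change of units): the slope shrinks, r does not change
y_small <- y / 1000
slope_small <- unname(coef(lm(y_small ~ x))["x"])
all.equal(cor(x, y_small), r)
all.equal(slope_small, slope / 1000)
```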
I would run your regression again; make sure you are including all points. If you have a number of repeated points above the line it's possible that this is correct, but I doubt that's the case.
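One way to check whether all points went into the fit is to compare the number of observations the model used with the number of rows you have. This is only a sketch on invented data, and it assumes missing values are the reason points were dropped, which may not be what happened in your case.

```r
# Invented data with two incomplete pairs
x <- c(1, 2, 3, 4, NA, 6)
y <- c(2.1, 3.9, 6.2, NA, 10.1, 12.0)

fit <- lm(y ~ x)   # by default, rows containing NA are dropped

n_used <- nobs(fit)                       # observations used in the fit
n_total <- length(x)                      # observations supplied
n_complete <- sum(complete.cases(x, y))   # pairs with both x and y present

c(used = n_used, total = n_total, complete = n_complete)
```

If the number used is smaller than the number you plotted, the line and the scatter plot are not describing the same points.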
To confirm, my regression line was incorrect. – jn025, commented Jun 2, 2014 at 7:33
The line is definitely incorrect. To fit an ordinary linear model (regression) in R, let y be the dependent variable and x be the independent variable(s), then use:
fit = lm(y ~ x)
Then, if you want to check the normality of the residuals or see how "good" the model is, you can call
plot(fit)
which will show you diagnostic plots for the model (such as residuals against fitted values and a normal Q-Q plot of the residuals). You can also call
summary(fit)
to print the coefficient estimates with their standard errors and p-values, along with the residual standard error and R-squared.
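To see whether the fitted line actually sits where it should, you can draw it over the scatter plot. A minimal sketch, assuming simulated x and y (replace them with your own variables):

```r
# Simulated data for illustration only
set.seed(3)
x <- runif(40, 0, 10)
y <- 1 + 0.8 * x + rnorm(40)

fit <- lm(y ~ x)

plot(x, y)                                         # scatter plot of the data
abline(fit, col = "red")                           # overlay the fitted line
points(mean(x), mean(y), pch = 19, col = "blue")   # the line passes through this point

coef(fit)                 # intercept and slope
summary(fit)$r.squared    # proportion of variance explained
```

A correctly fitted line should run through the blue point, with the data scattered on both sides of it.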