How does the correlation coefficient differ from regression slope?

  • I would have expected the correlation coefficient to be the same as a regression slope (beta), however having just compared the two, they are different. How do they differ - what different information do they give?

    If they are normalized, they are the same. But think of what happens when you change units...

    I think the top-scoring answers to this Q are also relevant here, and maybe even my A to it, where I show that the absolute value of the correlation coefficient is the geometric mean of the two slopes we obtain if we regress y on x and x on y, respectively.

  • Macro

    Correct answer

    8 years ago

    Assuming you're talking about a simple regression model $$Y_i = \alpha + \beta X_i + \varepsilon_i$$ estimated by least squares, we know from Wikipedia that $$ \hat {\beta} = {\rm cor}(Y_i, X_i) \cdot \frac{ {\rm SD}(Y_i) }{ {\rm SD}(X_i) } $$ Therefore the two coincide only when ${\rm SD}(Y_i) = {\rm SD}(X_i)$ (or, trivially, when the correlation is zero, in which case both are zero). That is, they coincide only when the two variables are on the same scale, in some sense. The most common way of achieving this is through standardization, as indicated by @gung.
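The identity above can be checked numerically. The following is a minimal sketch on simulated data, assuming NumPy is available; the seed, sample size, coefficients, and variable names are arbitrary choices for illustration and are not part of the original answer.

```python
import numpy as np

# Simulated data; every number here is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=500)
y = 3.0 + 1.5 * x + rng.normal(scale=4.0, size=500)

# Least-squares fit of y on x; polyfit returns [slope, intercept] for deg=1.
slope, intercept = np.polyfit(x, y, deg=1)

# Slope rebuilt from the correlation and the two sample standard deviations.
r = np.corrcoef(x, y)[0, 1]
slope_from_r = r * np.std(y, ddof=1) / np.std(x, ddof=1)

# After standardizing both variables, SD(Y) = SD(X) = 1,
# so the fitted slope should equal the correlation.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
slope_std = np.polyfit(zx, zy, deg=1)[0]

print("fitted slope:          ", slope)
print("r * SD(y) / SD(x):     ", slope_from_r)
print("correlation r:         ", r)
print("slope on standardized: ", slope_std)
```

The first two printed values should agree up to floating-point error, as should the last two.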

    The two, in some sense, give you the same information: each describes the linear relationship between $X_i$ and $Y_i$, and they always share the same sign. But each also gives you distinct information (except, of course, when they are exactly the same):

    • The correlation gives you a bounded measurement that can be interpreted independently of the scale of the two variables. The closer the estimated correlation is to $\pm 1$, the closer the two are to a perfect linear relationship. The regression slope, in isolation, does not tell you that piece of information.

    • The regression slope gives a useful quantity, interpreted as the estimated change in the expected value of $Y_i$ for a given change in $X_i$. Specifically, $\hat \beta$ tells you the change in the expected value of $Y_i$ corresponding to a 1-unit increase in $X_i$. This information cannot be deduced from the correlation coefficient alone.
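The two points above can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are synthetic, the helper `slope_and_corr` is a hypothetical name introduced here, and the metres-to-centimetres conversion is just one example of a change of units.

```python
import numpy as np

def slope_and_corr(x, y):
    """Return (least-squares slope of y on x, Pearson correlation)."""
    slope = np.polyfit(x, y, deg=1)[0]
    r = np.corrcoef(x, y)[0, 1]
    return slope, r

rng = np.random.default_rng(1)  # arbitrary seed
x_m = rng.uniform(0.0, 10.0, size=1000)            # x measured in metres
y = 2.0 * x_m + rng.normal(scale=1.0, size=1000)   # true slope is 2 per metre

# Change of units: the same x expressed in centimetres.
x_cm = 100.0 * x_m
slope_m, r_m = slope_and_corr(x_m, y)
slope_cm, r_cm = slope_and_corr(x_cm, y)

# Same true slope, much noisier relationship.
y_noisy = 2.0 * x_m + rng.normal(scale=20.0, size=1000)
slope_noisy, r_noisy = slope_and_corr(x_m, y_noisy)

print("metres:      slope", slope_m, " r", r_m)
print("centimetres: slope", slope_cm, " r", r_cm)
print("noisy y:     slope", slope_noisy, " r", r_noisy)
```

Rescaling $X_i$ divides the slope by 100 and leaves the correlation unchanged, while adding noise lowers the correlation and leaves the slope estimate near 2. Neither number can be recovered from the other without the two standard deviations.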

    As a corollary of this answer, notice that regressing x against y is not the inverse of regressing y against x!
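A quick numerical check of this corollary, again as a sketch on simulated data with arbitrary, hypothetical names and parameters: the slope from regressing y on x is not the reciprocal of the slope from regressing x on y. Instead, the product of the two slopes equals the squared correlation, which matches the geometric-mean remark in the comments on the question.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
x = rng.normal(size=800)
y = 0.5 * x + rng.normal(size=800)

b_yx = np.polyfit(x, y, deg=1)[0]  # slope from regressing y on x
b_xy = np.polyfit(y, x, deg=1)[0]  # slope from regressing x on y
r = np.corrcoef(x, y)[0, 1]

# The two slopes share the sign of r, so their product is non-negative
# and the signed geometric mean recovers the correlation.
signed_geo_mean = np.sign(b_yx) * np.sqrt(b_yx * b_xy)

print("slope of y on x:        ", b_yx)
print("1 / (slope of x on y):  ", 1.0 / b_xy)
print("product of the slopes:  ", b_yx * b_xy)
print("r squared:              ", r ** 2)
print("signed geometric mean:  ", signed_geo_mean)
```

The two regression lines coincide only when the correlation is exactly $\pm 1$.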

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM