Remarks on \(R^2\)
\(R^2\) depends on the variance of the predictors
Quoting from Shalizi[^1]:
Assume the true model is linear,
$$ Y = aX + \epsilon $$
and that we know \(a\) exactly.
Provided \(\epsilon\) is uncorrelated with \(X\), the variance of \(Y\) is \(a^2\mathbb{V}[X] + \mathbb{V}[\epsilon]\).
So \(R^2 = \frac{a^2\mathbb{V}[X]}{a^2\mathbb{V}[X] + \mathbb{V}[\epsilon]}\).
For fixed \(a \neq 0\) and \(\mathbb{V}[\epsilon] > 0\), this goes to 0 as \(\mathbb{V}[X] \rightarrow 0\) and to 1 as \(\mathbb{V}[X] \rightarrow \infty\). "It thus has little to do with the quality of the fit, and a lot to do with how spread out the predictor variable is. Notice also how easy it is to get a high \(R^2\) even when the true model is not linear!"
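As a rough numerical check of the formula, the sketch below simulates \(Y = aX + \epsilon\) with Gaussian \(X\) and \(\epsilon\) and compares the simulated \(R^2\) with the closed-form value. The function names, the Gaussian distributions, the slope \(a = 2\), the noise variance of 1, and the sample size are all assumptions made for illustration; they are not taken from Shalizi's text.

```python
import numpy as np


def theoretical_r2(a, var_x, var_eps):
    """Closed-form R^2 for Y = a*X + eps with eps uncorrelated with X."""
    return a**2 * var_x / (a**2 * var_x + var_eps)


def simulated_r2(a, var_x, var_eps, n=100_000, seed=0):
    """R^2 from simulated data, using the true slope a (assumed known)."""
    rng = np.random.default_rng(seed)
    # Assumption: X and eps are independent zero-mean Gaussians.
    x = rng.normal(0.0, np.sqrt(var_x), size=n)
    eps = rng.normal(0.0, np.sqrt(var_eps), size=n)
    y = a * x + eps
    # R^2 = 1 - (residual variance) / (total variance of Y)
    return 1.0 - np.var(y - a * x) / np.var(y)


if __name__ == "__main__":
    a, var_eps = 2.0, 1.0  # illustrative values, fixed across runs
    for var_x in (0.01, 1.0, 100.0):
        print(
            f"V[X]={var_x:>6}: "
            f"theory={theoretical_r2(a, var_x, var_eps):.4f}, "
            f"simulated={simulated_r2(a, var_x, var_eps):.4f}"
        )
```

The model and the noise are identical in all three runs; only the spread of the predictor changes. With these values the closed-form \(R^2\) is exactly 0.8 at \(\mathbb{V}[X] = 1\), falls towards 0 for the narrow predictor, and approaches 1 for the wide one.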
Below is a quick comparison between two linear relationships, one with a much higher predictor variance than the other.
A different constant is added to each so that the two are easier to tell apart in the plot.