# Linear Regression
- $y = f(\boldsymbol x, \boldsymbol\beta)$
- _Ordinary Least Squares_ (OLS) method
- By setting the partial derivatives of the sum of squared residuals to zero, we obtain the fitted line $\hat y = \hat\beta_0 + \hat\beta_1x$
- For a straight-line fit, the residuals should be randomly scattered
around $0$
- _Coefficient of determination_ $R^2$ measures how much of the variability of
$y$ is accounted for by the model.
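As a minimal sketch of the OLS fit and $R^2$ above, using synthetic data (the coefficients $2.0$ and $3.0$ and the noise level are illustrative assumptions, not from the notes):

```python
import numpy as np

# Hypothetical data: y depends linearly on x plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=x.size)

# OLS via the normal equations: beta_hat = (X'X)^{-1} X'y
X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # [beta0_hat, beta1_hat]

# Coefficient of determination: R^2 = 1 - SS_res / SS_tot
y_hat = X @ beta_hat
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(beta_hat, r2)
```

With low noise the recovered slope and intercept land close to the true values and $R^2$ is near $1$.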
- Assumptions
  - Normality: residuals are normally distributed; check with a Q-Q plot of
    the residuals
  - Homoscedasticity: residuals have constant variance across fitted values
- Independence
- No outliers
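One way to check the normality assumption numerically is `scipy.stats.probplot`, which computes the Q-Q plot data and the correlation of its least-squares line (here on stand-in residuals; a real check would use the residuals from a fitted model):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
resid = rng.normal(0.0, 1.0, 200)  # stand-in for model residuals

# probplot returns the ordered residuals vs. theoretical normal quantiles,
# plus (slope, intercept, r) of the least-squares line through them
(osm, osr), (slope, intercept, r) = stats.probplot(resid, dist="norm")
print(r)  # r close to 1 suggests approximately normal residuals
```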
- Box-Cox transformation: transforms the dependent variable to stabilize its
variance and bring it closer to a normal distribution.
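A sketch of the transformation via `scipy.stats.boxcox`, which picks the power $\lambda$ by maximum likelihood (the lognormal sample here is an illustrative assumption; for lognormal data the optimal $\lambda$ is near $0$, i.e. a log transform):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.lognormal(mean=0.0, sigma=0.7, size=500)  # right-skewed, positive data

# boxcox requires strictly positive input; it returns the transformed
# values and the lambda that maximizes the log-likelihood
y_bc, lam = stats.boxcox(y)
print(lam, stats.skew(y), stats.skew(y_bc))
```

The transformed sample should be far less skewed than the original.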
- _Multicollinearity_
- Measured by _Variance Inflation Factor_ (VIF)
  - Can be mitigated with Ridge and Lasso regression, which penalize large
coefficients and thereby reduce overfitting.
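A sketch of both ideas with synthetic collinear predictors (the data and the penalty $\alpha$ are illustrative assumptions): $\mathrm{VIF}_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the others, and ridge regression $\hat\beta = (X^\top X + \alpha I)^{-1} X^\top y$ shrinks the coefficients that collinearity inflates:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly collinear with x1
x3 = rng.normal(size=n)                  # independent predictor
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the rest."""
    target = X[:, j]
    Z = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(n), Z])  # add intercept
    beta = np.linalg.lstsq(Z, target, rcond=None)[0]
    resid = target - Z @ beta
    r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(3)]
print(vifs)  # x1 and x2 inflated, x3 near 1

# Ridge in closed form: the penalty alpha*I shrinks the coefficient vector
alpha = 10.0
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)
print(np.linalg.norm(beta_ridge), np.linalg.norm(beta_ols))
```

The two collinear predictors show large VIFs while the independent one stays near $1$, and the ridge solution always has a smaller norm than the OLS solution.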