# Linear Regression

- $y = f(\boldsymbol x, \boldsymbol\beta)$
- _Ordinary Least Squares_ (OLS) method
  - Setting the partial derivatives of the sum of squared errors to zero yields the fitted line $\hat y = \hat\beta_0 + \hat\beta_1x$
  - For a straight-line fit, residuals should be randomly scattered around $0$
  - The _coefficient of determination_ $R^2$ measures how much of the variability of $y$ is accounted for by the model
- Assumptions
  - Normality: residuals are normally distributed; check with a Q-Q plot of the residuals
  - Homoscedasticity: residuals have constant variance
  - Independence: residuals are uncorrelated with one another
  - No outliers
  - Box-Cox transformation: transforms the dependent variable to stabilize its variance and bring it closer to a normal distribution
- _Multicollinearity_
  - Measured by the _Variance Inflation Factor_ (VIF)
  - Can be addressed with Ridge and Lasso regression, which penalize large coefficients and thereby reduce overfitting
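The OLS estimates and $R^2$ above can be computed directly from the closed-form formulas. A minimal sketch using NumPy, on synthetic (hypothetical) data generated as $y = 2 + 3x + \varepsilon$:

```python
import numpy as np

# Synthetic data (hypothetical): y = 2 + 3x + Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, 100)

# OLS estimates obtained by setting the partial derivatives
# of the sum of squared errors to zero
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

y_hat = beta0 + beta1 * x
residuals = y - y_hat  # should be randomly scattered around 0

# Coefficient of determination: share of the variability of y
# accounted for by the model
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Note that with an intercept in the model, the residuals sum to exactly zero by construction, so only their *pattern* (not their mean) is informative in diagnostic plots.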
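The VIF and the Ridge penalty mentioned above can both be sketched in a few lines. This is a NumPy-only illustration on synthetic data (all variable names and the data-generating setup are assumptions for the example): `x2` is built to be nearly collinear with `x1`, so its VIF should be large, while the Ridge solution shrinks coefficients as the penalty $\lambda$ grows.

```python
import numpy as np

# Synthetic predictors (hypothetical): x2 is nearly collinear with x1
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # strong collinearity
x3 = rng.normal(size=n)                  # independent predictor
X = np.column_stack([x1, x2, x3])

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (with an intercept)."""
    n, p = X.shape
    out = []
    for j in range(p):
        yj = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, yj, rcond=None)
        resid = yj - Z @ beta
        r2 = 1.0 - (resid @ resid) / ((yj - yj.mean()) @ (yj - yj.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

def ridge(X, y, lam):
    """Ridge regression: minimize ||y - X b||^2 + lam * ||b||^2,
    which penalizes large coefficients and stabilizes the fit
    under multicollinearity."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

A common rule of thumb is that VIF values above 5 or 10 signal problematic multicollinearity; here `vif(X)` is large for the first two columns and near 1 for the third.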