7 Regression
What is Regression?
Regression analysis is a statistical technique used to understand the relationship between a dependent variable and one or more independent variables. It helps in predicting outcomes, identifying trends, and making informed decisions based on data.
7.0.1 Objectives of Regression Analysis
- To understand relationships between variables.
- To predict the value of a dependent variable based on independent variables.
- To quantify the strength and direction of associations.
- To identify influential factors affecting outcomes.
7.0.2 Ordinary Least Squares (OLS) Regression
OLS regression is the most widely used method for estimating the coefficients of a linear regression model. It chooses the coefficients that minimize the sum of squared differences between the observed and predicted values of the dependent variable.
The Simple Linear Regression model is expressed as:
\[ Y = \beta_0 + \beta_1X + \varepsilon \]
where:
- \(Y\) = dependent variable
- \(X\) = independent variable
- \(\beta_0\) = intercept
- \(\beta_1\) = slope coefficient
- \(\varepsilon\) = error term
For multiple independent variables, Multiple Linear Regression extends this to:
\[ Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \dots + \beta_nX_n + \varepsilon \]
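The OLS estimates of \(\beta_0\) and \(\beta_1\) can be computed directly by least squares. The sketch below, using simulated data (the true intercept 2.0 and slope 3.0 are illustrative assumptions, not values from the text), fits the simple linear model above with NumPy:

```python
import numpy as np

# Hypothetical data for illustration: Y = 2 + 3X plus random noise.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50)
Y = 2.0 + 3.0 * X + rng.normal(0, 1, size=X.size)

# Design matrix with a column of ones for the intercept term beta_0.
A = np.column_stack([np.ones_like(X), X])

# Least squares solves min_beta sum((Y - A @ beta)**2), i.e. the OLS criterion.
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
b0, b1 = beta  # estimated intercept (beta_0) and slope (beta_1)

print(b0, b1)
```

Because the noise is small relative to the signal, the estimated intercept and slope land close to the true values used to generate the data. Adding further predictor columns to the design matrix extends the same computation to multiple linear regression.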
7.0.3 Key Assumptions of OLS Regression
- Linearity – The relationship between the independent and dependent variables is linear.
- Independence – Observations are independent of one another.
- Homoscedasticity – The variance of the residuals is constant across all values of the independent variables.
- Normality of Residuals – The residuals are normally distributed.
- No Multicollinearity – The independent variables are not highly correlated with one another.
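Some of these assumptions can be checked numerically from a fitted model. The sketch below (simulated data; the coefficients 1.0, 2.0, and -1.5 are illustrative assumptions) fits a two-predictor model, then inspects the residual mean and the pairwise correlation between predictors as quick diagnostics:

```python
import numpy as np

# Hypothetical two-predictor dataset; X1 and X2 are drawn independently,
# so multicollinearity should be absent by construction.
rng = np.random.default_rng(1)
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 1.5 * X2 + rng.normal(0, 1, size=n)

# Fit by OLS and compute residuals.
A = np.column_stack([np.ones(n), X1, X2])
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
residuals = Y - A @ beta

# With an intercept in the model, OLS residuals sum to (numerically) zero.
resid_mean = residuals.mean()

# Multicollinearity check: pairwise correlation between the predictors.
r = np.corrcoef(X1, X2)[0, 1]

print(resid_mean, r)
```

In practice, homoscedasticity and normality of residuals are usually assessed with residual-versus-fitted plots and Q-Q plots, and multicollinearity with variance inflation factors rather than raw pairwise correlations alone.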