What is Heteroscedasticity?
One of the crucial assumptions of linear regression models is that the residual, or error term, has constant variance. This assumption is known as homoscedasticity. When the residual variance is not constant, we have a case of heteroscedasticity. Heteroscedasticity refers to the situation in which the variability of the target variable is unequal across the range of values of the explanatory variable.
What are the possible impacts of Heteroscedasticity on coefficient estimates?
The presence of heteroscedasticity does not make the OLS estimates biased. However, they no longer have the minimum variance among linear unbiased estimators, so they are no longer the best, or optimal, estimators. In other words, we cannot call them BLUE (Best Linear Unbiased Estimator). Additionally, the standard errors computed for the estimators will not be reliable, and this in turn affects the confidence intervals and hypothesis tests that use those standard errors.
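Both claims can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the true coefficients, the sample size, and the choice of an error standard deviation that grows with x squared are all made up for the example. It refits OLS on many simulated samples and compares the average slope with the true slope, and the actual spread of the slope estimates with the average conventional standard error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_sims = 200, 2000                      # assumed sample size and number of replications
true_beta = np.array([2.0, 0.5])           # assumed true intercept and slope

x = rng.uniform(1.0, 10.0, size=n)         # regressor values, held fixed across replications
X = np.column_stack([np.ones(n), x])       # design matrix with an intercept column
XtX_inv = np.linalg.inv(X.T @ X)

slopes = np.empty(n_sims)
conventional_se = np.empty(n_sims)
for s in range(n_sims):
    # Heteroscedastic errors: the standard deviation grows with x squared (an assumption).
    eps = rng.normal(0.0, 0.1 * x**2)
    y = X @ true_beta + eps

    beta_hat = XtX_inv @ X.T @ y           # OLS estimate
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - 2)   # usual constant-variance estimate

    slopes[s] = beta_hat[1]
    conventional_se[s] = np.sqrt(sigma2_hat * XtX_inv[1, 1])

mean_slope = slopes.mean()                 # should be close to the true slope (unbiasedness)
empirical_sd = slopes.std(ddof=1)          # actual sampling spread of the slope estimate
mean_conventional_se = conventional_se.mean()

print("true slope:", true_beta[1], "average estimate:", mean_slope)
print("empirical SD of slope:", empirical_sd)
print("average conventional SE:", mean_conventional_se)
```

In this particular setup the average estimate stays near the true slope, while the conventional standard error understates the real variability of the slope. With a different variance pattern the conventional standard error could instead be too large.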
How can we identify presence of heteroscedasticity in our model?
Heteroscedasticity can be identified either through a formal statistical hypothesis test or through visual inspection of a residual plot.
Visual Inspection :
When heteroscedasticity is present, a scatter plot of the residuals against the fitted values (or against an independent variable) will often show a fan-like or cone-like shape, because the spread of the points either widens or narrows as the value on the horizontal axis increases. If we cannot identify any such pattern in the scatter plot, we can say that there is no visual evidence of heteroscedasticity in the model.
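The fan shape can also be summarised numerically. The sketch below is a rough numeric stand-in for the plot, not a formal test: it simulates data whose noise grows with x (an assumption made for the example), fits OLS, sorts the residuals by fitted value, and compares the residual spread in the lowest, middle, and highest thirds. In practice we would also draw the residuals against the fitted values with a plotting library.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300                                    # assumed sample size
x = rng.uniform(1.0, 10.0, size=n)
# Simulated response: the noise standard deviation is proportional to x (an assumption).
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3 * x)

X = np.column_stack([np.ones(n), x])       # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta_hat
resid = y - fitted

# Sort residuals by fitted value and split them into three equal-sized groups.
order = np.argsort(fitted)
low, middle, high = np.array_split(resid[order], 3)

spread_low = low.std(ddof=1)
spread_middle = middle.std(ddof=1)
spread_high = high.std(ddof=1)

print("residual SD, lowest third of fitted values :", spread_low)
print("residual SD, middle third of fitted values :", spread_middle)
print("residual SD, highest third of fitted values:", spread_high)
```

A residual spread that rises steadily from the lowest to the highest group is the numeric counterpart of a widening fan. Roughly equal spreads correspond to a plot with no visible pattern.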
Breusch Pagan Test :
One of the best-known tests for non-constant residual variance is the Breusch-Pagan test. The null and alternative hypotheses for the test are as follows:
Null hypothesis : The error variances are all equal.
Alternate hypothesis : The error variances are not equal.
At a 5% level of significance, a p-value below 0.05 indicates that the null hypothesis can be rejected and that there is evidence of heteroscedasticity in the model.
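A minimal sketch of how the test statistic can be computed by hand is given here. It uses the studentized (Koenker) form of the Breusch-Pagan test, in which the squared OLS residuals are regressed on the explanatory variables and the statistic is n times the R-squared of that auxiliary regression. The simulated data and the helper name `r_squared` are assumptions made for the example. To avoid extra dependencies the statistic is compared with the 5% chi-square critical value for 1 degree of freedom (3.841) instead of computing a p-value.

```python
import numpy as np

def r_squared(design, target):
    """R-squared from an OLS regression of target on design (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)

rng = np.random.default_rng(2)
n = 300                                    # assumed sample size
x = rng.uniform(1.0, 10.0, size=n)
# Simulated response with noise that grows with x (an assumption).
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3 * x)

X = np.column_stack([np.ones(n), x])       # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sq_resid = (y - X @ beta_hat) ** 2         # squared OLS residuals

# Auxiliary regression of the squared residuals on the original regressors.
lm_stat = n * r_squared(X, sq_resid)

CHI2_CRIT_5PCT_DF1 = 3.841                 # chi-square 5% critical value, 1 degree of freedom
reject_null = lm_stat > CHI2_CRIT_5PCT_DF1

print("LM statistic:", lm_stat)
print("reject equal error variances at the 5% level:", reject_null)
```

The degrees of freedom equal the number of explanatory variables in the auxiliary regression, not counting the intercept. With more regressors the critical value has to be changed accordingly.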
White Test :
The null and alternative hypotheses for the White test are the same as those of the Breusch-Pagan test.
Null hypothesis : The error variances are all equal.
Alternate hypothesis : The error variances are not equal.
The White test is recommended for larger samples, because the additional terms it introduces use up degrees of freedom. It is very similar to the Breusch-Pagan test. The only difference is that, in addition to the explanatory variables x1, …, xk, the squared terms of the independent variables, i.e. x1^2, …, xk^2, as well as the interaction terms xi*xj for all i ≠ j, are included.
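To make the extra terms concrete, here is a minimal sketch of the White statistic for a model with two explanatory variables. The simulated data, the coefficients, and the variance pattern are assumptions made for the example. The squared residuals are regressed on x1, x2, their squares, and their interaction, and n times the R-squared of that regression is compared with the 5% chi-square critical value for 5 degrees of freedom (11.070).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400                                    # assumed sample size
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.uniform(1.0, 5.0, size=n)
# Simulated response: the noise standard deviation depends on both regressors (an assumption).
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(0.0, 0.2 * x1 * x2)

# Original regression and its squared residuals.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sq_resid = (y - X @ beta_hat) ** 2

# Auxiliary design: levels, squares, and the interaction term.
Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
gamma_hat, *_ = np.linalg.lstsq(Z, sq_resid, rcond=None)
aux_resid = sq_resid - Z @ gamma_hat
r2_aux = 1.0 - aux_resid @ aux_resid / np.sum((sq_resid - sq_resid.mean()) ** 2)

white_stat = n * r2_aux
dof = Z.shape[1] - 1                       # auxiliary regressors excluding the intercept
CHI2_CRIT_5PCT_DF5 = 11.070                # chi-square 5% critical value, 5 degrees of freedom
reject_null = white_stat > CHI2_CRIT_5PCT_DF5

print("White statistic:", white_stat, "with", dof, "degrees of freedom")
print("reject equal error variances at the 5% level:", reject_null)
```

With k explanatory variables the auxiliary regression has k levels, k squares, and k(k-1)/2 interactions. The number of terms grows quickly, which is one reason the test is better suited to larger samples.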
How can we remediate the presence of heteroscedasticity?
As mentioned earlier, in the presence of heteroscedastic residuals the ordinary least squares estimates no longer have minimum variance. However, we can still continue with our regression model if we address the issue of incorrect standard errors, so that our interval estimates and hypothesis tests are valid. We can do this by using robust standard errors. Robust standard errors address the problem of incorrect interval estimates caused by erroneous standard errors. Even with robust standard errors the estimates will not be minimum variance, but we can accept this if we have a large enough sample.
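As an illustration, the sketch below computes one common kind of robust standard error, the White (HC0) heteroscedasticity-consistent estimator, next to the conventional one. The choice of HC0 is an assumption for the example, since the text does not say which robust estimator is intended, and the data are simulated. The coefficient estimates are the same in both cases. Only the estimated covariance matrix changes, from sigma^2 (X'X)^-1 to the sandwich form (X'X)^-1 X' diag(e_i^2) X (X'X)^-1.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500                                    # assumed sample size
x = rng.uniform(1.0, 10.0, size=n)
# Simulated response: the noise standard deviation grows with x squared (an assumption).
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.1 * x**2)

X = np.column_stack([np.ones(n), x])       # design matrix with an intercept column
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y               # OLS coefficients (unchanged by the correction)
resid = y - X @ beta_hat

# Conventional standard errors: assume one common error variance.
sigma2_hat = resid @ resid / (n - X.shape[1])
conventional_se = np.sqrt(np.diag(sigma2_hat * XtX_inv))

# HC0 robust standard errors: each observation contributes its own squared residual.
meat = X.T @ (X * resid[:, None] ** 2)     # X' diag(e_i^2) X
robust_cov = XtX_inv @ meat @ XtX_inv      # sandwich estimator
robust_se = np.sqrt(np.diag(robust_cov))

print("coefficients    :", beta_hat)
print("conventional SE :", conventional_se)
print("robust (HC0) SE :", robust_se)
```

Statistical libraries usually provide these estimators directly, often with small-sample refinements such as HC1 to HC3, so a hand-rolled version like this is mainly useful for seeing what the correction does.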
