April 15, 2026 2:23 am

Any simple linear regression model can be written as the equation below:

Yt = α + β·Xt + εt

Here, εt is called the residual: the part of Yt that the model does not explain. More conventionally it is known as the error term. It essentially captures the randomness of the omitted variables. This error term carries a set of assumptions, described below:
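To make the pieces concrete, here is a minimal NumPy sketch that fits this equation by ordinary least squares and recovers the residuals. The values α = 2.0, β = 0.5 and the sample size are illustrative choices, not from the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from Yt = alpha + beta * Xt + eps_t
# (alpha = 2.0, beta = 0.5 are illustrative values)
alpha, beta = 2.0, 0.5
x = rng.uniform(0, 10, size=200)
eps = rng.normal(0, 1, size=200)          # the error term
y = alpha + beta * x + eps

# Ordinary least squares fit of a straight line
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)

# Residuals: the part of Y the fitted model does not explain
residuals = y - (alpha_hat + beta_hat * x)

print(round(alpha_hat, 2), round(beta_hat, 2))
```

With enough data, the estimates land close to the true α and β, and the residuals play the role of εt in the discussion that follows.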

Zero mean: E(εi) = 0 for all i. For any given X, ε may take different values, but on average it is zero.
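This property is easy to see in practice: when the model includes an intercept, the normal equations of least squares force the fitted residuals to average exactly zero. A quick check (the simulated coefficients here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 1.0 + 2.0 * x + rng.normal(0, 1, 100)

b, a = np.polyfit(x, y, 1)          # slope, intercept
residuals = y - (a + b * x)

# With an intercept in the model, OLS makes the residual mean
# zero up to floating-point rounding
print(abs(residuals.mean()))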

Homoskedasticity: Var(εi) = σ², a constant. In other words, the residuals should not have higher variance for higher values of X.
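One rough way to check this is to split the sample at the median of X and compare the error variance in the two halves. The sketch below contrasts a homoskedastic error with a deliberately heteroskedastic one whose spread grows with X (the 0.2 + 0.3·x scale is an invented example):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 1000))

# Homoskedastic errors: constant sigma, satisfies the assumption
eps_hom = rng.normal(0, 1, 1000)

# Heteroskedastic errors: sigma grows with X, violates the assumption
eps_het = rng.normal(0, 1, 1000) * (0.2 + 0.3 * x)

for name, eps in [("homoskedastic", eps_hom), ("heteroskedastic", eps_het)]:
    low = eps[x < 5].var()    # variance for small X
    high = eps[x >= 5].var()  # variance for large X
    print(name, round(low, 2), round(high, 2))
```

For the homoskedastic errors the two variances are close; for the heteroskedastic ones the upper half is several times larger.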

Normality: εi is normally distributed. As mentioned earlier, εi captures the impact of omitted variables. If there are many such variables, each minor and independently distributed, the distribution of their sum tends toward the normal as their number increases (the central limit theorem).
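The central limit effect can be demonstrated directly: below, each "omitted variable" is a small uniform shock, which is itself far from normal, yet the sum of fifty of them behaves almost exactly like a normal variable. The choice of uniform shocks and the count of 50 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Each omitted variable is a uniform shock on [-0.5, 0.5]
n_vars, n_samples = 50, 100_000
total = rng.uniform(-0.5, 0.5, size=(n_samples, n_vars)).sum(axis=1)

# CLT prediction for the sum: mean 0, variance n_vars * (1/12)
sigma = np.sqrt(n_vars / 12)

# A normal variable falls within one standard deviation ~68.3%
# of the time; check the empirical fraction for the sum
within_1sd = np.mean(np.abs(total) < sigma)
print(round(within_1sd, 3))
```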

No autocorrelation: the εi are independent, so different values of εi are uncorrelated.

Cov(εi, εj) = E[{εi − E(εi)}{εj − E(εj)}] = E(εi εj) = 0 for i ≠ j, where the second equality uses the zero-mean assumption E(εi) = 0.
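A simple diagnostic for this assumption is the lag-1 sample covariance of the error series. The sketch below compares independent errors with errors generated by an AR(1) process (the persistence coefficient 0.8 is an invented example of a violation):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Independent errors: lag-1 covariance should be near zero
eps_iid = rng.normal(0, 1, n)

# AR(1) errors: eps_t = 0.8 * eps_{t-1} + noise, which makes
# consecutive errors correlated and violates the assumption
eps_ar = np.zeros(n)
noise = rng.normal(0, 1, n)
for t in range(1, n):
    eps_ar[t] = 0.8 * eps_ar[t - 1] + noise[t]

def lag1_cov(e):
    # Sample covariance between e_t and e_{t+1}
    return np.mean((e[:-1] - e.mean()) * (e[1:] - e.mean()))

print(round(lag1_cov(eps_iid), 3), round(lag1_cov(eps_ar), 3))
```

The independent series gives a lag-1 covariance near zero, while the AR(1) series gives a clearly positive one.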

Non-stochastic X: the values of X (i.e., the explanatory variables) are the same in repeated samples; X is treated as fixed rather than random.

The implications of violating these assumptions will be discussed in a separate post.
