Monday, March 31, 2008

Central Limit Theorem

The central limit theorem allows us to use a standard normal random variable as an approximating proxy for a properly centered and rescaled sample mean.

First, for any random variable Y with expected value E[Y]=μy and variance Var(Y)=σy2, if we center and rescale as W=(Y-μy)/σy, then W will always have expected value E[W]=0 and variance Var(W)=1. But it is not the case that W behaves like a standard normal random variable unless Y is somehow special. However, in special cases, when Y can be represented as a sum of many independent and identically distributed random variables, the central limit theorem can help us.

In particular, if Y is the sample mean of a random sample X1, X2, ..., Xn, then Y=(X1+...+Xn)/n. Suppose that each Xi has expected value E[Xi]=μx and variance Var(Xi)=σx2. Then we know that μy=μx and σy2=σx2/n. Our centered and rescaled version of Y can be expressed in terms of the parameters for X: W=(Y-μy)/σy = (Y-μx)/(σx/√n). Because Y can be defined in terms of a constant times the sum of i.i.d. random variables, W has an approximately standard normal distribution.
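A quick simulation makes this concrete. The sketch below (an illustration of the theorem, not part of the original discussion) takes the Xi to be exponential with rate 1, so that μx=1 and σx=1, and checks that the standardized sample mean W has mean near 0, standard deviation near 1, and roughly 95% of its values inside ±1.96, as a standard normal would:

```python
import math
import random
import statistics

random.seed(42)

# Assumed illustration: X_i ~ Exponential(rate 1), so mu_x = 1 and sigma_x = 1.
mu_x, sigma_x = 1.0, 1.0
n = 50        # sample size for each sample mean
reps = 20000  # number of simulated sample means

ws = []
for _ in range(reps):
    # Y is the sample mean of n i.i.d. exponential draws
    y = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    # Center and rescale: W = (Y - mu_x) / (sigma_x / sqrt(n))
    w = (y - mu_x) / (sigma_x / math.sqrt(n))
    ws.append(w)

print(round(statistics.fmean(ws), 2))  # should be close to 0
print(round(statistics.stdev(ws), 2))  # should be close to 1
# For a standard normal, about 95% of values fall within +/-1.96
print(round(sum(abs(w) < 1.96 for w in ws) / reps, 2))
```

Any distribution with a finite mean and variance could be substituted for the exponential here; only the values of μx and σx in the standardization would change.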

As a second example, suppose that Y is the sum of a random sample X1, X2, ..., Xn, with Y=X1+...+Xn. Suppose that each Xi has expected value E[Xi]=μx and variance Var(Xi)=σx2. Then we know that μy=nμx and σy2=nσx2. Our centered and rescaled version of Y can be expressed in terms of the parameters for X: W=(Y-μy)/σy = (Y-nμx)/((√n)σx). Again, since Y is itself a sum of i.i.d. random variables, W has an approximately standard normal distribution.

Quite a few of our known distributions can be described as a sum of i.i.d. random variables. The most famous is the binomial distribution. Suppose that Y has a binomial distribution with n trials and probability p. Then we can think of Y as the sum of n i.i.d. Bernoulli random variables X1,...,Xn, each with parameter p. Then μx=E[Xi]=p and σx2=p(1-p), leading to μy=np and σy2=np(1-p). Consequently, W=(Y-np)/√(np(1-p)) has an approximately standard normal distribution. In this example, we usually find that we need np≥5 and n(1-p)≥5 for the approximation to be good.
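To see how good the binomial approximation is, we can compare the exact binomial CDF to the normal approximation at a single point. This sketch uses assumed parameters n=40 and p=0.3 (so np=12 and n(1-p)=28, comfortably satisfying both rules of thumb) and applies the usual continuity correction:

```python
import math

# Assumed illustration: Y ~ Binomial(n=40, p=0.3), so np = 12 >= 5 and n(1-p) = 28 >= 5.
n, p = 40, 0.3
mu = n * p                        # mu_y = np
sigma = math.sqrt(n * p * (1 - p))  # sigma_y = sqrt(np(1-p))

def binom_cdf(k):
    """Exact P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

k = 15
exact = binom_cdf(k)
# Continuity correction: P(Y <= k) is approximated by P(Z <= (k + 0.5 - mu)/sigma)
approx = phi((k + 0.5 - mu) / sigma)
print(round(exact, 4), round(approx, 4))
```

The two probabilities agree to about two decimal places; shrinking np or n(1-p) below 5 makes the gap noticeably larger.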

Other random variables can also be represented as a sum of simpler i.i.d. random variables. The Poisson distribution with a large value of λ can be rewritten as the sum of many Poisson RVs with small rates (X1,...,Xn each Poisson with rate λ/n). The Gamma distribution (including Chi-Square) can similarly be written as a sum of many simpler Gamma (or Chi-Square) random variables. In each of these cases, a centered and rescaled version of the random variable, W=(Y-μy)/σy, will have an approximately standard normal distribution.
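The Poisson case can be checked the same way as the binomial. In this sketch (with an assumed rate of λ=100, so Y can be viewed as the sum of 100 i.i.d. Poisson(1) variables) we have μy=λ and σy=√λ, and the exact Poisson CDF is compared to the continuity-corrected normal approximation:

```python
import math

# Assumed illustration: Y ~ Poisson(lam) with lam = 100, viewed as a sum of
# 100 i.i.d. Poisson(1) random variables, so mu_y = lam and sigma_y = sqrt(lam).
lam = 100
mu, sigma = lam, math.sqrt(lam)

def poisson_cdf(k):
    """Exact P(Y <= k) for Y ~ Poisson(lam)."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

k = 110
exact = poisson_cdf(k)
# Continuity correction: P(Y <= k) is approximated by P(Z <= (k + 0.5 - mu)/sigma)
approx = phi((k + 0.5 - mu) / sigma)
print(round(exact, 4), round(approx, 4))
```

As with the binomial, the approximation improves as λ grows, since Y is then the sum of more and more small Poisson pieces.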