First, for any random variable $Y$ with expected value $E[Y]=\mu_y$ and variance $\mathrm{Var}(Y)=\sigma_y^2$, if we center and rescale as $W=(Y-\mu_y)/\sigma_y$, then $W$ will always have expected value $E[W]=0$ and variance $\mathrm{Var}(W)=1$. That alone does not make $W$ behave like a standard normal random variable: matching the mean and variance says nothing about the shape of the distribution, so $W$ is exactly standard normal only when $Y$ is itself normal. However, when $Y$ can be represented in terms of a sum of many independent and identically distributed (i.i.d.) random variables, the central limit theorem can help us.
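To see that mean 0 and variance 1 do not by themselves give a normal shape, here is a small simulation sketch. The exponential distribution with rate 1 is our own choice for illustration (it is not part of the discussion above); it has $\mu_y=1$ and $\sigma_y=1$, and the seed and sample size are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumption for illustration: Y ~ Exponential(rate=1), so mu_y = 1 and sigma_y = 1.
mu_y, sigma_y = 1.0, 1.0
ys = [random.expovariate(1.0) for _ in range(100_000)]

# Center and rescale: W = (Y - mu_y) / sigma_y.
ws = [(y - mu_y) / sigma_y for y in ys]

w_mean = statistics.fmean(ws)
w_var = statistics.pvariance(ws)

# Y is never negative, so W = Y - 1 is never below -1.
# A standard normal variable falls below -1 with probability about 0.16.
frac_below_minus_one = sum(w < -1 for w in ws) / len(ws)
normal_below_minus_one = statistics.NormalDist().cdf(-1)

print(f"mean of W = {w_mean:.3f}")
print(f"variance of W = {w_var:.3f}")
print(f"simulated P(W < -1) = {frac_below_minus_one:.3f}")
print(f"standard normal P(Z < -1) = {normal_below_minus_one:.3f}")
```

The simulated mean and variance should land close to 0 and 1, yet no value of $W$ falls below $-1$, which a standard normal variable does fairly often.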

In particular, if $Y$ is the sample mean of a random sample $X_1, X_2, \ldots, X_n$, then $Y=(X_1+\cdots+X_n)/n$. Suppose that each $X_i$ has expected value $E[X_i]=\mu_x$ and variance $\mathrm{Var}(X_i)=\sigma_x^2$. Then we know that $\mu_y=\mu_x$ and $\sigma_y^2=\sigma_x^2/n$. Our centered and rescaled version of $Y$ can be expressed in terms of the parameters for $X$: $W=(Y-\mu_y)/\sigma_y=(Y-\mu_x)/(\sigma_x/\sqrt{n})$. Because $Y$ can be defined in terms of a constant times the sum of i.i.d. random variables, $W$ has an approximately standard normal distribution.
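The sample-mean case can be checked with a rough simulation. In this sketch the $X_i$ are assumed to be exponential with rate 1 (so $\mu_x=1$ and $\sigma_x=1$), and the values of `n` and `reps` are arbitrary choices of ours, not anything prescribed above.

```python
import math
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

# Assumed for illustration: X_i ~ Exponential(rate=1), so mu_x = 1 and sigma_x = 1.
mu_x, sigma_x = 1.0, 1.0
n = 50          # size of each random sample
reps = 20_000   # number of simulated sample means

def standardized_sample_mean():
    """Draw one sample of size n and return W = (Y - mu_x) / (sigma_x / sqrt(n))."""
    y = sum(random.expovariate(1.0) for _ in range(n)) / n
    return (y - mu_x) / (sigma_x / math.sqrt(n))

ws = [standardized_sample_mean() for _ in range(reps)]

# If W is roughly standard normal, about 95% of its values fall in [-1.96, 1.96].
inside = sum(-1.96 <= w <= 1.96 for w in ws) / reps
target = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)

print(f"simulated P(|W| <= 1.96) = {inside:.3f}")
print(f"standard normal value    = {target:.3f}")
```

Even though a single exponential draw is strongly skewed, the standardized mean of 50 draws should already sit close to the normal benchmark.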

As a second example, suppose that $Y$ is the sum of a random sample $X_1, X_2, \ldots, X_n$, with $Y=X_1+\cdots+X_n$. Suppose that each $X_i$ has expected value $E[X_i]=\mu_x$ and variance $\mathrm{Var}(X_i)=\sigma_x^2$. Then we know that $\mu_y=n\mu_x$ and $\sigma_y^2=n\sigma_x^2$. Our centered and rescaled version of $Y$ can be expressed in terms of the parameters for $X$: $W=(Y-\mu_y)/\sigma_y=(Y-n\mu_x)/(\sqrt{n}\,\sigma_x)$. Again, since $Y$ can be defined in terms of a constant times the sum of i.i.d. random variables, $W$ has an approximately standard normal distribution.
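One thing worth noticing is that standardizing the sum and standardizing the sample mean produce the same $W$, since dividing the numerator and denominator of $(Y-n\mu_x)/(\sqrt{n}\,\sigma_x)$ by $n$ gives the sample-mean form. The sketch below checks this numerically, assuming for illustration that each $X_i$ is a roll of a fair six-sided die ($\mu_x=3.5$, $\sigma_x^2=35/12$); the die, `n`, and `reps` are our own choices.

```python
import math
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the run is repeatable

# Assumed for illustration: X_i is one roll of a fair six-sided die.
mu_x = 3.5
sigma_x = math.sqrt(35 / 12)
n = 30
reps = 20_000

max_gap = 0.0   # largest difference between the two ways of computing W
ws = []
for _ in range(reps):
    total = sum(random.randint(1, 6) for _ in range(n))
    w_from_sum = (total - n * mu_x) / (math.sqrt(n) * sigma_x)
    w_from_mean = (total / n - mu_x) / (sigma_x / math.sqrt(n))
    max_gap = max(max_gap, abs(w_from_sum - w_from_mean))
    ws.append(w_from_sum)

# Compare one probability for W against the standard normal CDF.
below_one = sum(w <= 1.0 for w in ws) / reps
normal_below_one = NormalDist().cdf(1.0)

print(f"largest gap between the two forms of W = {max_gap:.2e}")
print(f"simulated P(W <= 1) = {below_one:.3f}")
print(f"standard normal P(Z <= 1) = {normal_below_one:.3f}")
```

The gap between the two forms should be nothing more than floating-point rounding, and the simulated probability should be close to the normal value.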

Quite a few of our known distributions can be described as a sum of i.i.d. random variables. The most famous is the binomial distribution. Suppose that $Y$ has a binomial distribution with $n$ trials and success probability $p$. Then we can think of $Y$ as the sum of $n$ i.i.d. Bernoulli random variables $X_1,\ldots,X_n$, each with parameter $p$. Then $\mu_x=E[X_i]=p$ and $\sigma_x^2=p(1-p)$, leading to $\mu_y=np$ and $\sigma_y^2=np(1-p)$. Consequently, $W=(Y-np)/\sqrt{np(1-p)}$ has an approximately standard normal distribution. In this example, we usually find that we need $np\ge 5$ and $n(1-p)\ge 5$ for the approximation to be good.
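The rule of thumb can be examined without simulation by comparing the exact binomial CDF with the normal approximation. The sketch below is one way to do that under our own assumptions: the helper name `max_cdf_error` and the two parameter settings are hypothetical, and the half-unit continuity correction is a common refinement that is not part of the discussion above.

```python
import math
from statistics import NormalDist

def max_cdf_error(n, p):
    """Largest gap over k = 0..n between the exact binomial CDF P(Y <= k)
    and the normal approximation built from W = (Y - np) / sqrt(np(1-p))."""
    mu_y = n * p
    sigma_y = math.sqrt(n * p * (1 - p))
    std_normal = NormalDist()
    exact_cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        # Accumulate the exact binomial probability P(Y = k).
        exact_cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        # Continuity correction (an added refinement): evaluate at k + 0.5.
        approx_cdf = std_normal.cdf((k + 0.5 - mu_y) / sigma_y)
        worst = max(worst, abs(exact_cdf - approx_cdf))
    return worst

err_ok = max_cdf_error(40, 0.30)   # np = 12, n(1-p) = 28: rule of thumb satisfied
err_bad = max_cdf_error(40, 0.02)  # np = 0.8: rule of thumb violated

print(f"n=40, p=0.30: worst CDF error = {err_ok:.4f}")
print(f"n=40, p=0.02: worst CDF error = {err_bad:.4f}")
```

With the same number of trials, the setting that violates $np\ge 5$ should show a noticeably larger worst-case error than the one that satisfies both conditions.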

Other random variables can also be represented as a sum of simpler i.i.d. random variables. A Poisson random variable with a large rate $\lambda$ can be rewritten as the sum of many Poisson random variables with small rates ($X_1,\ldots,X_n$, each Poisson with rate $\lambda/n$). A Gamma random variable with a large shape parameter (including a chi-square with many degrees of freedom) can similarly be written as a sum of many simpler Gamma (or chi-square) random variables. In each of these cases, the centered and rescaled version of the random variable, $W=(Y-\mu_y)/\sigma_y$, will have approximately a standard normal distribution.
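As a rough check of the chi-square case, the sketch below builds each chi-square value directly as a sum of squared standard normal draws (each square is a chi-square with 1 degree of freedom), then standardizes using the chi-square mean $k$ and variance $2k$. The degrees of freedom `k = 200` and the number of repetitions are arbitrary values chosen for illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(3)  # fixed seed so the run is repeatable

k = 200         # degrees of freedom (assumed value for illustration)
reps = 10_000   # number of simulated chi-square values

def chi_square_draw(df):
    """Build one chi-square(df) value as a sum of df squared standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

# A chi-square with k degrees of freedom has mean k and variance 2k.
mu_y = k
sigma_y = math.sqrt(2 * k)
ws = [(chi_square_draw(k) - mu_y) / sigma_y for _ in range(reps)]

# For a standard normal variable, 95% of values fall at or below this point.
z = NormalDist().inv_cdf(0.95)
below = sum(w <= z for w in ws) / reps

print(f"simulated P(W <= {z:.3f}) = {below:.3f}")
print("standard normal value = 0.950")
```

The simulated proportion should be close to 0.95 but not exactly equal to it, since the chi-square distribution keeps some right skew even with many degrees of freedom.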
