I was talking with a student about independence the other day and realized that the student was thinking of independence as meaning unrelated. In everyday speech, we probably think of two things as independent when they are not connected. For example, we say that the American states won their independence from England when they broke their governing ties with England.
However, statistical independence should be viewed in a different way. It describes events A and B for which knowing whether A occurred provides no information about whether B occurs. If A and B (remember, these are sets) have no overlap (they are mutually exclusive, or disjoint), then if we knew that the outcome was in A, it would be impossible for the outcome to be in B. Thus, the event A provides information about event B and, provided B is possible at all, the two are not statistically independent. Similarly, if we knew that the outcome was not in A (i.e., the outcome is in A'), then B would make up a larger portion of the remaining possible outcomes. Again, knowing that the outcome is in A' gives information about whether the outcome is in event B.
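To see the disjoint case in numbers, here is a minimal sketch. The 100 equally likely outcomes and the particular sizes of A and B are assumptions chosen for illustration, not part of the argument above.

```python
from fractions import Fraction

# Assumed setup: 100 equally likely outcomes labelled 0..99.
omega = set(range(100))
A = set(range(0, 25))    # 25 outcomes
B = set(range(25, 65))   # 40 outcomes, deliberately disjoint from A

def cond_prob(event, given):
    """P(event | given), found by counting equally likely outcomes."""
    return Fraction(len(event & given), len(given))

p_B = cond_prob(B, omega)                  # unconditional P(B)
p_B_given_A = cond_prob(B, A)              # we know the outcome is in A
p_B_given_not_A = cond_prob(B, omega - A)  # we know the outcome is in A'

print(p_B, p_B_given_A, p_B_given_not_A)
```

Under these assumptions P(B) is 2/5, but it drops to 0 once we know A occurred and rises to 8/15 once we know A did not occur. Either way, A tells us something about B.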
We thus conclude that independent events with positive probability must overlap. In fact, they must overlap in a very specific way. Suppose that A and B are independent and that P(A)=0.25 while P(B)=0.4. By the definition of independence, we must have P(A∩B)=P(A)P(B)=(0.25)(0.4)=0.1. To make this concrete, imagine that the sample space Ω has 100 possible, equally likely outcomes. Then A includes 25 outcomes and B includes 40 outcomes. Our calculation then requires that the intersection A∩B includes 10 outcomes.
Now, notice what this means about conditional probabilities. If we restrict our attention to the set A (i.e., we are given that event A has occurred), then the newly restricted outcome space has 25 outcomes. If we ask for the probability that B occurs given this information, we know that 10 of these outcomes belong to B. So P(B|A)=10/25=0.4. Voila! This is exactly the same as P(B). Similarly, if we are given that event B has occurred, then 10 of the 40 available outcomes belong to A, so that P(A|B)=10/40=0.25=P(A).
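The counting argument can be checked directly. This is only a sketch: the labels 0..99 and the particular choice of which 10 outcomes are shared are assumptions, and any A and B with 25, 40, and 10 outcomes in the right places would work just as well.

```python
from fractions import Fraction

# Assumed labelling: 100 equally likely outcomes, numbered 0..99.
omega = set(range(100))
A = set(range(0, 25))    # 25 outcomes, so P(A) = 0.25
B = set(range(15, 55))   # 40 outcomes, sharing outcomes 15..24 with A

def prob(event, given=None):
    """P(event | given) by counting; with no 'given', use all of omega."""
    given = omega if given is None else given
    return Fraction(len(event & given), len(given))

p_A, p_B = prob(A), prob(B)
p_AB = prob(A & B)          # 10 shared outcomes out of 100
p_B_given_A = prob(B, A)    # 10 of the 25 outcomes in A
p_A_given_B = prob(A, B)    # 10 of the 40 outcomes in B

print(p_AB == p_A * p_B, p_B_given_A == p_B, p_A_given_B == p_A)
```

Using exact fractions rather than floating point keeps the equality checks honest: 10/25 and 40/100 reduce to the same fraction, 2/5.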
How do we summarize this idea? Well, if events A and B are independent, then each event must contain the other in the same proportion as the whole sample space does. One way to picture this is to imagine that A and B are at right angles and overlap. For the above example, we can arrange the 100 items into 5 rows and 20 columns.
Then A might represent the first five columns (5×5=25) while B represents the first two rows (2×20=40), so that they overlap in a 2×5 block of 10 outcomes. Being given information that A occurs collapses the larger picture into a reduced picture consisting of only those five columns. However, the fraction of rows represented by B remains the same (2 of 5), and P(B|A)=P(B).
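The grid picture is easy to build and count. The sketch below assumes A is the first five columns, so that it holds 25 of the 100 cells and matches P(A)=0.25, and that B is the first two rows. The (row, column) cell labels are just a convenient representation.

```python
from fractions import Fraction

ROWS, COLS = 5, 20   # 5 x 20 = 100 equally likely cells

cells = {(r, c) for r in range(ROWS) for c in range(COLS)}
A = {(r, c) for (r, c) in cells if c < 5}   # first five columns: 5 x 5 = 25 cells
B = {(r, c) for (r, c) in cells if r < 2}   # first two rows: 2 x 20 = 40 cells

# Fraction of the whole grid that lies in B ...
frac_B_overall = Fraction(len(B), len(cells))
# ... and fraction of the collapsed picture (only A's columns) that lies in B.
frac_B_within_A = Fraction(len(A & B), len(A))

print(len(A & B), frac_B_overall, frac_B_within_A)
```

Collapsing to A's columns removes cells only along the column direction, so the share of rows taken up by B is untouched.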
This idea extends to more than two events. To have three independent events A, B, and C, we can imagine a three-dimensional grid corresponding to the outcome space so that A represents a simple division in one direction, B a division in a second direction, and C a division in the third direction. Conditioning on event A corresponds to collapsing the space in A's direction. But the B and C fractions of A remain in exactly the same proportion as they were originally.
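A three-dimensional version can be sketched the same way. The grid dimensions (4×5×2) and the resulting probabilities are assumptions made up for this illustration. The check covers every pair, the triple, and the collapse along A's direction.

```python
from fractions import Fraction
from itertools import product

DIMS = (4, 5, 2)  # assumed grid: 4 x 5 x 2 = 40 equally likely cells
omega = set(product(*(range(n) for n in DIMS)))

# Each event is a simple cut along one axis of the grid.
A = {w for w in omega if w[0] < 1}   # 1 of 4 slices along axis 0
B = {w for w in omega if w[1] < 2}   # 2 of 5 slices along axis 1
C = {w for w in omega if w[2] < 1}   # 1 of 2 slices along axis 2

def prob(event, given=None):
    """P(event | given) by counting; with no 'given', use all of omega."""
    given = omega if given is None else given
    return Fraction(len(event & given), len(given))

pairwise_ok = all(
    prob(X & Y) == prob(X) * prob(Y) for X, Y in [(A, B), (A, C), (B, C)]
)
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)
# Collapsing onto A leaves the joint share of B and C unchanged.
collapse_ok = prob(B & C, A) == prob(B & C)

print(pairwise_ok, triple_ok, collapse_ok)
```

Because each event cuts along its own axis, every intersection is a rectangular block whose cell count is the product of the side lengths, which is exactly why the probabilities multiply.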