Sunday, November 30, 2008

Cocktail party conversation starter (aka how to mess with statistics 101 students)

Posted by Danny Tarlow
I don't want to sound too pedantic here, so I came up with a witty title. Please don't randomly bring this up in any sort of normal conversation (ahah, get it?).

The (normalized) product of two Gaussian distributions is itself a Gaussian distribution. If you're not afraid of a little algebra, you can prove it yourself by writing down the expression for the probability density function of a Gaussian random variable twice, doing the multiplication, combining the exponent terms, rearranging, and then completing the square (yes, I know that's too fast if you don't know what I'm talking about). If you take the (normalized) sum of two Gaussian distributions, you get a mixture distribution that can have two modes, so in general it's not Gaussian.

Now here's the tricky part. If you have two Gaussian random variables
A ~ N(mu_A, sigma_A^2)
B ~ N(mu_B, sigma_B^2)
that are independent, and you define random variable C to take on the value of the sum A + B, then C will be distributed according to a Gaussian distribution:
C ~ N(mu_A + mu_B, sigma_A^2 + sigma_B^2)
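If you'd rather check that claim than take my word for it, here is a minimal simulation sketch. The function name simulate_sum, the parameter values, and the sample size are all made up for illustration, and it assumes A and B are drawn independently.

```python
import random

def simulate_sum(mu_a, sigma_a, mu_b, sigma_b, n=100_000, seed=0):
    """Draw n independent pairs (A, B) and return the sample mean and variance of A + B."""
    rng = random.Random(seed)
    draws = [rng.gauss(mu_a, sigma_a) + rng.gauss(mu_b, sigma_b) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((x - mean) ** 2 for x in draws) / n
    return mean, var

# Illustrative parameters: A ~ N(1, 2^2), B ~ N(-3, 1.5^2)
mean_c, var_c = simulate_sum(1.0, 2.0, -3.0, 1.5)

# Theory says the mean is mu_A + mu_B = -2.0 and the variance is
# sigma_A^2 + sigma_B^2 = 6.25; the sample values should land close to those.
print(mean_c, var_c)
```

Matching the mean and variance doesn't prove C is Gaussian, of course; it only checks the two parameters in the formula above.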
If instead you define random variable D to take on the value of the product A * B, then D will not, in general, be distributed normally. As an example, if A = B (so D is just A squared) and mu_A = mu_B = 0 and sigma_A = sigma_B = 1, then D is distributed according to a chi-square distribution with 1 degree of freedom.

The "trick" (if you want to call it that) comes from the loose wording people use when they say things like "the product of two Gaussians." In the first case, you are actually multiplying probability density functions. In the second case, you are multiplying the values of draws from probability distributions. It's kind of subtle. Unfortunately, both interpretations are reasonable and used in practice. The first one comes up most for me, because if you have two independent beliefs about the value of a variable, then the right thing to do to combine the evidence is to multiply the distributions (and renormalize). The second comes up in places like multiplicative models.
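To make the two readings of "the product of two Gaussians" concrete, here is a rough sketch of both. The closed form in product_of_densities is what falls out of completing the square: the precisions (1/sigma^2) add, and the mean is the precision-weighted average of the two means. The function names, grid, and parameter values are my own picks for illustration.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def product_of_densities(mu_a, sigma_a, mu_b, sigma_b):
    """Mean and standard deviation of the normalized product of two Gaussian densities."""
    prec_a, prec_b = 1.0 / sigma_a ** 2, 1.0 / sigma_b ** 2
    var = 1.0 / (prec_a + prec_b)               # precisions add
    mu = var * (prec_a * mu_a + prec_b * mu_b)  # precision-weighted mean
    return mu, math.sqrt(var)

# Reading 1: multiply the densities pointwise on a grid, normalize numerically,
# and compare against the closed-form Gaussian.
mu_a, sigma_a, mu_b, sigma_b = 0.0, 1.0, 2.0, 0.5
mu_p, sigma_p = product_of_densities(mu_a, sigma_a, mu_b, sigma_b)
step = 0.01
xs = [-10.0 + step * i for i in range(2001)]
raw = [normal_pdf(x, mu_a, sigma_a) * normal_pdf(x, mu_b, sigma_b) for x in xs]
total = sum(raw) * step  # Riemann-sum estimate of the normalizing constant
max_gap = max(abs(r / total - normal_pdf(x, mu_p, sigma_p)) for x, r in zip(xs, raw))

# Reading 2: multiply the draws. With A = B ~ N(0, 1), D = A * A is chi-square
# with 1 degree of freedom (theoretical mean 1, variance 2) and is never negative.
rng = random.Random(0)
d = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]
mean_d = sum(d) / len(d)
var_d = sum((v - mean_d) ** 2 for v in d) / len(d)
frac_negative = sum(v < 0 for v in d) / len(d)

print(mu_p, sigma_p, max_gap)
print(mean_d, var_d, frac_negative)
```

The first print should show a tiny max_gap, meaning the normalized pointwise product sits right on top of a Gaussian. The second shows a D that has no mass below zero, which no Gaussian can manage.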
