If you’ve been following my Twitter stream, you have probably
seen that I’m doing some reading and study on Bayesian statistics
lately. For a variety of reasons, I find the Bayesian model of
statistics quite compelling and am hoping to be able to use it in some
of my research.
Traditional statistics, encompassing well-known methods such as
t-tests and ANOVA, comes from the frequentist school of statistical
thought. The basic idea of frequentist statistics is that the world is
described by parameters that are fixed and unknown.
These parameters can be all manner of things — the rotation rate of the
earth, the average life span of a naked mole rat, or the average number
of kittens in a litter of cats. We rarely have access to the entire
population of interest (e.g. all mature female cats) to measure the
parameter directly, so we estimate it by taking random samples from the
population, computing some statistic over the sample, and using that
statistic as our estimate of the population parameter.
Since these parameters are unknown, we do not know their exact values.
Since they are fixed, however, we cannot discuss them in
probabilistic terms. Probabilistic reasoning only applies to random
variables, and parameters are not random — we just don’t know what their
values are. Probabilities, expected values, and the like are only
meaningful in the context of repeated random samples drawn from the
population.
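To make that concrete, here is a minimal simulation sketch in Python,
with made-up numbers: the "true" average litter size is fixed, and the
familiar 95% attached to a confidence interval describes how often the
repeated sampling procedure captures that fixed value, not the
parameter itself.

import numpy as np

# Toy illustration with made-up numbers: the "true" average litter size
# is a fixed quantity (even if unknown to the analyst).
rng = np.random.default_rng(0)
true_mean = 4.3

n_experiments = 10_000
covered = 0
for _ in range(n_experiments):
    # Each repetition draws a fresh random sample of 30 litters.
    sample = rng.poisson(true_mean, size=30)
    estimate = sample.mean()
    std_err = sample.std(ddof=1) / np.sqrt(len(sample))
    lower, upper = estimate - 1.96 * std_err, estimate + 1.96 * std_err
    covered += (lower <= true_mean <= upper)

# Roughly 95% of the intervals cover the fixed parameter; the probability
# describes the long-run behavior of the procedure, not the parameter.
print(covered / n_experiments)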
The Bayesian says, “Who cares?” Bayesian statistics applies
probabilistic methods and reasoning directly to the parameters. This
doesn’t necessarily mean that the Bayesian thinks the world is really
random, though. It turns out that we can use probabilities not only to
express the chance that something will occur, but also to express the
degree to which we believe something, and the mathematics still works.
So we can use the algebra of probabilities to quantify and
describe how much we believe various propositions, such as “the average
number of kittens per litter is 47”.
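As a sketch of what this looks like (with made-up numbers, and a
conjugate Gamma prior on a Poisson mean chosen purely for convenience),
we can place a prior distribution on the average litter size, update it
with observed data, and then read off direct probability statements
about the parameter itself:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated litter counts standing in for real observations.
observed = rng.poisson(4.3, size=30)

# Gamma prior on the Poisson mean (shape, rate), chosen loosely here to
# express vague prior belief about the average litter size.
a_prior, b_prior = 2.0, 0.5

# Conjugate update: the posterior is Gamma(a + sum(x), b + n).
a_post = a_prior + observed.sum()
b_post = b_prior + len(observed)
posterior = stats.gamma(a=a_post, scale=1.0 / b_post)

# The posterior lets us make probability statements about the parameter
# itself, e.g. our belief that the average litter size exceeds 5 kittens.
print(posterior.mean())         # posterior mean of the average litter size
print(1 - posterior.cdf(5.0))   # P(average litter size > 5 | data)

Under this posterior, a proposition like "the average number of kittens
per litter is 47" gets essentially no weight, which is exactly the kind
of quantified belief the Bayesian is after.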
One of the fundamental differences, therefore, is that the
frequentist can only apply probabilities to the act of repeating an
experiment. The Bayesian can apply probabilities directly to their
knowledge of the world. There are other important differences as well:
frequentist statistics is primarily concerned with testing and
falsifying hypotheses, while Bayesian statistics focuses more on
determining which of several competing models or hypotheses is most
likely to be true. Those differences play less of a role in what I find
compelling about Bayesian statistics, though, and it is also possible to
apply falsificationist principles in a Bayesian framework.