Sunday, June 10, 2012

Basic desiderata (Jaynes)

From E.T. Jaynes with G.L. Bretthorst (2003) Probability theory: the logic of science. Cambridge University Press, Cambridge.

Consider that we build a robot that thinks like us, except that it cannot make qualitative judgements. It can use only Aristotelian logic. What sort of fundamental desirable properties would its thinking have?

Desideratum I.   Degrees of plausibility are represented by real numbers.

Desideratum II.  Qualitative correspondence with common sense.

Desideratum III. Consistency:
  • IIIa. If a conclusion can be reasoned out in more than one way, then every possible way must lead to the same result.
  • IIIb. The robot always takes into account all of the evidence it has relevant to a question. It does not arbitrarily ignore some of the information, basing its conclusions only on what remains. In other words, it is not ideological.
  • IIIc. The robot always represents equivalent states of knowledge by equivalent plausibility assignments. That is, if in two problems the robot's state of knowledge is the same (except perhaps for the labeling of propositions), then it must assign the same plausibilities in both.

I (HS) will note that IIIb makes this robot a Bayesian, just like the rest of us.

Sunday, June 3, 2012

Will Bayesian statistics become too easy?

A Bayesian approach to statistical inference has become increasingly popular since the advent of increased desktop computing power and the development of tailored software. This is a really, really good thing. However, I am concerned that it may, in the not very distant future, become too easy, and too much like frequentist methods as they are currently learned and used by life science undergraduate and graduate students. I am concerned that, in order to make Bayesian methods more accessible, they will be dumbed down (made too easy) and their value lost.

Part of the benefit of a Bayesian approach is that it more accurately reflects how science is done. In a nutshell, the Bayesian approach consists of
  1. Prior beliefs: ideas, knowledge, and explicit assumptions about our system.
  2. Collection of new data.
  3. Using the new data to update our beliefs.
The result of a Bayesian analysis is not a simple yes-no, significant-not significant kind of answer, but rather a probability distribution that reflects our most informed guesses about our variable of interest.
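
The three steps can be sketched in a few lines of Python. This is only a minimal illustration, assuming the simplest textbook case: a Beta prior on a proportion, updated with binomial counts. The prior parameters and the counts are made up for the example, and the function names are my own.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial counts
    gives a Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Step 1: prior belief about a proportion (hypothetical: centered on 0.5).
prior = (2.0, 2.0)

# Step 2: new data (hypothetical counts).
successes, failures = 7, 3

# Step 3: update the belief with the data.
posterior = update_beta(*prior, successes, failures)

print("prior mean:", beta_mean(*prior))
print("posterior parameters:", posterior)
print("posterior mean:", beta_mean(*posterior))
```

The output of step 3 is the pair of parameters describing a whole distribution, not a single yes-no answer.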

I believe that there are two potential pitfalls in the oversimplification of a Bayesian analysis. I believe that the less serious of these pitfalls concerns the results, the posterior distribution of each model parameter. In practice, each of these distributions is really a massive collection of sampled guesses at the parameters of interest, given all of our assumptions and the newly collected data. Thus the result is not "an answer" but rather thousands of answers, with some answers more likely than others. In our efforts to satisfy ourselves, editors, and readers, we may try too hard to simplify our results.
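
To make "thousands of answers" concrete, here is a small Python sketch. It assumes, purely for illustration, that the posterior for a proportion happens to be Beta(9, 5), and it draws from that distribution directly with the standard library. Real software would typically produce such draws by MCMC, and those draws are usually autocorrelated. The helper name `interval` is my own.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical posterior: 10,000 draws from Beta(9, 5).
draws = [random.betavariate(9, 5) for _ in range(10_000)]


def interval(samples, mass=0.95):
    """Central interval holding roughly `mass` of the samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lo = ordered[int(tail * len(ordered))]
    hi = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lo, hi


mean = sum(draws) / len(draws)
lo, hi = interval(draws)

print("number of 'answers':", len(draws))
print("posterior mean:", round(mean, 3))
print("95% interval:", round(lo, 3), "to", round(hi, 3))
```

Reporting only the mean discards the spread. Reporting the interval, or better yet plotting all of the draws, keeps more of what the analysis actually produced.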

Although we may try too hard to simplify our results, I think there is a greater danger that we will try to simplify the prior knowledge and the assumptions that we start with. In my limited experience, ecologists and statisticians are very quick to fall back on the "uninformative prior," as if it were somehow "unbiased." Statisticians recognize that every prior comes with a point of view, so there is no such thing as an objective, uninformative prior; such priors are sometimes more accurately called reference priors. However, I see us taking the lazy route too often and using a supposedly unbiased reference prior, which reduces our tendency to take seriously the literature we read. Lots of data will overwhelm a weak prior. However, it is my experience that priors derive their weakness from our tendency not to take seriously the quantitative nature of our literature.
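
How much the prior matters depends on how much data there is. The Python sketch below is a rough illustration, again assuming a Beta prior on a proportion with binomial data. It compares a flat Beta(1, 1) prior with a hypothetical literature-based Beta(20, 80) prior (centered on 0.2) as the sample size grows. All of the numbers are invented for the example.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


flat = (1, 1)        # the usual "uninformative" choice
informed = (20, 80)  # hypothetical prior built from earlier studies

gaps = []
for trials in (10, 100, 10_000):
    successes = trials // 2  # hypothetical data: half of the trials succeed
    m_flat = posterior_mean(*flat, successes, trials)
    m_informed = posterior_mean(*informed, successes, trials)
    gaps.append(abs(m_flat - m_informed))
    print(trials, round(m_flat, 3), round(m_informed, 3))
```

With 10 observations the two priors give quite different posterior means; with 10,000 they nearly agree. For the small samples common in field studies, the choice of prior is therefore not a technicality.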

As evidence that Bayesian analyses can be made easy, I can point to the numerous specialized programs for population genetics and phylogenetics that are based upon Bayesian approaches. I have seen many students use these with very little notion of what they are doing.

As learning in general is essentially a Bayesian process, my fears are not too serious. Nonetheless, ecologists need to take their priors seriously. Statisticians can help by encouraging us to make our beliefs both informed and explicit. In the end, it will only strengthen our science.