Aki’s post on the tenth anniversary of the loo package reminded me that the first edition of Bayesian Data Analysis came out 30 years ago!
The starting point was a course on Bayesian statistics that John Carlin taught, and in which I was a student, back in 1987. In 1991 I started writing the book, using John’s lecture notes as the basis of chapters 1-4, then putting together the all-important chapter 5 on hierarchical models based on three examples, one from a paper by Dempster, one from a paper by Rubin (the famous 8 schools), and one from John. The biggest efforts went into chapters 6 and 7 (which roughly correspond to chapters 6, 7, and 8 in the third edition), where I did my best to make sense of everything that Rubin had written and told me about model checking and Bayesian design-based inference. These chapters included a lot of new things too–new to me, at least!–including Bayesian analysis of surveys and experiments, connections between truncation and censoring models (see section 2 of my 2004 paper on parameterization and Bayesian modeling), and some other things. It was really satisfying to derive all these things from scratch. My only regret in chapter 6 is not more clearly describing the different ways that a classical p-value can be generalized (see here), and my big regret in the book as a whole is its overemphasis on noninformative priors and a lack of respect for the importance of prior information. The second and third editions of the book have this problem too. We’re trying to do better in our forthcoming Bayesian Workflow book.
It was also fun to write all the things in chapter 4 of BDA, including connections between Bayesian and non-Bayesian methods. I think our work here is solid, and it goes beyond the simple idea that Bayesian inferences should have good frequency properties. And I was frustrated at the usual hand-waving treatment of Bayesian asymptotics so I went to the trouble of deriving the limiting normality of the posterior distribution–along with examples where that result does not hold–which went into appendix B. Later I learned that this proof was all over the place (for example, in the book by DeGroot), but it was good for me to have had to figure it all out for myself. Even though most students probably don’t read that appendix, I think the understanding that comes from it suffuses the rest of the book. I’m also proud of chapter 1, where we introduced probability from an empirical perspective. In later editions we added more examples and I think that chapter is even better.
OK, I won’t go through all the chapters here. It ended up taking John and me about three years to finish the book, and at the end we brought in Hal Stern as a closer. I finished all of it including the index at the beginning of February, 1995, and it came out that summer–it made a big splash at the Joint Statistical Meetings in August. I don’t recall the official publication date, but it was just about 30 years ago.
The book’s title worked out fine. I wasn’t thrilled with the word “Bayesian” in the title, partly because I found many Bayesians to be annoying–I still have unpleasant memories of the 1991 Valencia conference–and partly because it seemed like a brand name, not really descriptive. We were playing around with alternative titles such as “Statistics using conditional probability,” but that seemed too obscure. My most useful big idea regarding the title was calling it Bayesian Data Analysis rather than Bayesian Inference or Bayesian Statistics. “Inference” seemed too specific, because–as we explain right on the first page of the book and demonstrate throughout–data analysis is about modeling and model checking, not just inference. And “Statistics” seemed too general, because statistics includes not just data analysis but also design, measurement, and decision making, and our book had very little on those other topics. So Bayesian Data Analysis seemed just right. And still is, thirty years on.
BDA is great! Its ideas are a big part of how I approach data analysis in general—even when I’m using “frequentist” models.
I’m very excited about the new Bayesian Workflow book (is there a place to order it yet?). If I may make a small suggestion: maybe use a font that is not “Computer Modern” (aka “Latex default font”). This font is okay, but (to me at least) it looks too thin on digital screens and it’s used way too much nowadays.
Frequent:
I have no idea what font the book is in! This is the sort of thing that Aki would know, but I think he’s on vacation now. I’ll post more on the workflow book when it’s closer to being available.
The font is Computer Modern, which is the default for LaTeX. Your regression book with Jennifer is in the same font and makes the even bigger mistake of wide text with a small font and small interline spacing, which makes it particularly challenging to read.
It’s as horrible a font as the default page design is a horrible design (e.g., garishly large section headers, too much vertical white space, lack of alignment of section headers, and many other flaws you wouldn’t make if you learned basic page design). The only thing that’s right is the page width, and people almost always change that to disastrous effect, as Andrew and Jennifer did in their regression book, a sin I believe Regression and Other Stories just followed: you get a lot of text on the page, but it’s very hard to read with the long lines and tight interline spacing.
Anyone who knows anything about typography can give you ten reasons why Computer Modern is a disaster of a font. The main problem is that it has terrible color (in the typography sense of how dark it is on the page) and the risers are way too thin compared to the serifs. It looks like a student’s first attempt at font design (which it presumably was).
You see a lot more Times Roman now in papers because it’s a very narrow font (in the “em” sense: the width of the letter “m”), so you can pack in a lot of text. Same reason they use it in newspapers. It’s the standard for things like NeurIPS these days because someone finally came to their senses. The problem with Times is that it’s hard to get a good matching computer font, so that usually defaults to Computer Modern, which doesn’t match Times in basic properties like color, em-width, and x-height (height of letter “x”). Then you have fixed width (code) fonts, for which there also isn’t a good match for Times. The Courier you get by default with Computer Modern is a complete disaster of mismatched fonts (they mismatch on all the properties that matter for making text look smooth): its color (how much black ink per character) is a complete mismatch, as is its x-height, so there’s no way to scale it to match either way.
On the plus side, Knuth did a great job implementing standard typographical conventions into TeX like spacing and page breaks. He just didn’t learn the next level of design principles you’d need to design a sensible document format or font.
If you want the best font available and are willing to pay ($99 still, I believe), I’d highly recommend the Lucida family. It’s what I used for the original Stan documentation when I was pretty much the only person writing it. It’s hard to share as TeX without breaking the law because it’s proprietary (but it’s fine to share the documents with the font embedded). The best I can do for free is to use the packages mathpazo (Palatino body and math font) and sourcecodepro (decently matching fixed width “code” font). The best you can say about Computer Modern is that papers written in that font look like every other CS paper from the last 40 years.
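For anyone who wants to try the free combination, here is a minimal preamble sketch. It assumes pdfLaTeX and a standard TeX distribution with both packages installed; the `\linespread` value is my own assumption (Palatino is commonly set with slightly looser leading), and none of this is the actual setup of any of the books discussed here.

```latex
% Minimal sketch, assuming pdfLaTeX; not the setup of any book discussed here.
\documentclass{article}
\usepackage[T1]{fontenc}    % 8-bit font encoding, so accented glyphs are real characters
\usepackage{mathpazo}       % Palatino body text with a matching math font
\usepackage{sourcecodepro}  % Source Code Pro, intended for \texttt and verbatim (check the package docs for options)
\linespread{1.05}           % assumption: slightly looser line spacing for Palatino
\begin{document}
Body text, inline math $e^{i\pi} + 1 = 0$, and code such as \texttt{fit(model)}.
\end{document}
```

Compiling a page like this next to the same page with the default fonts is a quick way to compare color, x-height, and how well the code font sits in the body text.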
Bob:
I don’t think the page design of our books is horrible. It’s a textbook, not a novel, and there’s value in having a lot on one page. This can make it easier to follow the math. To put it another way, you say, “it’s very hard to read with the long lines and tight interline spacing,” but I think that the vast majority of users of our book don’t “read” it so much as “use” it.
Anyway, that’s just my take. I get that your take is different. I kinda like it that different books are formatted in different styles.
When it comes to typefaces, I think tastes differ, and I’m skeptical of the universalist tack you seem to be taking in your comment here. I just read a novel the other day–I liked it a lot, but its typeface and layout made it hard to read compared to other novels I’ve been reading lately. That said, the publisher of this book must have had some reason to prefer what they did. The natural conclusion for me is that reading experiences and tastes differ, and what works for one reader won’t always work for others.
Bob: by and large I agree. I wouldn’t say Computer Modern is horrible (this term I reserve for most uses of Comic Sans and other silly fonts), but I, too, prefer the Lucida family. I find it easier on the eyes and it has enough styles to accommodate versatile “main” text along with math, code, etc.
Andrew: Tastes do differ, but as with writing, there are widespread conventions on what constitutes better and worse typography (which is not just about fonts). It’s true that there are no universal solutions, but it’s also true that defaults can be bad. Computer Modern is popular because it is a free default, but it has noticeable flaws that other fonts don’t. I vote for choosing something that you have reasons to like, instead of settling for something that you don’t have reasons to dislike.
Nasser M. Abbasi has a comparison of several body and math fonts available for use with LaTeX here: https://12000.org/my_notes/faq/LATEX/math_fonts/index.pdf
I recommend viewing it as a PDF; my web browser renders the PDF preview images too large.
I wouldn’t assume that a publisher’s book design choices were made for any reasons other than expedience and economy.
My copy of BDA3 was, I think, printed on-demand even though I ordered it from the publisher. The black spine is grey relative to my other CRC titles, the pages are perfect bound (cut and glued at the edge) instead of sewn in signatures, and the paper is so thin the text on the other side of the page is legible. I considered returning it.