OK, I finished reading it and transcribing my thoughts. They’re the equivalent of about 20 blog entries (or one long unpublishable article) but it seemed more convenient to just put them in one place.
As I noted earlier, reading the book with pen in hand jogged loose various thoughts. . . . The book is about unexpected events (“black swans”) and the problems with statistical models such as the normal distribution that don’t allow for these rarities. From a statistical point of view, let me say that multilevel models (often built from Gaussian components) can model various black swan behavior. In particular, self-similar models can be constructed by combining scaled pieces (such as wavelets or image components) and then assigning a probability distribution over the scalings, sort of like what is done in classical spectrum analysis of 1/f noise in time series. For some interesting discussion in the context of “texture models” for images, see the chapter by Yingnian Wu in my book with Xiao-Li on applied Bayesian modeling and causal inference. (Actually, I recommend this book more generally; it has lots of great chapters in it.)
That said, I admit that my two books on statistical methods are almost entirely devoted to modeling “white swans.” My only defense here is that Bayesian methods allow us to fully explore the implications of a model, the better to improve it when we find discrepancies with data. Just as a chicken is an egg’s way of making another egg, Bayesian inference is just a theory’s way of uncovering problems that can lead to a better theory. I firmly believe that what makes Bayesian inference really work is a willingness (if not eagerness) to check fit with data and abandon and improve models often.
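To make the earlier point about Gaussian building blocks concrete, here is a minimal sketch (not the self-similar or texture models mentioned above, just the simplest two-level case). It assumes a hypothetical mixture in which 95% of draws come from a Normal(0, 1) "calm" regime and 5% from a Normal(0, 10^2) "wild" regime; the weights and scales are made up for illustration. It compares the chance of a "5-sigma" event under the mixture with the chance under a single Gaussian that has the same overall standard deviation.

```python
import math


def normal_tail(x, sd=1.0):
    """P(|X| > x) for X ~ Normal(0, sd^2)."""
    return math.erfc(x / (sd * math.sqrt(2)))


def mixture_tail(x, weights, sds):
    """P(|X| > x) when X ~ Normal(0, sds[k]^2) with probability weights[k]."""
    return sum(w * normal_tail(x, sd) for w, sd in zip(weights, sds))


# Hypothetical two-level model (illustrative numbers, not from the book).
weights = [0.95, 0.05]
sds = [1.0, 10.0]

# Overall standard deviation of the mixture, so the comparison is like for like.
mix_sd = math.sqrt(sum(w * sd ** 2 for w, sd in zip(weights, sds)))

threshold = 5 * mix_sd  # a "5-sigma" event
p_single = normal_tail(threshold, mix_sd)        # single Gaussian, same sd
p_mixture = mixture_tail(threshold, weights, sds)  # mixture of Gaussians

print(f"single Gaussian: {p_single:.2e}")
print(f"mixture:         {p_mixture:.2e}")
```

Under these assumed numbers the mixture makes the extreme event many orders of magnitude more likely than the matched single Gaussian, even though every component is itself normal.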
More on black and white
My own career is white-swan-like in that I’ve put out lots of little papers, rather than pausing for a few years like that Fermat’s last theorem guy. Years ago I remarked to my friend Seth that he’s followed the opposite pattern: by abandoning the research-grant, paper-writing treadmill and devoting himself to self-experimentation, he basically was rolling the dice and going for the big score–in Taleb’s terminology, going for that black swan.
On the other hand, you could say that in my career I’m following Taleb’s investment advice–my faculty job gives me a “floor” so that I can work on whatever I want, which sometimes seems like something little but maybe can have unlimited potential. (On page 297, Taleb talks about standing above the rat race and the pecking order; I’ve tried to do so in my own work by avoiding a treadmill of needing associates to do the research to get the funding, and needing funding to pay people.)
In any case, I’ve had a boring sort of white-swan life, growing up in the suburbs, being in school continuously since I was 4 years old (and still in school now!). In contrast, Taleb seems to have been exposed to lots of black swans, both positive and negative, in his personal life.
Chapter 2 of The Black Swan has a (fictional) description of a novelist who labors in obscurity and then has an unexpected success. This somehow reminds me of how lucky I feel that I went to college when and where I did. I started college during an economic recession, and in general all of us at MIT just had the goal of getting a good job. Not striking it rich, just getting a solid job. Nobody I knew had any thought that it might be possible to get rich. It was before stock options, and nobody knew that there was this thing called “Wall Street.” Which was fine. I worry that if I had gone to college ten years later, I would’ve felt a certain pressure to go get rich. Maybe that would’ve been fine, but I’m happy that it wasn’t really an option.
95% confidence intervals can be irrelevant, or, living in the present
On page xviii, Taleb discusses problems with social scientists’ summaries of uncertainty. This reminds me of something I sometimes tell political scientists about why I don’t trust 95% intervals: A 95% interval is wrong 1 time out of 20. If you’re studying U.S. presidential elections, it takes 80 years to have 20 elections. Enough changes in 80 years that I wouldn’t expect any particular model to fit for such a long period anyway. (Mosteller and Wallace made a similar point in their Federalist Papers book about how they don’t trust p-values less than 0.01 since there can always be unmodeled events. Saying p<0.01 is fine, but please please don't say p<0.00001 or whatever.)
More generally, people (or, at least, political commentators) often live so much in the present that they forget that things can change. An instructive example here is Richard Rovere's book on Goldwater's 1964 campaign. Rovere, a respected political writer, wrote that the U.S. had a one-and-a-half-party system, with the Democrats being the full party and the Republicans the half party. Yes, Goldwater lost big and, yes, the Democrats did have twice the number of Senators and twice the number of Representatives in Congress then--but, actually, from 1950 through 1990, the Republicans won or tied every Presidential election (except 1964). Hardly the performance of a half-party.
Knowing what you don’t know, and omniscience is not omnipotence
The quotes on page xix remind me of one of my favorites: “It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so” (Mark Twain?). I actually prefer the version that says, “It’s what you don’t know you don’t know that gets you into trouble.” Also Earl Weaver’s “It’s what you learn after you know it all that counts.”
On page xx, Taleb writes, “What you know cannot really hurt you.” This doesn’t sound right to me. Sometimes you know something bad is coming but you can’t dodge it. For example, consider certain diseases.
Creativity is not (yet) algorithmic
On page xxi, Taleb says how almost no great discovery came from design and planning. This reminds me about a biography of Mark Twain that I read several years ago. Apparently, Twain was always trying to create a procedure–essentially, an algorithm–to produce literature. He tried various strategies, collaborators, etc., but nothing really worked. He just had to wait for inspiration and write what came to mind.
Also on page xxi, Taleb writes “we don’t learn rules, just facts, and only facts.” This statement would surprise linguists. It’s been well demonstrated that kids learn language through rules (as can be seen, for example, from overgeneralizations such as “feets” and “teached”). More generally, folk science is strongly based on categories and natural kinds–I think Taleb is aware of this since he cites my sister’s work in his references. (A recent example of naive categorization in folk science is in the papers of Satoshi Kanazawa.)
Recognition, prevention, and saltatory growth
On page xxiii, Taleb writes that “recognition can be quite a pump.” Yes, but recall all those scientists whose lives were shortened by two years (on average) from frustration at not receiving the Nobel Prize!
On page xxiv, “few reward acts of prevention”: I’m reminded of our health plan in grad school, which paid for catastrophic coverage but not routine dental work. A friend of mine actually had to get a root canal, and eventually got the plan to pay for it, but not without a struggle.
On page 10, Taleb writes, “history does not crawl, it jumps.” This reminds me of the evidence on saltatory growth in infants (basically, babies grow length by a jump every few days; they don’t grow the same amount every day).
I was also reminded of the fractal nature of scientific revolutions–basically, at all scales (minutes, hours, days, months, years, decades, centuries, . . .), science seems to proceed by being derailed by unexpected “aha” moments. (Or, to pick up on Taleb’s themes, I can anticipate that “aha” moments will occur, I just can’t predict exactly when they will happen or what they will be.)
Liberals and conservatives
On page 16, Taleb asks “why those who favor allowing the elimination of a fetus in the mother’s womb also oppose capital punishment” and “why those who accept abortion are supposed to be favorable to high taxation but against a strong military,” etc. First off, let me chide Taleb for deterministic thinking. From the General Social Survey cumulative file, here’s the crosstab of the responses to “Abortion if woman wants for any reason” and “Favor or oppose death penalty for murder”:
40% supported abortion for any reason. Of these, 76% supported the death penalty.
60% did not support abortion under all conditions. Of these, 74% supported the death penalty.
This was the cumulative file, and I’m sure things have changed in recent years, and maybe I even made some mistake in the tabulation, but, in any case, the relation between views on these two issues is far from deterministic!
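As a rough check on how weak that relation is, the four percentages above can be turned into a 2x2 table of joint proportions and a phi coefficient (the correlation between two yes/no variables). This sketch treats the rounded percentages as exact proportions, which is an assumption: the underlying counts are not given here, so nothing can be said about sampling error.

```python
import math

# Rounded proportions quoted above, treated as exact (an assumption).
p_abortion_yes = 0.40   # supported abortion for any reason
p_dp_given_yes = 0.76   # of those, supported the death penalty
p_dp_given_no = 0.74    # of the other 60%, supported the death penalty

p_abortion_no = 1 - p_abortion_yes

# Joint proportions for the 2x2 table.
yes_favor = p_abortion_yes * p_dp_given_yes
yes_oppose = p_abortion_yes * (1 - p_dp_given_yes)
no_favor = p_abortion_no * p_dp_given_no
no_oppose = p_abortion_no * (1 - p_dp_given_no)

# Overall support for the death penalty.
p_dp = yes_favor + no_favor

# Phi coefficient: correlation between the two binary responses.
phi = (yes_favor * no_oppose - yes_oppose * no_favor) / math.sqrt(
    p_abortion_yes * p_abortion_no * p_dp * (1 - p_dp)
)

print(f"overall death-penalty support: {p_dp:.3f}")
print(f"phi coefficient: {phi:.3f}")
```

A phi near zero means that knowing someone's answer on one question tells you almost nothing about their answer on the other, at least in these aggregate numbers.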
But getting back to the main question: I don’t think it’s such a mystery that various leftist views (allowing abortion, opposing capital punishment, supporting a graduated income tax, and reducing the military) are supposed to go together–nor is it a surprise that the opposite positions go together in a rightist worldview. Abortion is related to women’s rights, which has been a leftist position for a long time. Similarly, conservatives have favored harsher punishments and liberals (to use the U.S. term) have favored milder punishments for a long time also. The graduated income tax favors the have-nots rather than the have-mores, and the military is generally a conservative institution. Other combinations of views are out there, but I don’t agree with Taleb’s claim that the left-right distinction is arbitrary.
Picking pennies in front of a steamroller
On page 19, Taleb refers to the usual investment strategy (which I suppose I actually use myself) as “picking pennies in front of a steamroller.” That’s a cute phrase; did he come up with it? I’m also reminded of the famous Martingale betting system. Several years ago in a university library I came across a charming book by Maxim (of gun fame) where he went through chapter after chapter demolishing the Martingale system. (For those who don’t know, the Martingale system is to bet $1, then if you lose, bet $2, then if you lose, bet $4, etc. You’re then guaranteed to win exactly $1–or lose your entire fortune. A sort of lottery in reverse, but an eternally popular “system.”)
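The "win exactly $1 or lose your entire fortune" structure can be worked out exactly. The sketch below assumes even-money bets and a bankroll that covers a fixed number of doublings (both are simplifications), and the function name is made up for illustration. The roulette figure of 18 winning pockets out of 38 is the usual American-wheel number for a red/black bet, included only as a comparison case.

```python
from fractions import Fraction


def martingale_round(max_doublings, p_win=Fraction(1, 2)):
    """Exact outcome of one Martingale round with even-money payouts.

    The gambler bets 1, 2, 4, ... and stops at the first win (net +1)
    or after `max_doublings` straight losses, at which point the whole
    bankroll of 2**max_doublings - 1 is gone.

    Returns (P(win 1), P(ruin), amount lost on ruin, expected profit).
    """
    p_ruin = (1 - p_win) ** max_doublings
    p_success = 1 - p_ruin
    loss = 2 ** max_doublings - 1
    expected = p_success * 1 - p_ruin * loss
    return p_success, p_ruin, loss, expected


# Fair coin, bankroll covering 10 doublings.
p_success, p_ruin, loss, expected_fair = martingale_round(10)
print(f"fair coin: win $1 w.p. {float(p_success):.4f}, "
      f"lose ${loss} w.p. {float(p_ruin):.4f}, expected {expected_fair}")

# Same bankroll on an American roulette red/black bet (18 of 38 pockets win).
_, _, _, expected_roulette = martingale_round(10, Fraction(18, 38))
print(f"roulette: expected profit per round {float(expected_roulette):.3f}")
```

With a fair coin the small, likely gain and the large, unlikely loss cancel exactly, so the system changes the shape of the bet without improving its expectation; with any house edge the expectation is negative.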
Throughout, Taleb talks about forecasters who aren’t so good at forecasting, picking pennies in front of steamrollers, etc. I imagine much of this can be explained by incentives. For example, those Long-Term Capital guys made tons of money, then when their system failed, I assume they didn’t actually go broke. They have an incentive to ignore those black swans, since others will pick up the tab when they fail (sort of like FEMA pays for those beachfront houses in Florida). It reminds me of the saying that I heard once (referring to Donald Trump, I believe) that what matters is not your net worth (assets minus liabilities), but the absolute value of your net worth. Being in debt for $10 million and thus being “too big to fail” is (almost) equivalent to having $10 million in the bank.
The discussion on page 112 of how Ralph Nader saved lives (mostly via seat belts in cars) reminds me of his car-bumper campaign in the 1970s. My dad subscribed to Consumer Reports then (he still does, actually, and I think reads it for pleasure–it must be one of those Depression-mentality things), and at one point they were pushing heavily for the 5-mph bumpers. Apparently there was some federal regulation about how strong car bumpers had to be, to withstand a crash of 2.5 miles per hour, or 5 miles per hour, or whatever–the standard had been 2.5 (I think), then got raised to 5, then lowered back to 2.5, and Consumer’s Union calculated (reasonably correctly, no doubt) that the 5 mph standard would, in the net, save drivers money. I naively assumed that CU was right on this. But, looking at it now, I would strongly oppose the 5 mph standard. In fact, I’d support a law forbidding such sturdy bumpers. Why? Because, as a pedestrian and cyclist, I don’t want drivers to have that sense of security. I’d rather they be scared of fender-benders and, as a consequence, stay away from me! Anyway, the point here is not to debate auto safety; it’s just an interesting example of how my own views have changed. Another example of incentives.
Three levels of conversation, or, why lunch at the faculty club might (sometimes) be more interesting than hanging out with chair-throwing traders
On page 21, Taleb compares the excitement of chair-throwing stock traders to “lunches in a drab university cafeteria with gentle-minded professors discussing the latest departmental intrigue.” This reminds me of a distinction I came up with once when talking with Dave Krantz, the idea of three levels of conversation. Level 1 is personal: spouse, kids, favorite foods, friends, gossip, etc. Level 2 is “departmental intrigue,” who’s doing what job, getting person X to do thing Y, how to get money for Z–basically, level 2 is all about money. Level 3 is impersonal things: politics, sports, research, deep thoughts, etc. When talking with Dave, I resolved to minimize level 2 conversation and focus on the far more important (and interesting) levels 1 and 3. Level 2 topics have an immediacy which puts them on the top of the conversational stack, which is why I made the special effort to put them aside. Anyway, it struck me in reading page 21 of Taleb’s book that chair-throwing stock traders have much more interesting level 2 conversations (compared with professors or even grad students), and quite possibly they have better level 1 conversations also–but I’d hope that the level 3 conversations at the university are more interesting. Being on campus, I’m used to having all sorts of good level 3 conversations, but I find these harder to come by in other settings. Probably it’s nothing to do with the depth of these other people, just that I find it easier to get into a good conversational groove with people at the university. In any case, I try (not always successfully) to keep conversations away from “the latest departmental intrigue.”
Riding the escalator to the stairmaster
The story on page 54 about the people who ride the escalator to the Stairmasters reminds me that, where I used to work, there was a guy who carried his bike up the stairs to the 4th floor. This always irritated me because it set an unfollowable example. For instance, one day I was on the elevator (taking my bike to the 3rd floor) and some guy asked me, “You ride your bike for the exercise. Why don’t you take the stairs?” (I replied that I don’t ride my bike for the exercise.)
Confirmation bias, or, shouldn’t I be reading an astrology book?
Around pages 58-59, Taleb talks about confirmation bias and recommends that we look for counterexamples to our theories. I certainly agree with this and do it all the time in my research. But what about other aspects of life? For example, I was reading The Black Swan, which I knew ahead of time would contain lots of information that I already agreed with. Should I instead read a book on astrology? In practice, I’m sure this would just confirm my (true) suspicion that astrology is false, so I’m kinda stuck.
Rare events and selection bias
The footnote on 61 reminded me of a talk I saw a couple years ago where it was said that NYC is expected to have a devastating earthquake some time in the next 2000 years.
On page 77, Taleb says that lottery players treat odds of one in a thousand and one in a million almost the same way. But . . . when they try making lottery odds lower (for example, changing from “pick 6 out of 42” to “pick 6 out of 48”), people do respond by playing less (unless the payoffs are appropriately increased). I attribute this not to savvy probability reasoning but to a human desire not to be ripped off.
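For a sense of the size of that change, here is the arithmetic, assuming the jackpot requires matching all six numbers and the order of the numbers does not matter (the usual rule for such games, but an assumption here).

```python
import math

# Number of distinct tickets when choosing 6 numbers, order ignored.
combos_42 = math.comb(42, 6)
combos_48 = math.comb(48, 6)

# How many times less likely a single ticket is to hit the jackpot.
ratio = combos_48 / combos_42

print(f"pick 6 of 42: 1 in {combos_42:,}")
print(f"pick 6 of 48: 1 in {combos_48:,}")
print(f"jackpot odds are about {ratio:.2f} times worse")
```

So adding six numbers to the pool makes the jackpot a bit more than twice as hard to hit, which is a change players could plausibly notice through less frequent winners even without doing the calculation.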
On page 102 and following, Taleb discusses selection bias. I also recommend the article by Howard Wainer et al. (A Selection of Selection Anomalies); Deb Nolan and I also have a few in our Teaching Statistics book.
Then, on page 126, Taleb describes a conference he attended where his “first surprise was to discover that the military people there thought, behaved, and acted like philosophers [in the good sense of the word] . . . They thought out of the box, like traders, except much better and without fear of introspection.” He goes on to discuss why military officers are such good skeptical thinkers. But this seems like a clear case of selection bias! The military officers who come to an academic symposium are probably an unusual bunch.
On pages 118-119, there’s a discussion of how someone with a winning streak in life can think it’s skill, even if it’s just luck and selection (that the losers don’t get observed). I’d like to add another explanation, which is that people lie. Someone who tells you he won ten straight times probably actually won ten times out of fifteen. Someone who tells you he broke even probably is a big loser. Etc.
On page 125, Taleb explains why the Fat Tonys get more Nobel Prizes in medicine than the Dr. Johns. I don’t know if this is really true, but if it is, I might attribute it to the Tonys’ better social skills (i.e., helping others be happy and getting people to do what they want) more than their better ability to assess uncertainty.
Of fights and coin flips
On pages 127-128, Taleb discusses the distinction between uncertainty and randomness (in my terminology, the boxer, the wrestler, and the coin flip). I’d only point out that coins and dice, while maybe not realistic representations of many sources of real-world uncertainty, do provide useful calibration. Similarly, actual objects rarely resemble “the meter” (that famous metal bar that sits, or used to sit, in Paris), but it’s helpful to have an agreed-upon length scale. We have some examples in Chapter 1 of Bayesian Data Analysis of assigning probabilities empirically (for football scores and record linkage).
Also, as discussed in our Teaching Statistics book, when teaching probability I prefer to use actual random events (e.g., sex of births) rather than artificial examples such as craps, roulette, etc., which are full of technical details (e.g., what’s the probability of spinning a “00”) that are dead-ends with no connection to any other areas of inquiry. In contrast, thinking about sex of births leads to lots of interesting probabilistic, biological, combinatorial, and evolutionary directions.
Overconfidence as the side effect of communication goals
On page 14, Taleb discusses overconfidence (as in the pathbreaking Alpert and Raiffa study). As we teach in decision theory, there’s actually an easy way to make sure that your 95% intervals are calibrated. Just apply the following rule: Every time someone asks you to make a decision, spin a spinner that has a 95% chance of returning the interval (-infinity, infinity), and a 5% chance of returning the empty set. You will be perfectly calibrated (on average). The intervals are useless, however, which points toward the fact that when people ask you for an interval, you’re inclined (for Gricean reasons if no other) to provide some information. According to Dave Krantz, much of overconfidence of probability statements can be explained by this tension between the goals of informativeness and calibration.
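The spinner rule is easy to simulate. In this sketch the "true values" are arbitrary standard-normal draws standing in for whatever quantity is being asked about (an assumption; any distribution gives the same coverage), and the function names are made up for illustration.

```python
import math
import random


def spinner_interval(rng):
    """Return (-inf, inf) with probability 0.95, else None for the empty set."""
    if rng.random() < 0.95:
        return (-math.inf, math.inf)
    return None


def coverage(n_questions, seed=0):
    """Fraction of questions whose true value falls inside the spinner's interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_questions):
        truth = rng.gauss(0, 1)  # stand-in for the unknown quantity
        interval = spinner_interval(rng)
        # The empty set never covers; the whole real line always does.
        if interval is not None and interval[0] < truth < interval[1]:
            hits += 1
    return hits / n_questions


print(f"empirical coverage: {coverage(100_000):.3f}")
```

The coverage comes out close to 95% no matter what is being estimated, and yet no single interval says anything about the quantity in question, which is the tension between calibration and informativeness described above.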
On page 145, Taleb discusses the fallacy of assuming that “more is better.” A lot depends here on the statistical model you’re using (or implicitly using). With least squares, overfitting is a real concern. Less so in Bayesian inference, but still it comes up with noninformative prior distributions. An important–the important–topic in Bayesian statistics is the construction of structured prior distributions that let the data speak but at the same time don’t get overwhelmed by a flood of data.
Of taxonomies and lynx
In the discussion of Mandelbrot’s work on page 269, I’d also mention his models for taxonomies, which have a simple self-similar structure without the complexities of the more familiar spatial examples. Also, the story about the problems of Gaussian models reminds me of Cavan Reilly’s chapter in this book, where he fits a simple predator-prey model with about 3 parameters to the famous Canadian lynx data and gets much better predictions than the standard 11-parameter Gaussian time series models that are usually fit to those data.
On page 278, Taleb rants against statistical buzzwords such as standard deviation and correlation, and financial buzzwords such as risk. This reminds me of my rant against the misunderstood concept of “risk aversion.” I have to write this up fully sometime, but some of my rant is here.
It’s all over but the compartmentalizin’
On page 288, Taleb discusses people who compartmentalize their intellectual lives, for example the philosopher who was a trader but didn’t use his trading experiences to inform his philosophy. I noticed a similar thing about some of my colleagues where I used to teach in the statistics department at Berkeley. On the one hand, they were extremely theoretical, using advanced mathematics to prove very subtle things in probability theory, often things (such as the strong law of large numbers) that had little if any practical import. But when they did applied work, they threw all this out the window–they were so afraid of using probability models that they would often resort to very crude statistical methods.
I’m only a statistician from 9 to 5
I try (and mostly succeed, I think) to have some unity in my professional life, developing theory that is relevant to my applied work. I have to admit, however, that after hours I’m like every other citizen. I trust my doctor and dentist completely, and I’ll invest my money wherever the conventional wisdom tells me to (just like the people whom Taleb disparages on page 290 of his book).
Miscellaneous sociological thoughts
Taleb’s comment on page 155 about economics being the most insular of fields reminds me of this story of the economist who said that economists are different than “anthropologists, sociologists, and public health officials” because economists believe that “everyone is fundamentally alike” [except, of course, for anthropologists, etc.]. Economists often do seem pretty credulous of arguments presented by other economists!
The reference on page 158 to dentists reminded me of the dentists named Dennis.
On page 166, Taleb disparages plans. But plans can be helpful, no? Even if they don’t work out. It usually seems to me that even a poor plan (if recognized as tentative) is better than no plan at all.
The discussion on page 171 of predicting predictions reminds me of the paradox, of sorts, that opinion polls shift predictably during presidential nominating conventions (for evidence, see here, for example), even though conventions are very conventional events, and so one’s shift in views should be (on average) anticipated.
On pages 174-175, Taleb commends Poincare for not wasting time finding typos. For me, though, typo-finding is pleasant. Although I am reminded of the expression, “there’s no end to the amount of work you can put into a project after it’s done.”
The graphs on pages 186-187 have that ugly Excel look, with unnecessary horizontal lines and weirdly labeled y-axes. In any case, they remind me of the game of “scatterplot charades” that I sometimes enjoy playing with a statistics class. The game goes as follows: someone displays a scatterplot–just the points, nothing more–and everyone tries to guess what’s being plotted. Then more and more of the graph is revealed–first the axis numbers, then the axis labels–until people figure it out.
I’m a little puzzled by Taleb’s claim, at the end of page 193, that “to these people amused by the apes, the idea of a being who would look down on them the way they look down on the apes cannot immediately come to their minds.” I’m amused by apes but can imagine such a superior being who would be amused by me. Why not?
On page 196, Taleb writes, “a single butterfly flapping its wings in New Delhi may be the certain cause of a hurricane in North Carolina . . .” No–there is no “the cause” (let alone, “the certain cause”). Presumably another butterfly somewhere else could’ve moved the hurricane away.
Page 198: the chance of a girl birth is 48.5%, not 50%.
On page 209, Taleb writes, “work hard, not in grunt work . . .”. I have mixed feelings here. On one hand, yes, grunt work can distract from the big projects. For example, I’m blogging and writing lots of little papers each year instead of attacking the big questions. On the other hand, these little projects are the way I get insight into the big questions. Getting down and dirty, playing with the data and writing code, is a way that I learn.
The mention on page 210 of Pascal’s wager reminds me of the fallacy of the one-sided bet. I’m hoping that now that this fallacy has been named, people will notice it and avoid it on occasion.
The discussion on page 222 of capitalism, socialism, and attribution errors reminds me of the saying that everybody wants socialism for themselves and capitalism for everybody else (and there’s nothing more fun than spending other people’s money).
The discussion on the following page of the long tail reminds me of the conjecture about the “fat head” of mega-consumers.
The footnote on page 224 about book reviews reminds me of a general phenomenon which is that different reviews of the same book tend to have almost the exact same information. This becomes really clear if you look up a bunch of reviews on Nexis, for example. It can be frustrating, because for a book I like, I’d be interested in seeing lots of different perspectives. In contrast, on the web the implicit rules haven’t been defined yet, so there’s more diversity (as in this non-review right here, or in these comments on Indecision).
The comments on page 231 on the Gaussian distribution remind me of this story where even Galton got confused about the tails of the distribution as applied to human height.
On page 240, Taleb writes that Gauss, in using the normal distribution, “was a mathematician dealing with a theoretical point, not making claims about the structure of reality like statistical-minded scientists.” I don’t have my Stigler right here, but I’d always understood that Gauss developed least squares and the normal distribution in the context of fitting curves to astronomical observations. Sure he did lots of pure math, but he (and Laplace) were doing empirical science too.
I like Galileo’s quote on page 257, “The great book of Nature lies ever open before our eyes and the true philosophy is written in it. . . . But we cannot read it unless we have first learned the language and the characters in which it is written. . . . It is written in mathematical language and the characters are triangles, circles and other geometric figures.” As Taleb writes, “Was Galileo legally blind?” Actual nature is not full of triangles etc., it’s full of clouds, mountains, trees, and other fractal shapes. But these shapes not having names or formulas, Galileo couldn’t think of them. He chose the natural kind that was closest to hand. En el pais de los ciegos, etc.
On page 261, Taleb writes that in the past 44 years, “nothing has happened in economics and social science statistics except for some cosmetic fiddling.” I’d disagree with that. True, I’m sure you could find antecedents of any current method in papers that were written before 1963, but I think that developing methods that work on complex problems is a contribution in itself. There’s certainly a lot we can do now that couldn’t be done very easily 44 years ago.
Reading with pen in hand
To conclude: it’s fun (but work) to read a book manuscript with pen in hand. Also liberating that the book is already coming out, so instead of scanning for typos or whatever, I can just write down whatever ideas pop up.
P.S. Here are my thoughts on Taleb’s previous book.