Writing that changed my life

Gödel, Escher, Bach, Douglas Hofstadter’s 1979 book, came up in Andrew’s recent post on ESP. I read GEB when it came out, while I was in high school. Not only did I learn a lot from the book, it also directly led me to choose an undergraduate institution (Michigan State) that let me tailor all of my classwork toward studying logic, semantics, and AI. I then went to grad school at the University of Edinburgh, an institution known for AI. Then as a professor, I worked on natural language syntax, semantics, and programming languages (usually by applying techniques from the latter to the former). It’s the PL research that led directly to the Stan language.

I realized there were other reads that completely changed my thinking about a subject.

Natural language semantics

As an undergrad (early 80s), Richard Rorty’s Philosophy and the Mirror of Nature moved my thinking from a broadly Platonic (logical positivist) perspective on meaning to a more pragmatic one that focuses on cognitive agents. It was pretty rough giving up Russell and embracing Wittgenstein! I miss engaging with Keith O’Rourke on these topics on the blog.

Natural language grammar

During grad school (mid 80s), Joachim Lambek’s paper on associative categorial grammar led me away from the ad hoc systems of phrase structure grammars (TG, GB, P&P, GPSG, HPSG, CCG, etc.) toward the Curry-Howard isomorphism and linear logics, which I’d argue is the right way to understand the syntax/semantics connection for the logical structure of language (e.g., quantifier scope, coordination, relative clause formation, etc.).
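To give a flavor of the approach (a toy example of my own, not one from Lambek’s paper): assign Kim and Lee the category np and a transitive verb like saw the category (np\s)/np. Applying saw to Lee on its right yields saw Lee : np\s, and applying that to Kim on its left yields Kim saw Lee : s. Under Curry-Howard, each application step is just function application on the semantic side, which is what makes the syntax/semantics connection so direct.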

Statistics

Late in my career as a professor (mid 90s), we hired Chris Manning (now a CS prof at Stanford) and he taught the precursor to his book, Foundations of Statistical Natural Language Processing. That, along with speech recognition, is what really got me into stats. It’s far too dated to read now, and Chris told me he doesn’t plan to revise it.

Later, when working in industry on speech recognition (early 00s), I somehow stumbled upon Jim Albert and Jay Bennett’s pop intro to stats using baseball, Curve Ball. I can’t recommend this one highly enough. It taught me about uncertainty and all sorts of issues in stats, including sampling-based inference. The next thing I read in stats that stuck was Gelman and Hill’s first-edition regression book (the title of which is too long to repeat here), which really cemented the things I read in Curve Ball with many GLM examples. I needed that, plus BUGS and the BUGS examples, plus digesting a book on math stats, before I could broach Bayesian Data Analysis.

Roleplaying games

Most recently (late 10s), I was introduced to fiction-first gaming by Steve Bronder, who ran Grant Howitt’s Honey Heist, Crash Pandas, and Golden Sea for our group. But the real game changer, so to speak, was John Harper’s absolutely brilliant RPG, Blades in the Dark. That took me the rest of the way into fiction-first gaming, and now I can’t even imagine going back to the Vancian spell books, swingy d20s, and grid- and turn-based mechanics of D&D.

23 thoughts on “Writing that changed my life”

  1. I became fond of Japanese (for a variety of reasons, but initially because it’s so different from English), and as a result “Metaphors We Live By” (Lakoff and Johnson; words are polysemous, and when the secondary meanings differ from English’s, those meanings are perfectly sensible from an L&J standpoint) and the ideas of Fillmore’s case grammar (Japanese marks case independently from the noun) resonated strongly.

    For AI, it wasn’t a book, but auditing Minsky and Papert’s graduate seminar fall term 1972 that set me on my obstreperous way. What they said about neural network models then (they can’t possibly work for computational reasons) remains true today.

    (GEB came out in 1979: 3 years after I had finished an undergrad degree in comp sci and during a year I spent playing with electron microscopes at Tokyo University (don’t ask), so I missed it.)

    • I find Lakoff and Johnson (and their counterpart in NLP by Pustejovsky) to still be too “generative” for my taste. I find some of the philosophical writers on metaphor more compelling, like Searle (though I dislike Searle on the general philosophy of mind) and especially Richard Rorty, following on from Wittgenstein, Quine, and Dewey on the flexibility of meaning.

  2. That was an interesting (and surprising!) list, Bob.

    I was pretty heavily influenced by you :).

    You were the first person on the internet that I communicated with back in Delhi when I got my first modem connection, and you sent me ftp addresses by email to download linguistics material (ALE-related, I think), and that work sent me down a long path to studying categorial grammars and HPSG. Your Type Logical Semantics was an important book for me. It was because of your work that I ended up first in Osaka; Japan was the only country I could actually move to from India (I just had a BA in Japanese, which qualified me for nothing). I studied with Takao Gunji, someone who had worked on HPSG, and spent a lot of time programming in ALE (that’s how I got to know your student Gerald Penn; even today, Gerald and I meet regularly in Berlin when he visits).

    My move to Ohio State was also a result of wanting to pursue HPSG. Eventually I realized that linguistics was a terrible empirical mess, and that pushed me toward psycholinguistics. At the time I didn’t realize that psycholinguistics was at least as bad, but in a different way, or maybe even worse :).

    • Hey, that’s nice to know. Hopefully we’ll get back to Berlin and meet in person this time. I haven’t thought about all that work in ages. ALE was the first piece of open-source software I distributed. I basically wrote it because people didn’t believe the theory in my logic of typed feature structures book. I interviewed at OSU after my department at CMU fell apart and half the dept. boycotted my job talk for not being a linguist. I often wonder if I’d have moved closer to linguistics if the field wasn’t so anti-empirical. The last talk I tried to give in a linguistics department, at NYU, didn’t go so well. I was talking about using speech recognition for exploratory phonology and phonetics research when the faculty objected that what people say isn’t data for linguistics. I really just couldn’t go on at that point.

    • I’m not the only one among my gaming friends who feels that way. Of my childhood friends I play with online, one of them hated it so much that I gave up and compromised by running a time-traveling Fate adventure (based on another friend’s homebrewed setting, which is somewhat reminiscent of an urban fantasy version of El Ministerio del Tiempo, a Spanish TV show).

  3. I enjoyed GEB and learned a lot from it, but it didn’t change my life much that I know of, probably due to my own limitations.

    My reaction to Minsky and Papert saying that neural networks can’t work is that maybe that explains why human thinking is so often flawed. Nothing is perfect, but trial and error can eventually stumble on the truth despite all the errors, as long as it keeps looking. Neural networks are a tool for doing that. They are not the whole story of AI, as I understand it, but a very useful part of it.

    Now that my eyes are getting bad, the neural networks in my visual cortex can tell me that “55” is “89” with complete confidence, until I look closer.

    • “My reaction to Minsky and Papert saying that neural networks can’t work is that maybe that explains why human thinking is so often flawed.”

      Careful there: what Minsky and Papert said was that a particular stupid, naive, and wrong _model_ of neurons, and the networks based thereon, can’t work. (For starters, it’s been known for a while that the sum-and-threshold model of mammalian neuron operation is simply wrong. The whole computable-on-a-graphics-card, and therefore fun to play with, current generation of AI “neural nets” repeats this mistake. It’s a computational model that has nothing to do with mammalian neuronal operation. Nothing, zilch, nada. Nihil ex nihilo. And please, don’t setq nil.)

      The mammalian neuron is a vastly more powerful computational device than the “neurons” in “neural networks”. There was an article in Science that built a “neural network” to simulate a _SINGLE_ typical, average, garden-variety mammalian neuron. It took a network five to seven layers deep with thousands of nodes.

      Let’s see: the typical mammalian neuron:
      1. Has hundreds of inputs.
      2. Has thousands of outputs.
      3. Is massively non-local (inputs and outputs can be seriously distant).
      4. Computes logical functions of its inputs.
      5. Is morphologically diverse (there are _LOTS_ of different types of neurons).
      6. Is irregularly placed in space (in the eye, the photoreceptor neurons are randomly placed, so that, unlike digital cameras, we don’t have moiré problems).
      7. Is often organized as part of a functional block.
      8. May perform multiple functions. (Mouse neurons that indicate location in _local_ space also indicate location in _global_ space.)

      So the inability of “neural networks” to do much of anything doesn’t say anything about what humans can or can’t do. Or whether or not human thinking ought to be thought of as “often flawed” or as “nothing other than seriously friggin’ amazing”.

  4. The book that changed my viewpoint for life – maybe not changing my life so much – was Richard Goldschmidt’s “The Material Basis of Evolution.” Now whenever I read a scientist making a causal statement about how some trait must have evolved, I frown and think “um, probably not.”

    Speciation remains a contentious topic to this day, and the science gets weirder with each passing year.

  5. I enjoyed GEB, but I relished even more Hofstadter’s volume “Metamagical Themas” (actually a compilation of his Scientific American columns, I believe), which was more focused on topics of interest to me.

  6. The mathematician Gian-Carlo Rota’s “Indiscrete Thoughts”.

    The physicist E. T. Jaynes’ “Papers on Probability, Statistics and Statistical Physics”

    The (? – natural philosopher maybe) Clifford Truesdell’s “Six Lectures on Modern Natural Philosophy”

      • Jbayes:

        I took Rota’s differential equations class at MIT. He was a charming guy but the class was super-boring. This was a problem I had with most of the math classes I took: either they were so difficult that I got entirely lost in the details and didn’t have any sense of the big picture (for example, the classes I took in algebra, complex analysis, and real analysis) or they were so easy that they were boring (for example, the classes I took in topology and one or two others that were so boring that I’m forgetting exactly what they covered). It didn’t help that almost all the classes were done in straight lecture format. Some of the professors I liked a lot, and I did all the homeworks and got perfect grades on the exams, but I still didn’t really learn anything. There was never that happy medium of a class that was hard enough to be interesting but easy enough that I could follow what it was all about. Several of my physics classes had that happy medium (not all of them; the class in electromagnetism and the lab course were too hard, and I just followed all the instructions and, again, didn’t really understand anything), but the math classes, nope. Maybe it was just me, that I needed the motivation of an application. Anyway, not Rota’s fault but I got nothing out of his class except memories of this charming guy.

        I googled this book by Rota and read some of the stories. They’re funny, and they seem true to me. They remind me of various obnoxious behaviors by mathematicians I’ve encountered. I remember one guy I talked with, back when I was a young professor. I was visiting this other university and someone recommended I talk with this professor who worked in probability theory. The prof was really aggressive. I was trying to tell him about this result of mine (the proof that the Metropolis algorithm can be time-compressed into something that limits in an Ornstein-Uhlenbeck process, from which we can derive the optimal acceptance rate of 0.234) and he kept interrupting me because he wanted the exact conditions of my theorem. I’d read my Lakatos and I replied that I didn’t know the exact conditions, but I could tell him the theorem, from which it should be possible to derive the conditions . . . I didn’t say this in a rude way at all, but he didn’t want to even listen to me until I could state precise conditions. It was weird, so different from how I do math. But what really struck me was how aggressive he was. He was aggressive in a weird way, kind of as if he thought he was being charming.
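        In case that 0.234 result is unfamiliar: it says that for random-walk Metropolis in high dimensions, you should tune the proposal scale so that roughly 23.4% of proposals get accepted. Here’s a minimal simulation sketch (my choice of a standard normal target and of expected squared jump distance as the efficiency measure are illustrative assumptions, not part of the original proof) showing efficiency peaking when about a quarter of the proposals are accepted:

        import numpy as np

        def rw_metropolis(d, scale, n_iter=20_000, seed=0):
            # Random-walk Metropolis on a N(0, I) target in d dimensions.
            rng = np.random.default_rng(seed)
            x = np.zeros(d)
            log_p = -0.5 * x @ x  # log density, up to a constant
            accepts, esjd = 0, 0.0
            for _ in range(n_iter):
                prop = x + scale * rng.standard_normal(d)
                log_p_prop = -0.5 * prop @ prop
                if np.log(rng.uniform()) < log_p_prop - log_p:
                    esjd += np.sum((prop - x) ** 2)  # jump is 0 on rejection
                    x, log_p = prop, log_p_prop
                    accepts += 1
            return accepts / n_iter, esjd / n_iter

        d = 50
        for scale in (0.05, 0.1, 0.2, 0.336, 0.5, 1.0):  # theory: 2.38/sqrt(d) ~ 0.336
            rate, eff = rw_metropolis(d, scale)
            print(f"scale={scale:5.3f}  accept={rate:.3f}  ESJD={eff:.3f}")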

        I like Rota’s description of Ulam, which confirms my view that he (Ulam) was my kind of guy. Rota seems cool too, but not quite “my kind of guy” in that he seems to have lived within math rather than being a user of math.

        • I’m jealous! Rota’s field was combinatorics; although he had written something about how awful DiffEq courses were, or how they needed to be taught differently, or something along those lines, it was not his field.

          Rota’s work in combinatorics struck me as going much deeper while simplifying and unifying as much as possible. Perhaps his combinatorics course would have left a different impression?

          I especially like the “Lost Café” part of Ulam’s story.

        • Jba:

          I think combinatorics may be one of the boring math classes I took. The prof was not Rota, but the problem there was not the subject; it was that the course was designed to be one of the easy ones in the math dept. Again, though, if it had been designed to be a hard course, I’m guessing I also would’ve learned nothing!

          Really kind of amazing how I got a math degree from MIT with perfect grades, while learning approximately no math! How did that happen??

          I learned a lot in my physics and political science classes, though, and I learned a ton from the one mechanical engineering class I took (the legendary 2.70). Also writing my undergraduate thesis, I learned a huge amount from that too. And from the statistics class I took at Harvard.

        • I loved combinatorics. But seeing as I was at Michigan State, they let me skip the easy one and jump into graduate-level random graph theory. Ed Palmer was my undergrad advisor and Erdős regularly visited to collaborate (as an undergrad of limited mathematical talent, I wasn’t involved in those late-night sessions).

          I really loved abstract algebra and topology but never quite got real analysis. I wish I understood complex analysis, in much the same way as I wish I knew how to play drums or the bass guitar, though it’d be more useful these days. I also passed E&M by rote. It was independent study with tests just after the material, so I could do it by pattern matching.

  7. I am currently reading C. Wright Mills: The Sociological Imagination. This is one of the most influential books in sociology, and Mills wrote it in 1959.
    Mills criticizes contemporary schools of sociology and discusses what a fruitful social science should look like. I feel that his critique is very much on point even today, 60 years later.

    Today’s social science, he claims, is dominated by what he calls “abstracted empiricism”. This is the type of social science that is mostly done today, and that probably most readers of this blog do as well: collecting empirical data of some sort about an often very specific problem and analyzing it using fancy statistical methods.

    In his opinion, this type of research is inhibited by its own rigid methodology, called ‘The Scientific Method’, which mimics the methods of the natural sciences. This method only allows the study of small-scale problems that are not substantive and of little interest. Social structure and history are neglected, and many very detailed analyses of unrelated problems are done. The result is “a series of unrelated and rather unimportant facts”. These many studies cannot simply be put together to form a big picture, and there is no real progress in social science.
    I don’t know what he would say today with the rise of causal inference…

    I am currently doing a PhD in sociology, and in my opinion almost every critique of his seems to be on point when it comes to my own work. Very detailed, technical work without much context. Getting rather unimportant details (parameter estimation) right. No real concepts. A definition of understanding that seems to be restricted to statistical associations between variables in a rather small time frame. I wonder if there might be an actually quite big gap between understanding in social science and statistical modeling.

    The book features an afterword by Todd Gitlin written in 2000. Todd Gitlin is a former professor of sociology at Columbia University. I am very excited to read this more or less up-to-date assessment of the work of Mills.

    I wonder if some readers on this blog know this book. Would love to hear opinions!

  8. Bob, did you take coursework from Barbara Abbott at MSU? I really admired the way she taught semantics. The cognitive science specialization at MSU was just getting started when I was there in grad school (in Psychology).

    • Indeed. My two worst grades as an undergrad were in her intro to linguistics and in Calc 3. I would argue that transformational grammar was to blame for the first bad grade and my poor spatial skills for the latter. Funnily enough, I became a mathematical linguist.

      Later, I took a class co-taught by Barbara and Herb Hendry in the philosophy department. It was on Montague Grammar, and it set me on the direction for my Ph.D. thesis (with Rorty and pragmatism still pushed down in the back of my mind). I had taken math logic, an analytic philosophy class focusing on semantics, a non-standard (mostly modal and n-valued) logic class, and an AI class based on Lisp beforehand, so I was well prepped for the class. The type-logical semantics book of mine to which Shravan alluded above came directly out of this. Lambek’s grammar extends neatly to quantifying-in, giving a sequent-based generalization of Montague (and as a bonus, if you view it as a natural deduction system, it fixes all the problems of Cooper Storage [as an aside, Cooper was the internal examiner for my Ph.D. thesis at Edinburgh]).

      Then at the ’87 LSA Summer School, I TA-ed a class in natural language processing with Prolog taught by Gerald Gazdar that both Barbara and my partner Mitzi took. It was 8 am and that’s why Mitzi and I tell people that we slept together before we met.

  9. Andrew said, “I learned a lot in my physics and political science classes, though, and I learned a ton from the one mechanical engineering class I took”

    My physics classes (in both high school and college) had a big influence on me — especially in accepting that uncertainty is important in understanding the real world. In particular, the Heisenberg uncertainty principle (which I learned about in a third physics course in high school) really influenced my world view and my view of science.
