Of parsing and chess


Gary Marcus writes,

An algorithm that is good at chess won’t help parsing sentences, and one that parses sentences likely won’t be much help playing chess.

That is soooo true. I’m excellent at parsing sentences but I’m not so great at chess. And, worse than that, my chess ability seems to be declining from year to year.

Which reminds me: I recently read Frank Brady’s much lauded Endgame, a biography of Bobby Fischer. The first few chapters were great, not just the Cinderella story of his steps to the world championship, but also the background on his childhood and the stories of the games and tournaments that he lost along the way.

But after Fischer beats Spassky in 1972, the book just dies. Brady has chapter after chapter on Fischer’s life, his paranoia, his girlfriends, his travels. But, really, after the chess is over, it’s just sad and kind of boring. I’d much rather have had twice as much detail on the first part of the life and then had the post-1972 era compressed into a single chapter. I mean, sure, I respect that Brady wanted to tell the full life story, and I’m not telling him how he should’ve written his book, I’m just giving my reactions.

Also, I would’ve liked more information on the games: what was the amazing set of moves that Fischer made in the so-called Game of the Century, what happened in some of the games he lost, and so on. In an afterword, Brady writes that he decided not to include any games so as to make the book more accessible. What I wonder is, how many readers are there like me, who enjoy chess and could understand a diagram and some discussion of what these amazing plays were, even if we couldn’t follow an entire game written on the page or have the patience to play one out on the board? I wouldn’t have gotten much out of transcripts of chess games, but a few diagrams and discussions of key moments, that would’ve made the book a lot more interesting to me.

P.S. After Kasparov beat Karpov in the final game of their tournament—the game where both players knew that Kasparov had to win, that a draw wouldn’t be enough—I clipped the game out of the newspaper and later played it out with my dad. That was a game. To my ignorant eyes, there was no single point where I could spot a mistake by Karpov. Kasparov just gradually and imperceptibly got to a winning position. Amazing.

19 thoughts on “Of parsing and chess”

  1. > An algorithm that is good at chess won’t help parsing sentences, and one that parses sentences likely won’t be much help playing chess.

    This is interesting because this is one of those claims which seems to be correct and valid and to carry a meaningful point – and is basically wrong, because one just doesn’t know the full breadth of research and never thought it likely enough to be worth even an offhand Google search. Specifically, there *is* an algorithm which uses chess domain knowledge to help parse sentences, and using it goes from 40% to 98% on one accuracy measure:

    “Reading Chess”, Baird & Thompson 1990 http://doc.cat-v.org/bell_labs/reading_chess/reading_chess.pdf

    > “In an application of semantic analysis to images of extended passages of text, several volumes of a chess encyclopedia have been read with high accuracy. Although carefully proofread, the books were poorly printed and posed a severe challenge to conventional page layout analysis and character-recognition methods. An experimental page reader system carried out strictly top-down layout analysis for identification of columns, lines, words, and characters. This proceeded rapidly and reliably thanks to a recently-developed skew-estimation technique. Resegmentation of broken, touching, and dirty characters was handled in an efficient and integrated manner by a heuristic search operating on isolated words. By analyzing the syntax of game descriptions and applying the rules of chess, the error rate was reduced by a factor of 30 from what was achievable through shape analysis alone. Of the games with no typographical errors, 98% have been assigned a legal interpretation, for an effective success rate of 99.995% on approximately one million characters (2850 games, 945 pages). We discuss several computer vision systems-integration issues suggested by this experience.”

    (The algorithm does not actually attempt to do something like rank the strength of alternative possible moves, apparently because it turns out to be unnecessary since just looking for a parse which is a legal move suffices for all their games: “It is conceivable that some game interpretation could be legal and still differ from the clear intention of the editor-but such forced interpretations have not been observed. An example of the results of semantic analysis are shown in Fig. 13.”)
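
    The mechanism is easy to sketch. Below is a toy illustration (not Baird & Thompson’s actual code): hypothetical OCR confusion sets expand a garbled move token into candidate readings, and a semantic filter keeps only the candidates that are legal in the current position, here against a hard-coded legal-move set rather than a real move generator.

    ```python
    # Toy sketch: shape analysis proposes candidate readings of a move
    # token; the semantic layer keeps only those that are legal moves.

    # Hypothetical confusion sets: characters the shape analyzer
    # cannot reliably tell apart on a dirty page.
    CONFUSIONS = {"3": "38", "8": "38", "c": "ce", "e": "ce"}

    def candidates(token):
        """Expand a token into every reading the confusion sets allow."""
        readings = [""]
        for ch in token:
            readings = [r + o for r in readings for o in CONFUSIONS.get(ch, ch)]
        return readings

    def semantic_filter(token, legal_moves):
        """Keep only readings that are legal in the current position."""
        return [r for r in candidates(token) if r in legal_moves]

    # Legal moves in some position (hard-coded here; a real system would
    # generate them from the rules of chess).
    legal = {"Nf3", "e4", "c4", "d4"}
    print(semantic_filter("Nf8", legal))  # ['Nf3'] — the illegal reading is discarded
    ```

    The real system also exploits the fact that an entire game must be a sequence of legal moves, which prunes far harder than one position at a time.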

      • -_-

        Marcus writes scoffingly of an absurd unthinkable thing, an example he made up on the spot of how natural language processing could have *nothing* to do with chess playing, and I argue not just that it’s possible but I can even point to something almost exactly like his hypothetical which was done decades ago – and what do I get? ‘Well, it’s not *really* parsing.’

        Yeah, whatever.

      • Wow, I just noticed that this was the same Nemenyi who created the “Nemenyi-Damico-Wolfe-Dunn test”. “Wolfe” refers to Douglas Wolfe, from whom I took a Non-Parametric Statistics course while in Statistics Graduate School.

        So it looks like I have 3 degrees of separation from Bobby Fischer!

  2. If we ignore a few practical constraints and just think about the philosophical problem, then this seems wrongheaded to me. One of the key insights in computational complexity theory is a reduction of one problem to another, as a means to show they are in the same complexity class.

    I don’t know how we want to formally try to describe “parsing sentences” (e.g. whether we just want parts of speech tagging to count, or fancier statistical NLP), but given a formal definition you can make precise statements about whether and how you can transform the sentence problem into the chess problem, and vice versa, and whether this transform is computationally efficient (polynomial).

    In short: just because it “feels like” sentence parsing is a very different sort of algorithmic problem than chess, in some humanistic way, does not really mean the problems are different. In fact, empirical evidence about whether human-engineered sentence parsers are able to be repurposed for chess is also largely irrelevant since it could just be that we have not discovered or invented the right efficient way to transform one problem into the other, which is a sort of result amenable to mathematical proof or disproof.
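
    To make the reduction idea concrete with a deliberately humble example: the element-distinctness problem reduces to sorting, so any sorting routine, used as a black box, also decides distinctness after one linear scan.

    ```python
    # Reduction sketch: solve "does this list contain a duplicate?" by
    # transforming it into a sorting instance. After sorting, any
    # duplicate pair must sit in adjacent positions.

    def has_duplicate(xs, sort=sorted):
        s = sort(xs)                 # the black-box "solver" for the other problem
        return any(a == b for a, b in zip(s, s[1:]))

    print(has_duplicate([3, 1, 4, 1, 5]))  # True
    print(has_duplicate([3, 1, 4, 5]))     # False
    ```

    The transform here is trivially polynomial, which is what licenses the claim that distinctness is no harder than sorting; whether any such efficient transform exists between sentence parsing and chess is exactly the open formal question.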

  3. “And, worse than that, my chess ability seems to be declining from year to year.” Would the 7-year-old kick your ass now?

  4. Using chess domain knowledge and playing chess are not the same.
    In the 1970s there were plenty of arguments between A.I. and “brute-force” approaches in computer chess.
    Ken Thompson is more famous for UNIX, but he was also (with Joe Condon) the creator of Belle, which was for years the top chess computer.
    I still have a Belle t-shirt, which altered the old Bell logo into a queen symbol.
    Ken gave a great talk in which he showed examples from the history of computer chess. He also modeled USCF ratings as a function of raw compute performance, and I think he was generally right, although he of course noted that ratings became less meaningful at the highest levels. Sadly, he was never that keen about giving talks, but I did get him to do a video oral history for the Computer History Museum a few years ago, and for a few years we had a computer chess exhibit in the lobby, although it has recently been replaced by a Street View exhibit of vehicles from a local company.

    Anyway, the algorithms for playing chess and parsing sentences in general don’t have much in common, even if domain knowledge is useful to the latter.
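
    For concreteness, the kind of tree search at the heart of programs like Belle can be sketched in a few lines. This is a generic depth-limited negamax with alpha-beta pruning over a made-up game interface (a take-1-or-2 Nim game), not anything from Belle itself:

    ```python
    # Depth-limited negamax with alpha-beta pruning over an abstract game.
    # moves/play/evaluate are placeholder callbacks supplied by the caller.

    def negamax(state, depth, alpha, beta, moves, play, evaluate):
        ms = moves(state)
        if depth == 0 or not ms:
            return evaluate(state)           # score for the player to move
        best = -float("inf")
        for m in ms:
            score = -negamax(play(state, m), depth - 1, -beta, -alpha,
                             moves, play, evaluate)
            best = max(best, score)
            alpha = max(alpha, score)
            if alpha >= beta:                # opponent avoids this line: prune
                break
        return best

    # Toy game: a pile of stones, take 1 or 2, taking the last stone wins.
    nim_moves = lambda s: [m for m in (1, 2) if m <= s]
    nim_play  = lambda s, m: s - m
    nim_eval  = lambda s: -1 if s == 0 else 0   # facing an empty pile = loss

    INF = float("inf")
    print(negamax(3, 10, -INF, INF, nim_moves, nim_play, nim_eval))  # -1: pile of 3 is lost
    print(negamax(4, 10, -INF, INF, nim_moves, nim_play, nim_eval))  # 1: pile of 4 is won
    ```

    Chess programs put the opening book in front of this search and endgame databases behind it, but the middle game is this loop, made very fast.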

  5. If I may gently put in a plug for Go (http://en.wikipedia.org/wiki/Go_(game)) here. It’s an amazing game that computers still have not conquered, and is supposed to be the oldest game that’s still played in its original form. It features strategy, esthetics, tactics both straightforward and Zen-like, and a well-conceived handicap system that allows players of widely diverging skill to play each other.

    I played chess in high school, but discovered Go in college and have never looked back.

    • Go! Yes, a different league from checkers or chess.

      I knew the guy who led the team that did Chinook, i.e., checkers, which got even with the best ~1994. Chess took longer. I saw computer Go decades ago, and it was clear that it would take a very long time, for reasons decently summarized in the Wikipedia entry. Workable Go algorithms are very different from the combination of opening book, tree search to more plies as compute power grows, and endgame databases that has been the core of chess computing for decades.

        • While I agree in part with the technical arguments as to why computer Go is a harder problem, isn’t part of the difficulty explained by the fact that chess algorithms have received many times more attention than Go?

          • One could argue that; and it’s hard not to notice that it was only a few years after Kasparov went down that Go playing was revolutionized by Monte Carlo tree search, and Go programs began their recent ascent toward pro status. Entirely possible that Go will be done within a decade or two.

        • Partly, but some people have been interested in Go for a long time.
          Let me try again:
          The general approach to chess was certainly well known enough in the 1970s that 2 people could use a small PDP-11 and some custom hardware to build Belle, pretty good for its time. The tree search algorithms were already well known long before. Thus:
          Opening book
          Tree search for middle game and end game, but with help from endgame databases as they got developed, in some cases making discoveries about chess rules.

          Ken spoke of relatively straightforward relationships between chess ratings and compute power.
          It took ~30 years for Moore’s Law to do the job, and while the real Law (density) isn’t over quite yet, the concomitant clock rate increase is, at least for CMOS.

          My sense is that Go algorithms are nowhere near that close to being well-developed, but I’d be happy to be proved wrong. Perhaps someone can point to an equivalent of Ken’s late 1970s predictions:
          Algorithm, with good model for ratings versus compute power,
          Made by someone with equivalent expertise at computer Go.
          I suggested the computer chess exhibit we did at the Computer History Museum and would happily see a Go equivalent sometime.
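
          For comparison, here is a bare-bones sketch of the Monte Carlo tree search (UCT) family that transformed computer Go, run on a toy take-1-or-2 Nim game rather than Go; the game functions are placeholders, nothing here is from a real Go engine:

          ```python
          import math, random

          # Toy game: pile of stones, take 1 or 2, taking the last stone wins.
          def moves(s):
              return [m for m in (1, 2) if m <= s]

          class Node:
              def __init__(self, state, move=None, parent=None):
                  self.state, self.move, self.parent = state, move, parent
                  self.children, self.untried = [], moves(state)
                  self.wins, self.visits = 0.0, 0

          def uct(root_state, iters=2000, c=1.4):
              root = Node(root_state)
              for _ in range(iters):
                  node = root
                  # 1. Selection: descend through fully expanded nodes via UCB1.
                  while not node.untried and node.children:
                      node = max(node.children, key=lambda n: n.wins / n.visits
                                 + c * math.sqrt(math.log(n.parent.visits) / n.visits))
                  # 2. Expansion: try one untested move.
                  if node.untried:
                      m = node.untried.pop(random.randrange(len(node.untried)))
                      child = Node(node.state - m, m, node)
                      node.children.append(child)
                      node = child
                  # 3. Simulation: random playout, counting moves played.
                  s, k = node.state, 0
                  while s > 0:
                      s -= random.choice(moves(s))
                      k += 1
                  leaf_player_won = (k % 2 == 1)  # player to move at the leaf
                  # 4. Backpropagation: each node's stats are from the viewpoint
                  # of the player who moved INTO it, so the reward flips per level.
                  reward = 0.0 if leaf_player_won else 1.0
                  while node is not None:
                      node.visits += 1
                      node.wins += reward
                      reward = 1.0 - reward
                      node = node.parent
              return max(root.children, key=lambda n: n.visits).move

          random.seed(0)
          print(uct(4))  # should almost always be 1: leave a lost pile of 3
          ```

          Note there is no evaluation function at all, only random playouts, which is exactly why this family sidestepped the problem that hand-written Go evaluation never worked well.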

  6. How about your statistical insight? In pure mathematics, the thinking was that mathematicians got their primary insights very young, and then spent the rest of their careers following up. As the saying went, philosophy was what you did once you were too old to do math!

    • Gary:

      The great statistics educator Dick DeVeaux talks about this. In his words, math is like music, statistics is like literature.

Comments are closed.