Chess cheating: how to detect it (other than catching someone with a shoe phone)

This post is by Phil Price, not Andrew.

Some of you have surely heard about the cheating scandal that has recently rocked the chess world (or perhaps it’s more correct to say the ‘cheating-accusation scandal.’) The whole kerfuffle started when World Champion Magnus Carlsen withdrew from a tournament after losing a game to a guy named Hans Niemann. Carlsen didn’t say at the time why he resigned, in fact he said “I really prefer not to speak. If I speak, I am in big trouble.” Most people correctly guessed that Carlsen suspected that Niemann had cheated to win. Carlsen later confirmed that suspicion. Perhaps he didn’t say so at the start because he was afraid of being sued for slander.

Carlsen faced Niemann again in a tournament just a week or two after the initial one, and Carlsen resigned on move 2.

In both of those cases, Carlsen and Niemann were playing “over the board” or “OTB”, i.e. sitting across a chess board from each other and moving the pieces by hand. That’s in contrast to “online” chess, in which players compete by moving pieces on a virtual board. Cheating in online chess is very easy: you just run a “chess engine” (a chess-playing program) and enter the moves from your game into the engine as you play, and let it tell you what move to make next. Cheating in OTB chess is not so simple: at high-level tournaments players go through a metal detector before playing and are not allowed to carry a phone or other device. (A chess engine running on a phone can easily beat the best human players. A chess commentator once responded to the claim “my phone can beat the world chess champion” by saying “that’s nothing, my microwave can beat the world chess champion.”). But if the incentives are high enough, some people will take difficult steps in order to win. In at least one tournament it seems that a player was using a chess computer (or perhaps a communication device) concealed in his shoe.

I don’t know if there are specific allegations related to how Niemann might have cheated in OTB games. A shoe device again, which Niemann uses to both enter the moves as they occur and to get the results through vibration? A confederate who enters the moves and signals Niemann somehow (a suppository that vibrates?). I’m not really sure what the options are. It would be very hard to “prove” cheating simply by looking at the moves that are made in a single game: at the highest levels both players can be expected to play almost perfectly, usually making one of the top two or three moves on every move (as evaluated by the computer), so simply playing very very well is not enough to prove anything.

But even if it’s impossible to prove cheating by looking at the moves in a single game, there’s additional information available from the arc of a player’s career. And although cheating in online games could, in principle, be at least as hard to detect as cheating in OTB chess — actually it should be harder, because you can’t do a physical search of the players! — in practice the ways people cheat in online games seem to be rather easily detectable. A company called chess.com that is used by millions of players (including me!) has posted a long article about their cheat-detection methods, specifically as regards Hans Niemann — who has, they believe, cheated in many games, including when playing for monetary prizes — and it lists several ways they look for cheating. Their bullet list is a bit confused so I’ve cleaned it up here:

  • Comparing the moves made to engine-recommended moves (after removing some opening and endgame moves that are memorized and thus uninformative about player skill).
  • Comparing a player’s performance to their own past performance and historical strength.
  • Comparing a player’s performance to that of comparable peers.
  • Looking at behavioral factors (e.g. ‘browser behavior’).
  • Reviewing time usage for finding easy vs difficult moves.

The ‘browser behavior’ item needs explanation. If you play on chess.com, they can tell when your computer’s ‘focus’ is on the window where you are playing. If you click to another window, they can tell. At least two things would be suspicious: (1) clicking away from chess.com after every move or almost every move, since this would suggest you might be entering moves into the engine and looking at results, and (2) playing stronger moves after returning to the chess.com window from somewhere else, as compared to the moves you play when you stay on the chess.com window for multiple consecutive moves.

Item (2) requires an assessment of move quality, which is usually summarized in the pleasing unit of “centipawns.” By tradition, a pawn is considered to be “worth” one point on average, and chess programs are programmed to evaluate positions by assigning points according to that scale. If a move causes you to lose a pawn (without other compensation such as a stronger position) then, as far as the program is concerned, you have lost a point. To evaluate a human’s move you can compare their position after their move to the position if they had made the move the computer thinks is best. If they made the computer move, they have a loss of ‘0 centipawns’. If the computer thinks their position is 0.12 pawns worse than it would have been if they had made the computer move, then they have lost 12 centipawns. Measures such as ‘average centipawn loss’ can be used to evaluate player strength, both for online play and OTB play (and even for players who played long before the advent of computers). Chess.com creates a “Strength Score” based on this sort of information (I don’t think they have published their algorithm); this is highly correlated with players’ “Elo ratings”, which are based on win-loss-draw records against other players, but chess.com evidently prefers it to using the ratings alone.
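To make the arithmetic concrete, here is a minimal sketch of a centipawn-loss calculation. It assumes you already have, for each move, the engine’s evaluation (in pawns, from the mover’s point of view) after the engine’s preferred move and after the move actually played. The function names and the sample numbers are made up for illustration; this is not chess.com’s algorithm, which as far as I know is unpublished.

```python
def centipawn_loss(best_eval_pawns, played_eval_pawns):
    """Loss for a single move, in centipawns.

    Both arguments are engine evaluations in pawns, from the mover's
    point of view: one after the engine's preferred move, one after the
    move actually played. Negative differences are clamped to zero.
    """
    return max(0, round((best_eval_pawns - played_eval_pawns) * 100))


def average_centipawn_loss(moves):
    """Mean centipawn loss over a list of (best_eval, played_eval) pairs."""
    losses = [centipawn_loss(best, played) for best, played in moves]
    return sum(losses) / len(losses)


# Hypothetical evaluations for three moves by one player.
game = [(0.30, 0.30), (0.25, 0.13), (0.40, 0.36)]
print(centipawn_loss(0.25, 0.13))    # the "0.12 pawns worse" case from the text
print(average_centipawn_loss(game))
```

A real analysis would also have to decide which opening and endgame moves to exclude and how to treat positions that are already completely won or lost; the sketch ignores all of that.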

In online play, if a player averages a 4 centipawn loss per move when they stay on the chess.com window, but only a 1 centipawn loss per move when they return to it after clicking away to another window, that suggests cheating. (But it doesn’t prove cheating. Perhaps you only click away to check Facebook if the position is very simple and you know you can pick the best move.) It seems to me that if you were going to cheat in an online tournament you would simply have a second computer to run your chess engine. Maybe that’s what people will do, now that they know chess.com can and does detect when they click away.
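The comparison described above could be sketched roughly as follows. The input format (a centipawn loss plus a flag saying whether the player had just come back from another window) is my own assumption for illustration, and the numbers are invented to reproduce the 4-versus-1 example; I don’t know how chess.com actually stores or tests this.

```python
from statistics import mean


def acpl_by_focus(moves):
    """Average centipawn loss split by window-focus behaviour.

    `moves` is a list of (centipawn_loss, returned_from_other_window)
    tuples. Returns (acpl_when_stayed, acpl_after_returning); either
    value is None if there are no moves in that group.
    """
    stayed = [loss for loss, returned in moves if not returned]
    came_back = [loss for loss, returned in moves if returned]
    return (
        mean(stayed) if stayed else None,
        mean(came_back) if came_back else None,
    )


# Invented data: three moves played without leaving the window,
# three played right after returning from another window.
moves = [(6, False), (2, False), (4, False), (1, True), (0, True), (2, True)]
stayed_acpl, returned_acpl = acpl_by_focus(moves)
print(stayed_acpl, returned_acpl)
```

A gap like this over six moves means nothing, of course; you would want many games and some check that the difference is larger than chance (and larger than the “I only click away in easy positions” effect) would produce.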

By using these approaches, chess.com has concluded that Niemann “very likely” cheated in a great many individual games and in many online tournaments. You can read the chess.com article for a list. (By the way, Niemann has admitted to cheating in some online games, although he claims never to have done so when money was at stake.)

Other than ‘browser behavior’, the methods chess.com uses can be applied to detecting cheating in OTB chess as well…and those also make Niemann’s performance look very, very suspicious. For example, considering only OTB chess, Niemann’s improvement from age 11 through 19 has been faster than any other player in history. Second on the list is Bobby Fischer, fourth is Magnus Carlsen. This alone doesn’t prove anything. We know Fischer wasn’t improving because of an engine, and yet he had this huge improvement, and who’s to say nobody could ever do it faster than Fischer? But either Niemann improved faster than any other chess player of all time, or he cheated.

Finally, there’s some other, less quantifiable evidence that Niemann has cheated in OTB chess. For example, many grandmasters supposedly find that his post-game discussions of his games show lack of a deep understanding of the positions that he just played. And some grandmasters think Niemann’s pattern of time usage is suspicious: that he takes too little time in subtle positions but still finds the best moves, while sometimes taking too much time in clearer positions where he should be able to move more quickly. (Perhaps I should have mentioned earlier: all games, online and OTB, are played using a chess clock so that players are limited in how much time they have. The amount of time allowed is different from tournament to tournament).

I guess that’s enough for now. I think the evidence that Niemann cheated very frequently in online chess is extremely convincing, and I think all of the smart money is betting that he has also cheated OTB. Certainly that’s where my money is, and my money isn’t even smart!

If you have any interest in this general topic, I highly recommend reading the chess.com article (link is in the fifth paragraph, above).

This post is by Phil.

67 thoughts on “Chess cheating: how to detect it (other than catching someone with a shoe phone)”

  1. >> In both of those cases, Carlsen and Niemann were playing “over the board” or “OTB”

    Incorrect. The second case was online tournament, not OTB.

    • This is not the only weakness in the post. Here are some others that stood out to me (I’m not a chess fanatic, but I have been following the controversy):
      1) The post claims “Carlsen didn’t say at the time why he resigned, in fact he said “I really prefer not to speak. If I speak, I am in big trouble.”” He never said that. In fact, he tweeted a famous clip of soccer(/football) manager Jose Mourinho saying that: https://twitter.com/magnuscarlsen/status/1566848734616555523?lang=en.
      2) The post says “Perhaps [Carlsen] didn’t say so at the start because he was afraid of being sued for slander.” While this may be true, there are penalties Carlsen could receive from FIDE (the organization that runs chess) for making unsubstantiated accusations of cheating. He may still be penalized for his behavior.
      3) The post says “at the highest levels both players can be expected to play almost perfectly.” I would love to see a citation for this. At the highest level, human players get completely demolished by computers.
      4) There are 3 paragraphs towards the end that are basically just speculation about Hans cheating and don’t seem to be doing a good job of interpreting the data (cherry picked subset of data) or the opinions of professional players (no names, ignoring contrary positions).

      I’m kind of surprised to see this content on this blog. Normally I expect a higher degree of self-awareness when the writer is speculating or talking about something outside of their core expertise (Andrew just spoke to Domain Specificity in https://statmodeling.stat.columbia.edu/2022/10/04/dow-36000-but-in-graph-form/!).

      • Anon,

        I can’t speak for Phil, but:

        1. OK, Carlsen didn’t say that particular phrase, he tweeted someone else saying it. That doesn’t sound so different to me. The word “say” is often used to convey nonspoken communication; for example if someone types something, I might say that they “said” it.

        2. I don’t see your problem with Phil’s post here. He says “Perhaps” and then you say, “While this may be true . . .” That’s “Perhaps”!

        3. Regarding “almost perfectly,” a lot depends on how you define “almost” and “perfectly.” I guess it would’ve been more accurate to say that at the highest level, major blunders are rare. I took that to be his meaning, i.e., that top players play “almost perfectly” compared to the way that ordinary folks play.

        4. Tastes differ, I guess, but to me Phil was pretty clear about adding qualifiers in his speculations near the end of his post, using phrases such as “chess.com has concluded,” “less quantifiable evidence,” “many grandmasters supposedly find,” “some grandmasters think,” and he writes, “my money isn’t even smart,” i.e. he’s expressing an opinion outside his core expertise. So I think he’s pretty self-aware here.

        • Hi Andrew, thanks for the thoughts. re 2: this was my polite way of saying “when I read this, it made me think the author didn’t understand somewhat basic details of what is happening in this controversy.”

          I agree that there are charitable readings of my quotes that make them more OK. If anything, I hope you take my previous comment as praise that I find this blog typically has very high quality discourse! I think if the “my money isn’t even smart” or similar had been at the beginning of the post I would have had a different reaction to the post as a whole.

      • Thanks, Andrew! You did a pretty good job speaking for me but I’ll speak for me too.

        1 is pedantic but correct. I should have said Carlsen posted a video of someone else saying “I really prefer not to speak. If I speak, I am in big trouble.”
        2 and 4 seem like sort of silly complaints about a blog post so I won’t bother to address them.

        To me, the only interesting thing here is 3. “At the highest level both players can be expected to play almost perfectly.” Of course this hinges on how one defines “almost perfectly.” Andrew suggests that I mean “compared to the way ordinary folks play”, but no, I mean they play almost perfectly as judged by the best chess-playing entities in the universe: the chess engines.

        Let’s start with inaccuracies. I don’t know that there’s an official standard for “inaccuracies”, “mistakes”, and “blunders” but I know that several chess analysis sites have the same definition: a move that loses 40-90 centipawns is an “inaccuracy”, a move that loses 90-200 centipawns is a “mistake”, and a move that loses more than 200 centipawns is a “blunder.” This article https://fivethirtyeight.com/features/brilliance-and-blunders-have-defined-the-world-chess-championship/ tabulates inaccuracies, mistakes, and blunders at the most recent World Chess Championship. In 3 of the first 10 games, neither player had any inaccuracies, mistakes, or blunders. In two more of those games, one player had 0 inaccuracies and the other had 1 inaccuracy, with no mistakes or blunders by either. In 7 of the first 10 games, neither player had a mistake or blunder. So, that’s one thing I mean when I say “almost perfectly”: very very few bad moves.
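        Those thresholds are easy to turn into a tally. Here is a minimal sketch; how the boundaries at exactly 90 and 200 centipawns are handled is my assumption (the ranges as stated overlap at 90), and the function names and sample losses are invented for illustration.

```python
from collections import Counter


def classify_move(loss_cp):
    """Label a move by its centipawn loss.

    Thresholds follow the 40/90/200 centipawn convention described in
    the comment; which side of each boundary is inclusive is an
    assumption made here.
    """
    if loss_cp > 200:
        return "blunder"
    if loss_cp >= 90:
        return "mistake"
    if loss_cp >= 40:
        return "inaccuracy"
    return "fine"


def tally(losses_cp):
    """Count how many moves fall into each category."""
    return Counter(classify_move(loss) for loss in losses_cp)


# Invented per-move losses for one player in one game.
game_losses = [0, 12, 45, 3, 120, 250, 8]
print(tally(game_losses))
```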

        One can also look at other measures such as “average centipawn loss”, also given in that article for those ten games. In six of the games, neither player had an average centipawn loss in double digits.

        By the way, the ability to play at this level is new. Three of those ten games were the three most accurate world championship games ever played, with both players having an under 4 centipawn loss per move. I know at least one of those (game 3) was a rather quiet draw that started down a very well-known line. I dunno, maybe it got very easy to play (for a super-grandmaster) after that. But it did have a novelty on move 15 — a move never before seen in recorded play — so it’s not like the players were just playing out a bunch of predetermined moves.

        Other people have also looked at metrics like “how often does the human play one of the two best moves as evaluated by the computer”. If you search around you can find lots of discussion of this stuff, and rankings of players from history, etc.

        I think we can agree that the best human players can play almost perfect chess for some definition of “almost”. The best players can play long, complicated games without ever making a bad move, and nearly always choosing one of the moves the computer thinks is best. To me, that’s a reasonable use of “almost perfect.”

        • “But it did have a novelty on move 15 — a move never before seen in recorded play — so it’s not like the players were just playing out a bunch of predetermined moves.”

          Move 15 is pretty deep in the game! That in itself suggests why the move had never been seen before. A possible common reason that some moves have never been seen before is that they may be well-known to many players – especially in a game where all the moves have been played up to 15 moves into the game – but are indifferent to the outcome of the game. The only reason to play them is the slight hope that the other player doesn’t recognize the move and commits an error in response to it.

  2. Not the standard I’ve come to expect from this blog, nothing I couldn’t have read on a layman’s site.

    Also very unbalanced regarding the opinions of chess GMs on Niemann’s OTB play and the statement about Niemann’s rise being the fastest since Fischer is actually quite a cherry-picked statistic (something that should probably be picked up on here).

    In addition, there is no mention of IM Ken Regan, an expert in this arena, who has found no indication of cheating OTB or online in the past 2 years.

    See https://twitter.com/NikolaosNtirlis/status/1577411761275568128 on Niemann’s rating rise compared to others

    https://www.theguardian.com/sport/2022/oct/05/expert-claims-hans-niemann-cheated-in-numerous-professional-matches

    https://en.chessbase.com/post/is-hans-niemann-cheating-world-renowned-expert-ken-regan-analyzes

    • Hahaha you’re criticizing the post for not citing a specific person who you think should be cited? Get real.

      I’m sorry if I implied somewhere that there are no other places one can look for discussion of the cheating scandal. Thank you for providing some links. But you left out many others! That’s pretty lame. Why didn’t you post them all?

  3. Or he plays/thinks like a computer because he grew up playing very good computers (with the associated tools), and was very good at learning from them. Think about how many more chess puzzles you can solve in an hour if there is no need to manually set up the board each time, which is not true for prior generations. So the only relevant comparisons are to his peers with the same access to the technology and time to learn from it.

    Not saying whether there was cheating or not, just that this isn’t going to be solved with statistics.

    I’d also say these events seem to have been good marketing for chess, since I have now had a friend get me to start playing online. But in the end, the underlying issue is that chess is now a “solved problem”. So it needs to be made more complicated, eg 3-D chess from Star Trek. Funnily enough, if cheating OTB *did occur* somehow, that also reminds me of Star Trek (Kobayashi Maru).

    • “But in the end, the underlying issue is that chess is now a “solved problem”. So it needs to be made more complicated,”

      I disagree with this. Chess and Go (and driving railroad spikes) are plenty hard for people and always will be. So human-only play will continue to be interesting no matter what the machines are doing.

      (As I’ve said before, it’s interesting that computer chess hasn’t had a major impact on chess theory, whereas computer Go has been revolutionary: joseki (local opening tactical sequences) and fuseki (whole-board strategy) have been completely revamped.)

      • Ok, but there will be a never-ending stream of cheating accusations (and some, or even many, will be actual cheating). Once cheating is trivial, it becomes an easy to accept plausible explanation for anyone who doesn’t like the results.

        Making a game hard enough to cheat at is a crucial aspect of designing a good game. That is why in poker you have another player cut the cards, etc.

        • In OTB chess I think it should be possible to prevent cheating, at least in high-level tournaments for which it’s reasonable to devote significant resources to it. Use a highly sensitive metal detector on the players, and play in a Faraday cage.

        • The easiest way I’m thinking to overcome that obstacle is to get access to what the opponent is practicing before the match. Eg, their chess.com (or whatever they use) history. Second is you could poison the data for the ML algorithms they use by having enough training data under your control.

          This can go all the way to having some kind of implant that tracks the position of your fingers/toes/whatever then vibrates or causes some kind of sensation in pulses to tell you the best move.

          If the latter is not possible right now, it won’t be long.

          The cheating is a symptom, not the root problem. Of course friends who trust each other can still play and have fun.

        • Niemann is currently playing in the US Chess Championship in St. Louis, where strong anti-cheating precautions have been taken. Metal detectors, delayed broadcast, etc. Everyone is waiting to see if he does well, or poorly.

        • What Phil said. OTB will remain fine.

          One thing that crossed my mind, though, is should people who have cheated at online chess be excluded from OTB chess? For non-cheaters, cheaters are really irritating. So I get why Magnus has indicated that he’ll refuse to play in OTB tournaments that Niemann is allowed to participate in.

          On the other hand, my (now ex???) favorite YouTube chess channel seems solidly behind Niemann on the grounds that he plays interesting chess and thus it’d be a loss to chess were he excluded. (Also that suspicion of cheating is really dangerous in that people who haven’t actually cheated may be excluded.) This latter bit is an excellent argument (and why the US judicial system is truly beautiful (at least in theory)) but things like chess and music are high stress enough that it’d be nice if the folks who don’t cheat don’t have added stressors.

          Complicating all this is that online chess is way more important than it was 3 years ago, and it looks like it may continue to be (with new Covid variants and the flu expected to be rampant). Sigh. At least here in Japan everyone is still masking.

        • “The easiest way I’m thinking to overcome that obstacle is to get access to what the opponent is practicing before the match. Eg, their chess.com (or whatever they use) history”

          Don’t think anyone would think of that as cheating, unless I’m misunderstanding.

          “Second is you could poison the data for the ML algorithms they use by having enough training data under your control.”

          The strongest chess engines, and the sorts of chess engines that can run on an anal bead, do not use machine learning in the sense of making use of training data. They simply search through a deep lookahead tree on each possible move and evaluate the board positions with an empirically tuned heuristic function, with aggressive use of pruning and heuristics to make it computable.

          “This can go all the way to having some kind of implant that tracks the position of your fingers/toes/whatever then vibrates or causes some kind of sensation in pulses to tell you the best move.

          “If the latter is not possible right now, it won’t be long.”

          This is certainly possible now. The hard part is making it undetectable, which is equivalent to inventing a whole new class of non-silicon processors and non-RF transmission technologies, which I daresay might be a long time.

        • @Somebody

          I assume putting a camera in the player’s hotel room to watch them practice before the match would be cheating. Same with using whatever history they have on their own computer or a server somewhere. Excluding games they meant to be public of course.

          Also, your info on the chess engines is incorrect:

          “Like many others, Stockfish has included neural networks in its code to make even better evaluations of chess positions.”

          https://www.chess.com/terms/chess-engine#stockfish

          I haven’t looked but that step is probably used to select which branches are searched.

          And finally, I don’t think you need a whole new class of microchips to make a cheating implant. It just has to be small enough, ie below the sensitivity of the metal detector. RF isn’t needed at all if you can do the inference on chip. But inputting each move without RF seems like the hardest part. I was imagining an accelerometer in the playing hand.

          If it’s worth millions of dollars, someone will eventually try it.

          Anyway, these are just a few possible implementations of the cheating. Playing whack-a-mole with them doesn’t solve the underlying problem: It is too easy to cheat nowadays.

        • “I assume putting a camera in the player’s hotel room to watch them practice before the match would be cheating. Same with using whatever history they have on their own computer or a server somewhere. Excluding games they meant to be public of course.”

          There are no private game archives on chess.com. All games are public once played, so it wouldn’t be cheating.

          “Also, your info on the chess engines is incorrect:

          “Like many others, Stockfish has included neural networks in its code to make even better evaluations of chess positions.”

          What I said was “in the sense of making use of training data.” Stockfish’s neural network evaluator is trained on the results of its classical evaluator with a high depth search, then iterates. The set of positions for this is typically bootstrapped from a standard dataset, then generated within the program from self-play, and the scores are always generated intrinsically by the program. The ones that lean more heavily on deep neural networks and Monte Carlo tree search don’t use any kind of game history; data are generated entirely from self-play, hence the “zero” in AlphaZero and Leela Chess Zero. You could argue that these still make use of some kind of “training data” that are fed into a neural network, but that’s just semantics. The point is that there’s no data stream you could “poison”.

        • You’re talking about a microchip with onboard accelerometers for tracking finger position, which then processes that accelerometer data to read the game state, runs a chess engine, and outputs haptic feedback at an unambiguously perceptible level without disrupting the accelerometers or otherwise damaging the system, and both the chip and its power source have to be below the sensitivity of their metal detectors and embedded safely inside of someone’s hand.

          “If it’s worth millions of dollars, someone will eventually try it.”

          “Eventually try it” is very different than “will succeed before long”

      • @Somebody

        That is interesting they train it using computer generated data. Even in that case though, it is pretty much standard in ML to play with hyperparameters and keep trying until you get the result you want.

        As for the metal detectors, you didn’t provide any evidence but they don’t respond to RFID tags. So I guess we would have to figure out how much metal it takes to trigger them without too many false positives and how much must be in a device capable of doing the chess inference.

        I doubt it is as implausible as you convey, but maybe you are correct.

        • Passive RFID tags draw power exclusively from induction with the reader; that is to say, there is no battery. Not only would the device you propose need an embedded battery that lasts the duration of the tournament, but it would also need moving electromagnetic parts for haptic feedback, probably an LRA actuator. Quickly googling, the smallest one I can find is this:

          https://www.vybronics.com/coin-vibration-motors/lra/v-g0640001d

          which is 6×4 mm, small, but pretty large to have implanted. All haptic feedback is kind of big though, because to generate an impulse, there needs to be a travel distance. It also draws 58 mA of current. A watch battery will have around 18 mAh, so you’ll need more than a watch battery JUST to power the haptic feedback. On top of that, a watch battery will set off a metal detector.
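          To put a number on that, here is the back-of-the-envelope arithmetic as a sketch. It uses the two figures above (about 18 mAh of capacity, 58 mA of draw) and assumes, unrealistically, that the actuator runs continuously and that nothing else draws power; the function name is just for illustration.

```python
def runtime_minutes(capacity_mah, draw_ma):
    """Minutes of continuous operation: capacity (mAh) / current (mA), in minutes."""
    return capacity_mah / draw_ma * 60


# Figures quoted in the comment: ~18 mAh watch battery, 58 mA actuator.
minutes = runtime_minutes(18, 58)
print(f"{minutes:.1f} minutes of continuous vibration")
```

          That is under 20 minutes of buzzing in total. Short pulses would stretch it, but the processor running the engine has to be powered from the same budget.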

        • Those batteries aren’t designed for this purpose though:
          https://techxplore.com/news/2022-02-world-smallest-battery-power-size.html

          Also look up “metal-free battery” or “polymer-based battery”. And most of the metal is probably in the battery enclosure, which is not required.

          It’s more like an interesting research problem. How much memory and computing power can be packed into a small space, without setting off a metal detector.

          Either way, the cheating accusations will get more and more frequent until people get fed up with competitive chess. It is simply too easy to cheat; bandaids like metal detectors only delay the inevitable. The game will need to be made more complicated in the future.

        • Making the game more complicated won’t help at all: computers will still be better than any human. Go is much harder than chess but computers rule there too.

        • The problems are:

          1. Processors that small, which fit into a sufficiently small TDP envelope, and which can run a chess engine, do not exist. You could *maybe* make a chess engine ASIC with a few years of engineering and a top of the line fab that meets these requirements, but a general purpose processor running a multitasking operating system and Stockfish on that—no dice.
          2. The complexity of a chess program means that you can have zero power surges or outages. As soon as the power cuts, you’re going to flush your memory. The computers in your link are single purpose sensors.
          3. The actuator must provide enough perceptible force in response to a computer for a cheater to take action. That means some part of it must be electromagnetically sensitive, and it must draw a minimum amount of power many times 100 mW just by the laws of physics. The smaller the electromagnetically sensitive moving part, the larger the required power draw.

          All of these are nontrivial problems, guaranteed to require custom hardware, not off the shelf parts even 10 years from now. You’re pointing at active research frontiers, the most optimistic of which are nowhere near the minimum requirements, and saying that it’s “too easy”. The world of speculation is easy, but actual machines are hard. Yes, there are millions of dollars at stake, but a potential cheater would have to find an R&D team capable of solving these hard problems, small enough to keep quiet, and pay them more than they would make working on similar problems elsewhere to more prestigious and scrupulous ends.

          Building such a device would be a lot more difficult than some clever social engineering. Pay a doctor to claim you were deafened by an accident, get a cochlear implant with some custom modifications.

          Or, inherent to all competitive games, just spike your opponent’s drink. Get ‘em tipsy enough to lose to you. This is going to be possible no matter what.

          What you want is the expected value of cheating to be lower than the expected value of not cheating. One way is to make cheating hard and expensive, which metal detectors do an okay job at. The other side is making the consequences of being caught extremely dire.

        • “Either way, the cheating accusations will get more and more frequent until people get fed up with competitive chess.”

          In 2017 and 2018, the Houston Astros won at the highest level of baseball competition in the world by cheating. They were also decisively caught—their methods were not very sophisticated, amounting to basically planting a camera in center field with a live feed to their dugout. They kept their 2017 World Series title and there were no consequences whatsoever for the players involved.

          Cheating was easy, and had no consequences, and people still watch baseball. People are mysterious.

        • Well, I stopped caring about professional sports after Shaq was on his third team. He was clearly “franchise player” material. The lack of loyalty between players and teams made me decide it made no sense to be loyal to a team. Then I watched a game last year and everyone was basically a small forward. This optimization made the game less entertaining than how it used to be.

          But it will be a slow decline in interest of course. The scandals first increase interest, until people get bored of the scandals too. Then if there is another option that satisfies the same needs/wants they will devote their attention to that instead.

          I’m not particularly good at chess and never followed the competitions, that is just human nature.

    • Also, I watched this video:

      https://www.youtube.com/watch?v=DCeJrItfQqw

      In the interview, he essentially attributes his success to having more prior knowledge about how the (better-known and more experienced) opponent plays. He also rejects the computer-determined win frequencies at one point.

      Perhaps this was a victory of Bayesian reasoning over frequentist reasoning. I have noticed that I regularly do “impossible” things according to some people who use NHST* reasoning on this blog. It seems like magic to them.

      But also, I don’t think it would be that difficult for him to cheat.

      *which is not equivalent to frequentist

      • There are some chess openings for which known lines go on forever, by which I mean fifteen moves or more. Some of these are known draws, and if a grandmaster goes down one of them without trying a novelty then, well, they’re playing for a ‘grandmaster draw’: occasionally both players have an interest in a draw in which they don’t have to exert themselves, so they just play through the moves until they’re at a position that is rather dead, and then one player offers a draw and the other accepts. Indeed, sometimes the draw is offered so early that it’s considered unseemly: it’s obvious to everyone, even hoi polloi, that the players weren’t trying to win. That’s considered uncool, especially by the tournament organizers who are trying to ensure that the games are worth watching for at least some audience.

        It is also routine for players to prepare by studying the openings favored by their opponent. This preparation can be very deep if there’s time for it. A player might study literally every OTB game his/her opponent has played in a tournament in the past five years, often going through them with the help of a chess engine to look for improvements, as well as looking at what is right or wrong with other moves that they or their opponent might make, searching for a ‘novelty’ (a move that has never been seen before in tournament play) that leads to a juncture where the opponent might make a mistake. If you are playing through moves that you have prepared this way, you are said to be “still in your prep” and of course it’s a big advantage if you are in your prep and your opponent is not. When Carlsen quit the tournament after losing to Niemann, and didn’t say why, there was some speculation that Carlsen didn’t suspect Niemann of using an engine but rather suspected him of somehow knowing what lines Carlsen had prepared; presumably this would have had to have been leaked to Niemann by someone in Carlsen’s camp, which seemed very unlikely to just about everyone, but there was also nothing about the game itself that led even strong grandmasters to think it stood out as an example of likely cheating. So, yeah, it’s possible for prep to go very deep and for a player to surprise another player with a prepared line. Usually this involves deliberately playing a move that is _not_ one of the computer’s favorite moves, because the opponent will have studied the favorite moves. You need to deviate by finding a line that is not optimal according to the computer but for which your opponent is unlikely to find the refutation. Some positions are easier for humans to play than others, even if a computer thinks they’re equal or even that the hard-to-play position is better. 
        Two of my favorite chess players, Judit Polgar and Mikhail Tal, are famous for sometimes playing “unsound” moves that put their opponents in hard-to-play positions. Tal once said “You must take your opponent into a deep dark forest where 2+2=5, and the path leading out is only wide enough for one.”

        So, yeah, it’s always possible that Carlsen walked into Niemann’s opening prep. But: Carlsen had never played this line before, even online, so it would be pretty amazing if Niemann had prepared it.

        But also, related to the general issue of preparation: most grandmaster games depart from previous openings surprisingly quickly. There’s a popular maker of chess videos who goes by the name “Agadmator” on YouTube, where he plays through games and points out subtleties and complications and so on. When he reaches the point that the first game departs from any previous game in the grandmaster database, he says “and it is at this point that we have a brand new game.” Although this might not happen until move 18 or something in an extreme case, it often happens when each player has only made 12 moves or so. There are a lot of known variations but the combinatorial explosion is such that you can play moves that look reasonable to a grandmaster and still find something brand new when the game is young.

        Just for laughs I looked at one of my own recent online games. I was playing black, here’s the game:
        1.e4 d6 2.d4 Nf6 3.Bd3 e5 4.d5 Nbd7 5.c4 Nc5 6.Qc2 Be7… and there we go, in the recorded history of master-level chess games we are already in a new position. According to the engine Stockfish, searching with a depth of 23 plies, White has about a 30 centipawn advantage, which is just slightly less than the advantage White has at the beginning of the game, so it’s not like either player has screwed up yet. Oh, actually I can search all of the millions of online games played on Lichess and Chess.com in the past two years or so and the position has never occurred in any of those games, either! (Before I played Be7 the position had occurred once in master-level play or above, twice on chess.com, and three times on lichess.) According to the engine, 6. Qc2 is not a terrible move but it’s not something good human players ever play, or even decent human players really. That move is the big filter: there are thousands of games that are identical to this one through move 5.

        My point with all of this is: we are a long, long, long way from the point where a lesser player can expect to beat a superior player through opening prep. It’s not that it can never happen, and indeed it does occasionally happen, but it’s rare. At some point, often quite early, a player chooses from among seven or eight candidate moves that are roughly equally attractive, and the opponent does the same, and voila they are in a position that nobody has ever seen on the board. A couple more moves and they’re in a position nobody has ever seen in their prep either.
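
To put rough numbers on that combinatorial explosion, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each side has exactly seven roughly equal candidate moves at every ply (half-move) and it ignores transpositions, so it overstates the count somewhat; `distinct_lines` is a made-up helper name.

```python
def distinct_lines(candidates_per_ply: int, plies: int) -> int:
    """Number of move sequences if every ply offers the same number of candidates."""
    return candidates_per_ply ** plies


# Seven candidates per ply, for 1, 3 and 5 full moves (2, 6 and 10 plies).
for plies in (2, 6, 10):
    print(f"{plies:2d} plies: {distinct_lines(7, plies):,} lines")
```

Even under these simplified assumptions, five full moves of free choice give hundreds of millions of lines, far more than anyone can prepare in depth.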

        So, yeah, to your point: it’s impossible to say from this game that Niemann was cheating, just because he played with engine-like precision for many moves. At the level he is _supposedly_ capable of playing, we do see that sometimes; and also, even if he was not cheating and is not capable of playing at that level it is _possible_ that he got very lucky with his opening prep. Without catching someone with a phone in his hand (or his shoe) it’s always going to be hard to prove cheating. But catching a world champion based on opening prep is extremely unlikely.

        • Chess is a really intense game.

          In the very last tournament I played, there was (at least but probably not more than) one woman player, and during a break I noticed her working on the Center Counter Game as black (1. e4 d5 2. exd5 Qxd5) with her (older, male) trainer. It was then (and still may be) theoretically inferior for black, but it’s still played occasionally. A couple of rounds later, I get paired against her with me as white. I _only_ play 1. e4. I had never studied any other openings as white (nor the center counter game, oops). But, hey, it’s theoretically OK. So I play 1. e4. And hold on through the middle game to a position with all the pawns blocked. But I’m exhausted, so I offer a draw, which is accepted. Whew, I survived a trap. But then her trainer comes along and we start going over the game. I mentioned that I saw them working on it, and I get screamed at “You’re an idiot for walking into it.” Then we’re crunching along, and we get to the point where I offered a draw. It turns out that my queen could have been brought around to the other side of the board and made it to the back ranks. Kewl. Thanks. I didn’t understand that idea. “Next time I see a similar position, I’ll not resign so quickly.” And I get called an idiot again because they thought I meant the exact same position. Sheesh, calm down guys, it’s just a game.

          (For Andrew G, since these were taken at Columbia (spring 1972))
          Like they say:
          The thrill of victory. https://pbase.com/davidjl/image/170678416
          The agony of defeat. https://pbase.com/davidjl/image/170678418

        • Carlsen had never played this line before, even online, so it would be pretty amazing if Niemann had prepared it.

          He surely played it privately at some point. And, these days, would have plugged it into a computer to check the consequences. That is the weak point I would focus on.

          I mean, it’s even possible that neither player cheats. A third party (gambler?) hacks both PCs then subtly influences what their preferred player practices based on what the opponent is practicing.

      • Anoneuoid –

        > I have noticed that I regularly do “impossible” things according to some people who use NHST* reasoning on this blog. It seems like magic to them.

        Funny.

        I’m curious which people using “NHST reasoning” you’re referring to there, and which things you “regularly do” that they think are impossible?

        Surely, you’re not referring to this:

        https://statmodeling.stat.columbia.edu/2021/09/16/wanna-bet-a-covid-19-example/#comment-2025148

        Although while I never thought doing what you did there would be impossible – being that wrong is a fairly difficult feat to achieve.

    • The measure of improvement that is being used relates to ELO increases. A player’s ELO changes as a function of their performance against other players, taking into account their strength. As such, this is a measure which, in a sense, inherently normalises across time to assess each player relative to their own competition. In other words, though you are correct that new technology makes it easier to play chess in an absolute sense (closer to a theoretic optimum), it does not make it easier to have a rapid rise in ELO.

      You are right though that statistics will only ever be suggestive here, not conclusive. But they certainly are suggestive.
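
For readers who have not seen how a rating moves with results, here is a minimal sketch of the standard logistic Elo update. The K-factor of 20 is an assumption for illustration (federations use different values for different players), and real federation rules add details this sketch leaves out.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def updated_rating(rating: float, opponent: float, score: float, k: float = 20.0) -> float:
    """New rating after one game; score is 1 for a win, 0.5 for a draw, 0 for a loss.

    k=20 is an assumed K-factor, not a claim about any particular federation.
    """
    return rating + k * (score - expected_score(rating, opponent))


# A 2400 player beating a 2600 player gains far more than for beating a 2200 player.
gain_vs_stronger = updated_rating(2400, 2600, 1.0) - 2400
gain_vs_weaker = updated_rating(2400, 2200, 1.0) - 2400
print(round(gain_vs_stronger, 1), round(gain_vs_weaker, 1))
```

Because the gain depends only on the result and the rating gap to the opponent, the update is relative to the current field rather than to any absolute standard of play, which is the normalisation described above.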

      • In other words, though you are correct that new technology makes it easier to play chess in an absolute sense (closer to a theoretic optimum), it does not make it easier to have a rapid rise in ELO.

        Why would you assume people will improve following the same distribution today as they did in the past, while also acknowledging the way they learn is different? These are samples from very different populations.

    • ‘But in the end, the underlying issue is that chess is now a “solved problem”.’

      This is totally incorrect. Checkers is solved, tic-tac-toe is solved, chess is not solved.

        • It seems the people running computer chess competitions are already doing what I suggested:

          Q: How long can you maintain your targeted 65-80% draw-rate in the Superfinal when engines are continually improving?

          A: Draw abatement will definitely be more challenging as the competitors increase in strength in the years ahead. A mental comparison between today’s leading chess programs and those of ten years ago, combined with extrapolation into the future, points to a time where something will have to give. On the other hand, as long as neural nets and the best alpha-beta engines remain competitive with each other their contrasting play-styles may help keep draw-rates lower than they otherwise might have been.

          https://tcec-chess.com/articles/TCEC_Openings_FAQ.html

        • If you mean “making the game more complicated,” sort of? The rules of the game are the same and the openings are reachable positions.

        • If you mean “making the game more complicated,” sort of? The rules of the game are the same and the openings are reachable positions.

          From what I read, the engines will eventually start choosing Queen’s Gambit as white every time. The remaining problem to be solved is the response by black. But it is still possible that white can’t be beat if playing perfectly with that opening.

          I guess another approach could be start measuring how many moves until a win/draw to determine skill.

  4. ”But either Niemann improved faster than any other chess player of all time, or he cheated.”

    This idea has been pointed out to be problematical because Niemann’s rating when he started was lower than the people he’s being compared to when they were first rated. So a big component of that improvement could have been an inaccurate rating. It seems Niemann’s rating at the end of that time was amazing, but not unknown.

    The chess.com folks/paper seem pretty clear that they don’t think that he cheated in OTB chess, i.e. that none of their computer results vs. moves actually played statistics indicate cheating in his OTB games. For example, Magnus Carlsen has a far higher ratio of engine recommended moves than Niemann.
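
A "ratio of engine recommended moves" can be computed along the following lines. This is a toy sketch, not the method used by Chess.com or anyone else: the function name and the two move lists are invented, and a real analysis would also need the engine's evaluations, the difficulty of each position, and a comparison against players of similar strength.

```python
def engine_match_rate(played: list[str], engine_top: list[str]) -> float:
    """Fraction of played moves that equal the engine's first choice.

    Both lists must cover the same positions, in the same order.
    """
    if len(played) != len(engine_top):
        raise ValueError("move lists must have the same length")
    if not played:
        return 0.0
    matches = sum(p == e for p, e in zip(played, engine_top))
    return matches / len(played)


# Hypothetical data: five moves by one player and the engine's top choice in each position.
played_moves = ["e4", "Nf3", "Bb5", "O-O", "d4"]
engine_moves = ["e4", "Nf3", "Bc4", "O-O", "c3"]
print(engine_match_rate(played_moves, engine_moves))
```

A high match rate on its own says little, since the strongest players agree with the engine often simply because they play well.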

    On the other hand, the browser behavior stuff is easy to get around: just play chess on your laptop and have an assistant run an engine on your desktop and report moves.

    • > The chess.com folks/paper seem pretty clear that they don’t think that he cheated in OTB chess,

      No, they are clear that they didn’t *detect* him cheating in OTB chess, because their methods do not work for this. Absence of evidence isn’t evidence of absence.

  5. “So a big component of that improvement could have been an inaccurate rating.”… I suppose this depends on what you mean by “inaccurate.” I don’t think anyone has suggested that his rating was mis-calculated when he was younger. He had some string of wins, losses, and draws against opponents with various ratings, and the result was a certain rating when he was 11 years old (for example). For players who have played enough games, ratings are highly correlated with ‘objective’ level of play as determined by the computer, so it’s clear that when he was 11 he had an ability that was extremely high compared to most chess players…but very low compared to the ratings of other top-rated players when _they_ were 11 years old.

    Another way to look at it: he’s at the elite level at age 19, and he climbed faster than anyone else in the period from age 11 to age 19, so, yes I agree with you, it’s a mathematical necessity that he had a lower rating at age 11 than those other elite players did. I’m not sure why this would make this line of reasoning “problematical.” Nobody else has reached as high a level at age 19 after being at such a _relatively_ low level at age 11. Somebody has to be first on this metric, maybe it’s him, could be…but since we know with near certainty that the guy has cheated like crazy in online events I think it makes sense to be very, very suspicious of his OTB play.

    You say “The chess.com folks/paper seem pretty clear that they don’t think that he cheated in OTB chess, i.e. that none of their computer results vs. moves actually played statistics indicate cheating in his OTB games.” That’s not true at all. They do not say they don’t think he cheated in OTB chess. They say (among other things) “…we believe certain aspects of the September 4 game were suspicious, and Hans’ explanation of his win post-event added to our suspicion. As to his OTB play more generally, in Section VII below we discuss what we believe are apparent anomalies in Hans’ rise in OTB rating.” What they say is that they don’t have enough evidence about his OTB play to be sure that he was cheating. That’s very different from saying they don’t think he cheated.

    • “I suppose this depends on what you mean by “inaccurate.””

      Yes. The last time I checked, rating calculations limit delta ratings to 200 points. (That is, you get no rating points, or only a tiny number, for beating a much weaker player.) So if you win all your games, and your opponents are all around 1600, your rating will be around 1800. No matter how many games, and even if you are playing 2100 against your PC. And 4th and 5th graders often have trouble finding anyone strong to play. My guess here is that Niemann has played against engines a lot more than other players have, and so his odd rating increase patterns are not necessarily odd. But this article disagrees with me.

      https://en.chessbase.com/post/tracking-a-player-s-progress

      You wrote: “That’s very different from saying they don’t think he cheated.”

      Well, Chess.com wrote: “Chess.com is unaware of any concrete evidence proving that Hans is cheating over the board or has ever cheated over the board.”

      “Nevertheless, and to be clear, it is not our position that Hans should be limited or banned from OTB chess.”

      It sounds to me that Chess.com is being very careful to NOT say that they think he cheated, which is, admittedly, different from saying that they don’t think he cheated. But they are saying that nothing that has been said rises above supposition, so they also CANNOT say (for logical as well as legal reasons) that they think he cheated.

      Whatever. I remain of the opinion that it’s unlikely that Niemann cheating over the board has a lot to do with his overall performance, since if he wasn’t a 2700 chess player, he’d have to cheat in every game. And that sounds unlikely.

      • “Whatever. I remain of the opinion that it’s unlikely that Niemann cheating over the board has a lot to do with his overall performance, since if he wasn’t a 2700 chess player, he’d have to cheat in every game. And that sounds unlikely.”

        It sounds like you think that, as long as Niemann is good enough to hang with great players, we should not worry about whether or not he gains a competitive advantage by cheating. That would be a surprising opinion. Most cases of cheating in high-level sporting competitions still require the cheat to be in the echelon of world class performers. Lance Armstrong was, quite clearly, a world-class cyclist. He was also a cheat. The relevant question when it comes to cheating is whether or not somebody is doing it to gain an advantage, not the proportion of somebody’s overall ability that is attributable to the cheating. Hans Niemann may be a great player, but he might also be a cheat.

        Or did I misunderstand you?

        • You misunderstand me: what I was saying there was that my best guess is that Niemann is a talented chess player.

          I don’t condone or accept cheating. At all. Niemann should be banned from OTB chess if he cheated at it, even once. But I’m also not happy with calling cheating without proof.

          And you seem to have missed my comment in another note to the effect that I think it would not be completely unreasonable to ban Niemann from OTB chess for cheating at online chess.

      • David,
        Yes, Chess.com is being very careful not to say they think he cheated. But that’s very different from your claim that they think he did not cheat.

        Here are two statements:
        1. “We are not sure Niemann cheated in OTB play.”
        2. “We are sure Niemann did not cheat in OTB play.”

        Chess.com is essentially making statement 1 but you said earlier that they were essentially making statement 2.

  6. A minor correction: The original post says “In both of those cases, Carlsen and Niemann were playing “over the board” or “OTB”, i.e. sitting across a chess board from each other and moving the pieces by hand. That’s in contrast to “online” chess, in which players compete by moving pieces on a virtual board.” The second tournament was hosted online with competitors connecting over Microsoft Teams. Video available on YouTube showing Carlsen disconnecting from Teams will quickly corroborate this fact.

    A recommended correction is to clarify the second tournament was an online tournament with anti-cheat measures in place. Sections 4-5 of the tournament format (https://chess24.com/tour/regulations/) provide vague regulations for supervision of participants in the tournament. The basic requirements are for participants to use a specified browser and a specified video connection. Arbiters may be physically or virtually present to monitor the players.

  7. Phil:

    The main message of your post seems to be that anybody can cheat in OTB by just sneaking a microwave oven into the tournament. But couldn’t this be checked with some sort of microwave detector? Not only that; I think there’s a risk that the spectators would hear the humming of the oven, and certainly they’d notice the bell when the food is ready, right? This reminds me of the crime stories where the killer times the shot so that it happens the exact same time as some scheduled firework going off.

    Also, I just thought of one more thing: Even assuming you could sneak in the microwave oven and nobody would notice it, you’d still have to plug it in. If your table isn’t right next to a wall outlet, you’re screwed. So maybe we could detect cheaters by looking at their seating positions at OTB tournaments and seeing if they’re seated next to the wall at a rate that’s statistically significantly greater than chance. Another way to detect would be if they are eating warm food in the middlegame or endgame. Any food they brought into the tournament would’ve cooled by then, but with the microwave oven . . .

  8. It really would help if someone could point to any historical incidence of OTB cheating by electronic signal and how it was done, or even some way that it could be done. Otherwise I am very dubious. Even if you have a sex toy on you, how do you receive a signal to move your bishop to B5? Morse code?

    There needs to be means, motive and opportunity. The only thing we have the slightest clue about is motive.

  9. Well, I guess it was just a matter of time before the Niemann cheating maybe-scandal arrived at this site. It’s a great whodunit (or didanyonedunit), right up our alley. And a lot of us play chess or used to.

    Me: I was a moderately successful tournament player (USCF master) when I was young, and a significantly stronger analyst. A number of things held me back as a player, the most important being my inability to prevent blunders, which seemed to occur randomly. They were like blackouts. In one game I hallucinated a rook. But I was pretty good at figuring out other people’s games, even at a GM level.

    Anyway, I think attempts to resolve this question by looking at the correspondence between Niemann’s OTB moves and engine suggestions are a dead end. This specifically applies to Ken Regan’s approach. Some issues, like the use of engines for only some key moves, can be adjusted for, but the killer is that commercial engines, which are still far stronger than humans, can be set at weaker levels for training purposes. Even noncommercial ones could be “trained down” or otherwise crippled, I suspect. Someone who doesn’t want to get caught and knows that Regan-type methods are used to detect cheating would definitely do this. (I’m reminded of the guy, many years ago, who ripped off an item from a big box store, discovered it was defective, and returned it to the same store for an exchange. This actually happened. Such people should be busted for being stupid as well as dishonest.)

    More to the point, in my opinion, is the evidence displayed at the Sigma Society website regarding the performance disparity between tournaments Niemann played in using digital boards that convey moves as they are played versus traditional boards, where players keep a scoresheet; see

    https://www.sigmasociety.net/news/eng-niemannxcarlsen?fbclid=IwAR2QkA8hkljpDzegMgyVNaSBVpdtrwp57jgMBTptFeWspwwYvm4VbT7FTB4

    I would ignore the nonparametric tests the author throws at the data; there are too few observations. Still, the pattern screams out. But are the data correct? I don’t know and would like a second opinion. Also, the visualization is poor, since no attention was given to the x-axis. An explicit timeline would help.

    My guess is that, if Niemann is cheating OTB, he has a confederate. He’s apparently quite a loner, so that should narrow the field. Going forward, I’d like to see high-level, high-stakes tournaments take place in shielded environments, which would probably mean no spectators on location beyond a few press. That’s no loss, however, since nearly all viewing these days is online.

    • Agreed, statistical methods of catching cheaters by comparing to computer play are never going to be infallible. They would work against a relatively dumb or unsophisticated cheater — not necessarily easy to come by among top grandmasters, and you would have to legitimately be a top grandmaster in order to pose as one. Someone earlier in the comments mentioned Lance Armstrong as an example of a top performer who cheated and I think that’s a decent analogy here. Perhaps the most surprising thing about the whole scandal _should_ be the strength of the evidence that Niemann cheated in all those online tournaments!

      As several grandmasters have pointed out, if you want to not get caught you wouldn’t play the engine moves all the time, you would just use the engine to stop you from playing bad moves. Ignore the engine unless you input your intended move and get a warning that it’s bad. And go ahead and make a bad move every now and then; you’d want to calibrate your level of play so that you still have a reasonable number of mistakes and losses. And, especially, don’t run the chess engine on the same computer where you’re playing the game, enabling a comparison of your level of play after you do and don’t click away from the window! Perhaps the level of ineptitude Niemann displayed in cheating online is a sort of evidence that he didn’t cheat OTB: could such a lousy cheater hope to get away with cheating OTB where it is presumably much, much harder?

      On the other hand, if he’s so great OTB then why did he need to cheat online?

      And I agree with this: “My guess is that, if Niemann is cheating OTB, he has a confederate. He’s apparently quite a loner, so that should narrow the field. Going forward, I’d like to see high-level, high-stakes tournaments take place in shielded environments, which would probably mean no spectators on location beyond a few press. That’s no loss, however, since nearly all viewing these days is online.”

      • Yes, he could just use the engine to avoid blunders. But how? Everyone is watching him. There are metal detectors. Broadcast is delayed. If Niemann is somehow cheating and getting away with it under all that scrutiny, he must be very clever, have some help, have very good equipment, etc. No one can figure out how he might be doing it.

        • Shoe computers have been used to cheat at chess in the past (and roulette, believe it or not!). You need a way to enter the moves and to receive a response. You’d want a way to re-do a mistaken entry, too. Chess is on an 8×8 grid, it’s not hard to think of an easy way for binary inputs to be used for this. And if all you want is for the computer to tell you that a move is bad, receiving the output is even easier.

          But I think they check the shoes nowadays. I’m not sure they check them well enough, though: the computer could be really tiny. People have run a chess engine on a computer with the area of a postage stamp and the width of a nickel. And there doesn’t have to be much metal in it, either.
