
(SeaPRwire) – Large language models (LLMs) are surprisingly bad at chess.
Despite this, as a three-time National Chess Champion and two-time U.S. Women’s Chess Champion, I enjoy playing against them. I do not do it for a real challenge, but to understand what these interactions reveal about human nature.
Engaging in chess with LLMs has provided insights into the unique creativity and diversity of humans, our susceptibility to flattery and sycophancy, and how artificial intelligence is starting to influence human behavior.
LLMs are not actually designed to excel at chess. Their primary function is to predict the most probable next word and to flatter the user. Consequently, AI-driven chess programs are not attempting to defeat you; they aim to keep you engaged. However, we can derive valuable lessons from their surprisingly poor gameplay, lessons that extend beyond the board and the tokens.
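To see why a pure next-token objective produces this behavior, here is a minimal sketch (a toy frequency model, not a real LLM, with a hypothetical made-up corpus): a predictor trained only to output the most common continuation will always suggest the most popular move, regardless of whether it is the strongest one.

```python
from collections import Counter

# Toy illustration: a frequency-based "next move" predictor.
# The corpus below is a hypothetical sample of opening sequences.
corpus = [
    ["e4", "c5"],   # Sicilian Defense
    ["e4", "c5"],
    ["e4", "c5"],
    ["e4", "e5"],   # Open Game
    ["d4", "d5"],   # Queen's Pawn Game
]

def predict_next(history):
    """Return the most frequent continuation of `history` in the corpus."""
    continuations = Counter(
        seq[len(history)]
        for seq in corpus
        if seq[: len(history)] == history and len(seq) > len(history)
    )
    move, _count = continuations.most_common(1)[0]
    return move

# The model answers 1.e4 with the modal reply in its data,
# not necessarily the objectively best move.
print(predict_next(["e4"]))
```

The point of the sketch is only that "most probable" and "best" are different objectives; an engine like Stockfish optimizes the latter, while a language model optimizes the former.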
Superhuman chess programs, such as IBM’s Deep Blue, which defeated Garry Kasparov nearly three decades ago, or DeepMind’s “AlphaZero,” can consistently outplay any human. However, most people no longer play against these top-tier computers because the outcome is predetermined. While repeated defeats offer limited learning opportunities, experimenting with LLMs can be an exhilarating experience.
When I first challenged GPT-4 to a game, it performed adequately, but I secured a strong position after 15 moves and captured a knight. As my advantage grew, the AI hallucinated a phantom piece to capture my queen. In essence, it cheated! Initially, this seemed puzzling, given that off-the-shelf LLMs are typically associated with sycophancy rather than theft.
Therefore, I began playing my worst possible moves against ChatGPT. It bent the rules once more, but this time in my favor, replacing the pieces I had blundered with phantom pieces. Regardless of whether I played better or worse than the AI, it adjusted the board to make us equal. While it wasn’t always cheating, it was always confabulating. Humans confabulate by filling memory gaps with logical sequences, and ChatGPT was doing the same.
I have observed that LLM hallucinations are more frequent when attempting “long moves” that span the entire board, which mirrors their difficulty with extended conversations.
During a tournament hosted by Google featuring top LLMs, 42 out of 47 games utilized the Sicilian Defense, a strategy favored by Bobby Fischer and the fictional Beth Harmon from *The Queen’s Gambit*. Why this preference? Because it is the most common opening. Recent research by DeepMind found a similar phenomenon when attempting to generate creative, aesthetically pleasing, and counterintuitive chess positions. The researchers discovered that AI tends to “collapse” into repetitive themes and patterns it deems “beautiful.”
In DeepMind’s chess beauty program, researchers managed to reduce this repetition by explicitly programming for greater diversity. However, even with extensive training data, probabilistic outputs, and diversity filters, it remains difficult to replicate the vast variation and range of human thought.
It is worth noting that LLMs and AI, in general, are not the only technologies struggling to capture the diversity of the human experience. Consider the algorithmic, winner-take-all dynamics of social media, where conforming to the average user’s preferences generates more clicks, attention, and revenue. To avoid succumbing to a monolithic voice and monoculture, we must actively seek diversity in our sources, prompts, and inputs. As Haruki Murakami noted, “If you only read the books that everyone else is reading, you can only think what everyone else is thinking.”
Much like chess engines, LLMs will continue to improve, necessitating our preparation for that future. Chess has grappled with maintaining fairness against superhuman AI for decades. Although electronic devices have been banned in competitions for a long time, this has not prevented cheating from disrupting the sport.
In what is arguably the most significant chess cheating scandal to date, the world number one, Magnus Carlsen, lost to 19-year-old Grandmaster Hans Niemann in 2022. Carlsen withdrew from the tournament, and it was later revealed that Niemann had cheated in previous online games. Although there was no evidence to suggest Niemann cheated against Carlsen, bizarre theories went viral, including one suggesting anal beads were used to transmit moves via AI. Since then, live broadcasts have introduced time delays and increased surveillance. Despite these measures, accusations and scandals remain frequent; some are valid, while others lack evidence but are amplified by drama-seeking social media algorithms and heightened fears of AI-based cheating.
This situation teaches us that developing advanced cheating detection tools will be insufficient in an AI-driven future. Instead, we must cultivate trust and integrity within our communities, a task that AI cannot accomplish for us.
It also reminds us that we cannot be naive regarding the complexities of our AI-driven future; rather, we must find constructive ways to utilize AI.
Chess players have become adept at calibrating AI usage for training and preparation, reviewing our own games and those of opponents. The ideal approach is to expand and refine our move options without losing the ability to think independently. I prefer the “sandwich method”: I start with my own analysis (the bread), consult the AI’s perspective (the tuna fish), and then return to my own reasoning to synthesize the takeaways.
LLMs have a dual nature: they can enhance our sharpness and intelligence, or they can make us duller and more average, dependent on a computer for every thought. By playing chess against LLMs, we can better understand their strengths and limitations as coaches or confidantes, allowing us to know when to say, “Goodnight Gemini.”
This article is provided by a third-party content provider. SeaPRwire (https://www.seaprwire.com/) makes no warranties or representations regarding its content.