AI, poker, and mind games, with Max Chiswick

Patrick McKenzie Nov 21st, 2024

Poker is a fascinating, multi-layered game, and solving it with AI is challenging.

This week on Complex Systems I'm joined by Max Chiswick. He was previously a professional poker player and has also done substantial academic and practical work on poker AI. I am an occasional player of the game (and did a podcast about this once) and thought I'd talk about it, some of the epiphenomena of the game, and what is new and exciting in poker AI.

[Patrick notes: I put after-the-conversation thoughts throughout the transcript, set out in this format. As long as I'm dropping you a note: guest and friend-of-the-program Byrne Hobart's book Boom just launched through Stripe Press. You can buy it at Amazon or wherever you buy books.]

[Patrick notes, in mid-January 2025: Max passed away a few weeks after this interview. May his memory be a blessing.]

Sponsor

This podcast is sponsored by Check, the leading payroll infrastructure provider and pioneer of embedded payroll. Check makes it easy for any SaaS platform to build a payroll business, and already powers 60+ popular platforms. Head to checkhq.com/complex and tell them patio11 sent you.

Timestamps

(00:00) Intro
(00:26) Max's background and journey into poker
(03:45) The credit card rewards game tangent
(06:12) Why poker matters: reasoning and decision-making
(07:49) The problem areas in the poker AI space
(09:38) Poker as an assistive technology for reasoning
(10:59) Online poker history
(16:14) Understanding multitabling
(21:14) Casino economics and gambling regulation
(22:55) Sponsor: Check
(26:32) PokerStars VIP program and professional incentives
(29:47) Playing a million hands in a month
(37:26) AI poker history and counterfactual regret minimization
(43:35) Poker complexity
(45:01) The impact of solvers on modern poker
(45:52) Understanding poker game theory and decision trees
(49:26) Recent developments in poker AI education
(50:27) Teaching programmers to build poker bots
(53:05) Wrap up and where to learn more

Transcript

Patrick McKenzie: Hideho everybody. My name is Patrick McKenzie, better known as patio11 on the Internets. I'm here with my buddy Max Chiswick.

Max Chiswick: Hi, Patrick. Great to be here.

Patrick McKenzie: Thanks very much for coming. Probably the most common thing I say at this point is that “You've had a long and winding career.” Why don't you tell people a little bit about it?

Max's background and journey into poker

Max Chiswick: I started playing poker quite a bit with friends in high school and was known for always having a deck of cards in my pocket. During college, I kept playing a lot of poker. Once I turned 18, I started playing online, which was during the big poker boom of the early 2000s.

I graduated from college in 2008, which coincided with both the financial crisis and the poker boom. That combination made the decision to play poker professionally full-time pretty easy for me. I played online poker, mostly on PokerStars, for about five years.

After that, I went to study poker AI as part of a master’s program at Technion in Israel. This was during the rise of AI in poker. While most of the significant research at the time was being done by Carnegie Mellon and the University of Alberta, I worked on a smaller project related to poker AI.

Later, I spent some time thinking about building educational tools around poker AI. One of my first projects was an AI poker tutorial that helped people understand AI through the lens of poker. It also included building simple poker bots for simplified games.

This year, I partnered with Ross to create educational programs, including an in-person event series called Poker Camp. We’ve run a couple of those sessions this year.

Patrick McKenzie: So, just give the audience the pitch for caring about poker. One reason is that when you take a large number of high-variance individuals and funnel them through the internet, the number of poker players you end up with seems statistically improbable.

I play poker recreationally, and I think it’s one of those weirdly attractive states that, quote-unquote, "people like us" often gravitate toward. So, I want to unpack that a bit. Partly to recognize the phenomenon, partly to highlight what it is about poker that pushes certain buttons for people like us, and partly as a bit of a warning. Not every attractive state people end up in is necessarily a good one to focus on.

The credit card rewards game tangent

Now, before we started recording, you mentioned that you were also deep into the credit card rewards game for a while, right?

Max Chiswick: That’s right. Yeah.

Patrick McKenzie: So, I’ll give my usual disclaimer about the credit card rewards game—and more broadly about anything financially motivated.

For context, I previously worked at Stripe, which is a credit card processing company. Stripe’s two largest expenses are: first, employees like me—or, well, former employees at this point—but they pay for smart people. And second, they pay money to banks in the form of interchange fees. Banks, in turn, convert much of that interchange into reward points for consumers. That’s the economic engine driving the credit card ecosystem, particularly in the United States.

A lot of people put a tremendous amount of time and mental effort into optimizing their economic lives for credit card points. Here’s the thing: it is very rarely the case that you can achieve a true economic edge over the financial system by simply understanding the credit card game better than they do. Sure, people spend countless hours trying to manufacture that edge, but most of the time, they’re essentially making themselves into poorly compensated members of a financial industry marketing department.

Speaking from experience—as someone who has worked in a financial industry marketing department—there are returns to specialization, intelligence, and effort in that game. But if you’re going to spend years of your life signing people up for credit cards, you’re much better off just getting an employee badge. You’ll get paid significantly more. [Patrick notes: I’d note that there is nothing particularly bad with this career choice, which I say simply because many people don’t think someone could honestly recommend working in the marketing department of a large credit card issuer. It’s an honest living! Many of the consumers are genuinely helped by the products you offer! This is disproportionately true for your most lucrative consumers, who—as you will swiftly learn if you are presently ignorant of this, like most civilians and regulators—are rich and sophisticated!]

That’s my exhortation—or maybe warning—for people wired similarly to me.

[Patrick notes: A narrow warning for a particular class of people who are also near and dear to my heart: if you run a small software business, burn your brainsweat on the business (especially on marketing and sales), not on optimizing the credit card rewards on your business’ expenses. Sure, getting ~2% on 40% of your revenue “for free” sounds fun, and you can justify maybe one hour of research total on getting something set up, but after that all additional thought is wasted. It will not feel like waste when you are reviewing statements and offers in year three, but it is wasteful nonetheless. Just go forth and sell more software.]

Max Chiswick: Yes, I wish someone had told me that about ten years ago. It might have worked out better for me.

Patrick McKenzie: I think there’s a life stage component to this as well. Poker, credit card rewards, and even small software businesses often attract people very early in their adult (or, in some cases, pre-adult) life stages.

[Patrick notes: One software entrepreneur whose name many people would recognize has recounted privately that they were running millions of dollars a month through a credit card while (failing) in college, doing a high-volume low-margin Internet marketing strategy. This is not a merit badge worth chasing, but I’ll again note how many people find themselves chasing it in our extended social circles. Note that even when your issuer is paying you tens of thousands of dollars a month in rewards you are wasting your time and talent running the business that makes that possible when you could simply run better businesses, as indeed that individual has gone on to do.]

At that point, they usually have few demands on their time, attention, or mental energy. They also tend to have very limited capital, and the idea of getting even a dollar out of the internet feels incredibly powerful to them—often more powerful than simply earning a dollar in a conventional way.

This idea comes up repeatedly in conversations I have with people: whether it’s about credit card rewards, online poker, or small software businesses. There’s this notion that the reward feels sweeter for being somewhat unauthorized—even if there’s nothing actually unauthorized about, say, selling software to willing customers. It deviates from the standard social script of “go to college, get a job at a company, and retire at age [insert typical retirement age here].” [Patrick notes: Noting that this is, in fact, a literally accurate transcript of what I said. Doesn’t everyone talk with block parentheses?]

That deviation pushes buttons in a way that a biweekly paycheck simply doesn’t.

But I’m diverging a bit from poker, which is the card game we’re here to talk about.

Why poker matters: reasoning and decision-making

On your website, you’ve sketched out a capsule summary of why people who aren’t engaged in poker should care about it, which I thought was great. Do you want to share that argument?

Max Chiswick: Sure. The big picture is that games like poker can make people better reasoners.

One part of this is learning to think in terms of expected value—making decisions based on probabilities and outcomes. Poker encourages quick mental math and helps people adopt a "thinking in bets" mindset, which is a rational way of approaching uncertainty.

Another aspect is emotional control. In poker, it’s just as important to keep emotions in check as it is to understand the math. For example, it’s not enough to know the optimal play; you also have to act on it, even if someone at the table made you angry in the previous hand.

It’s about staying focused, paying attention to details, and picking up on information. In online poker, this could mean analyzing statistical patterns, while in live games, it might involve interpreting subtle cues—like noticing someone breathing heavily, which could indicate they have a strong hand.

The problem areas in the poker AI space

On the AI side of poker, most research has historically focused on developing bots that play as close to game theory optimal (GTO) strategy as possible. This was the dominant goal from around 2007 to 2019. It’s a fascinating problem, essentially building the most mathematically sound AI.

But another intriguing challenge is designing agents that can exploit suboptimal opponents rather than simply playing optimally. This is where poker intersects with broader ideas in AI development. A simplified analogy would be a rock-paper-scissors setting. The optimal strategy is to play randomly, one-third rock, one-third paper, one-third scissors, but that’s not particularly interesting.

We’ve run rock-paper-scissors hackathons, and at first, people often say, “What’s the point? It’s a solved, boring game.” But if you know the competition includes non-optimal bots, it becomes a fascinating problem. You need to figure out your opponents’ patterns without letting them exploit your strategy.
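[Illustrative sketch, not from the conversation: the contrast between equilibrium play and exploitative play can be shown in a few lines of Python. Everything here is an assumption made for illustration, including the function names, the 60% rock bias of the opponent, and the frequency-counting exploiter. It is not the format or code used in the hackathons mentioned above.]

```python
import random
from collections import Counter

MOVES = ("rock", "paper", "scissors")
# BEATS[x] is the move that defeats x.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def uniform_bot(opponent_counts, rng):
    """Equilibrium play: each move with probability 1/3, ignoring history."""
    return rng.choice(MOVES)


def counter_bot(opponent_counts, rng):
    """Naive exploiter: beat the opponent's most frequent move so far."""
    if not opponent_counts:
        return rng.choice(MOVES)
    favorite = max(MOVES, key=lambda move: opponent_counts[move])
    return BEATS[favorite]


def biased_opponent(rng):
    """Hypothetical non-optimal bot: rock 60%, paper 20%, scissors 20%."""
    return rng.choices(MOVES, weights=(0.6, 0.2, 0.2))[0]


def score(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a tie, from my side."""
    if mine == theirs:
        return 0
    return 1 if BEATS[theirs] == mine else -1


def average_score(my_bot, rounds=10_000, seed=0):
    """Average per-round score of my_bot against the biased opponent."""
    rng = random.Random(seed)
    opponent_counts = Counter()
    total = 0
    for _ in range(rounds):
        mine = my_bot(opponent_counts, rng)
        theirs = biased_opponent(rng)
        total += score(mine, theirs)
        opponent_counts[theirs] += 1
    return total / rounds
```

Against this particular opponent the uniform bot averages close to zero per round, while the counter bot settles on paper and earns roughly 0.6 - 0.2 = 0.4 per round in expectation. The catch is that the counter bot is itself predictable, so a smarter opponent could exploit it in turn. That tension is what makes the problem interesting.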

Last year, DeepMind suggested rock-paper-scissors as a benchmark for AI development, which underscores how these kinds of problems remain relevant.

Poker touches on so many interesting areas—reasoning, AI, decision-making—that it’s valuable far beyond simply becoming a better poker player.

Poker as an assistive technology for reasoning

Patrick McKenzie: One of the things I find most fascinating about poker is how it serves as an assistive technology for forcing people to model other people’s minds. That’s a skill we all use frequently in real life—in both cooperative and competitive settings—but one that some people struggle with more than others. [Patrick notes: Raising my hand here, for the benefit of those people who have not already picked up a certain set of vibes from me.]

Poker is essentially a hyper-simplified version of this challenge. It’s a symmetric game in structure: all players are dealt cards from the same deck, share the same win/loss conditions [Patrick notes: actually an oversimplification and a crucial one to understand for professionals: some players are not at the table to maximize the number of chips they walk away with], and so on. But the game is deeply asymmetric in information. Players hold private knowledge about their hands, as well as private knowledge about themselves and their understanding of the other players at the table.

It forces people to consider questions like: When someone makes a move, are they doing it purely for their own benefit? Are they doing it to influence your perception of them? Or are they doing it to influence what you think they think you think about them? This concept of “levels” in poker—thinking about what others know, believe, or intend—can be transformational. [Patrick notes: If you’re new to this concept, the Internet abounds with primers.]

Even getting someone to grasp the idea that a person with a value system similar to their own might interpret the same facts in a completely different way can feel revolutionary, especially in a corporate setting. Poker builds this skill, though hopefully, it doesn’t require thousands of hours at the table to develop it.

Online poker history

But let’s get back to the history of poker. There’s this era known as the "Moneymaker Boom," which is fascinating. My version of the story is this: Every year, there’s an event called the World Series of Poker. In 2003, an amateur named Chris Moneymaker won the tournament. Combined with intensified marketing for online poker, this served as a catalyst that caused a flood of not-very-good recreational players to join newly accessible, 24/7 online poker sites.

From my perspective, the game of online poker became largely about attracting recreational players who would lose money to professional players, who in turn would profit and pay a lot of rake to the sites hosting the games. The poker companies’ main job became running marketing operations to continually bring in fresh recreational players for this ecosystem. [Patrick notes: And, as we will see in a moment, running or tapping into gray market payment ecosystems which were indispensable for operating their business of transferring money from “fish” to professionals for a large service fee.]

As a recreational player myself, I see it as paying for a service—entertainment—with the understanding that, at the end of the month, I’ll likely have lost money. I think I have a healthy relationship with poker in that sense, perhaps moreso than some others. [Patrick notes: It’s not terribly relevant to the conversation, but my poker budget annually is about $2,000, I mostly play tournaments in person because I find that fun and appealingly inaccessible, and in terms of skill level I’m well ahead of a Vegas $1/$2 game and probably at profitable-but-clearly-wasting-my-time at tournaments with a $100-250 buy-in. (“Probably” because, as you can infer from these numbers, I may just be experiencing positive variance sustained on a small sample size which happens to stretch a few calendar years.) If you don’t play poker, don’t worry about that, but it’s more useful to enthusiasts than the phrase “I am a moderately talented recreational player.”]

The boom came to an abrupt end with a U.S. crackdown on online gambling, culminating in an event called Black Friday. On that day, the Department of Justice simultaneously seized a number of poker sites. For years afterward, players faced uncertainty about accessing funds they had on deposit, either because the money was locked up by the government or had disappeared due to unscrupulous operators. Does that align with your understanding of the timeline?

Max Chiswick: Yeah, that’s a pretty good summary. I’ll add one major event that’s worth mentioning.

Chris Moneymaker won the World Series of Poker in 2003, marking the start of the boom. Then, in 2006, Congress passed a law called the Unlawful Internet Gambling Enforcement Act (UIGEA). It didn’t outright ban poker sites but made it much harder for them to operate in the U.S. As a result, publicly traded companies pulled out of the U.S. market, while some privately held operators continued, relying on legal opinions that it was permissible to do so.

Interestingly, UIGEA included a carve-out for fantasy sports, which laid the groundwork for the rise of daily fantasy sports sites. For context, younger people today might see sports betting ads everywhere and think it’s always been normal. But prior to 2006, sports betting was almost entirely banned, except in places like Las Vegas.

After 2006, poker got a bit harder. It became more difficult to deposit money into the remaining sites, and some operators engaged in questionable practices—like misrepresenting themselves as a "golf shop" to process credit card payments. That was the start of the decline in the online poker boom.

Patrick McKenzie: One of the fascinating aspects of the policy bans on online poker is that they weren’t direct bans on poker itself. Instead, they targeted financial intermediaries, effectively banning payment processors from enabling transactions with poker sites.

This led to a wave of creative, if ethically questionable, solutions by operators whose primary goal was maximizing cash flow from U.S. players. These operators aimed to create as many avenues as possible for typical players to deposit money, often using methods that leveraged what was already in their wallets.

A common tactic involved setting up shell companies that would misrepresent themselves to banks as something innocuous, like a golf shop or an e-commerce retailer, while actually processing deposits for poker businesses. These setups would almost inevitably be shut down after a period of time, and poker companies worked to minimize the financial leakage to this money-laundering middle layer.

Over time, this evolved into an entire ecosystem of shadowy payment processors designed to stay one step ahead of regulatory crackdowns. Ironically—or perhaps not so ironically—many of the individuals who got their start in the gray side of payment processing for poker transitioned into similar roles for crypto exchanges between 2010 and 2020. For example, if you dig into the corporate history of Tether, you’ll find some familiar names from the poker boom era. But this isn’t a podcast about Tether, so I’ll leave that tangent aside.

Understanding multitabling

During this period, the online poker ecosystem also introduced a fundamentally different way of playing the game: multitabling. I understand you were involved in multitabling. Could you explain to the audience what that is and the mathematical implications it has for gameplay?

Max Chiswick: Sure. There’s multitabling, and then there’s capital-M multitabling.

Most people, when they play poker, find that playing a single table can feel a bit slow—especially in real life. Online, many players start by playing three or four tables at once. But some of us, myself included, took this to the extreme. I’d typically play at least 16 tables simultaneously and often maxed out the limit on PokerStars, which was 24 tables.

There were a couple of logistical setups for multitabling. One option was tiling, where you’d arrange multiple small tables across a single large monitor—say a 30-inch screen—or across several monitors, much like a stock trader’s multi-screen setup. Another option was stacking, where tables were normal-sized but layered on top of each other. In this setup, the active table—the one requiring your action—would pop up on top. As soon as you acted, the next table needing input would appear.

I personally preferred the tiling method because I liked being able to see all the action simultaneously. Others, however, liked stacking for its simplicity and focus.

Playing this way was inherently a quantity-over-quality approach. The goal was to maximize the volume of hands played, which was particularly advantageous when exploiting recreational players. Back then, most serious players used a heads-up display (HUD) to assist with decision-making.

A HUD would display key statistics next to each opponent’s name on the table. The most common stats included:

Voluntarily Put Money In Pot (VPIP): The percentage of hands a player voluntarily enters the pot.
3-bet percentage: How often a player re-raises pre-flop.

These stats, among others, helped players make decisions quickly when juggling dozens of tables.
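[Illustrative sketch, not from the conversation: the two statistics above are simple ratios over a player's hand history. The record layout, field names, and function name below are hypothetical and do not reflect any real tracking software's schema. Dividing 3-bets by the number of opportunities to 3-bet, rather than by all hands, is an assumption about how the stat is defined.]

```python
from dataclasses import dataclass


@dataclass
class HandRecord:
    """One player's preflop behavior in one hand (hypothetical schema)."""
    player: str
    put_money_in_voluntarily: bool  # called or raised preflop; a posted blind alone does not count
    could_three_bet: bool           # faced a raise preflop and had the option to re-raise
    did_three_bet: bool             # actually re-raised


def hud_stats(records, player):
    """Return VPIP and 3-bet frequency for one player, or None if unseen."""
    hands = [r for r in records if r.player == player]
    if not hands:
        return None
    vpip = sum(r.put_money_in_voluntarily for r in hands) / len(hands)
    chances = [r for r in hands if r.could_three_bet]
    # With no opportunities observed, the 3-bet stat is undefined.
    three_bet = sum(r.did_three_bet for r in chances) / len(chances) if chances else None
    return {"hands": len(hands), "vpip": vpip, "three_bet": three_bet}


# Made-up sample: 10 hands, 7 entered voluntarily, 4 chances to 3-bet, 1 taken.
sample = (
    [HandRecord("villain", True, True, True)]
    + [HandRecord("villain", True, True, False)] * 3
    + [HandRecord("villain", True, False, False)] * 3
    + [HandRecord("villain", False, False, False)] * 3
)
stats = hud_stats(sample, "villain")
```

For the made-up sample, `stats` reports a VPIP of 0.7 and a 3-bet frequency of 0.25 over 10 hands. In practice a sample this small would be far too noisy to act on.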

Patrick McKenzie: If I can digress for a moment for people who might not be poker fans, the percentage of hands a player participates in is deeply tied to why multitabling is so effective.

When played well, poker requires you to spend the vast majority of your session doing nothing: folding and staying out of hands. A skilled player might fold out of 80% or more of hands without ever making an interesting decision because the cards they are dealt simply aren’t good enough to warrant playing. [Patrick notes: One’s position at the table, stack size relative to other players, skill level relative to players likely to be in a hand, and whether e.g. one is in a cash game or a tournament also feed into this calculation, but the cards themselves are the high order bit.]

For many people, that inactivity feels boring, and those who aren’t skilled or disciplined will compensate by playing too many hands. But when you play hands with objectively worse starting cards, you’re setting yourself up to lose money, especially when sitting at a table with competent opponents.

This is where multitabling becomes appealing. Knowing that 80% of hands are going to be automatic folds, skilled players can play at multiple tables simultaneously. This increases the volume of hands they see and ensures they make more interesting decisions during a session—all without compromising the quality of their play or the cards they choose to engage with.

Now, here’s where it gets even more strategic: there’s a clear discontinuity between disciplined players who can fold out of 80% of hands and undisciplined players who play 60%, 70%, or even 80% of hands. With just that one statistic—how often someone voluntarily enters a hand—you can often identify players who are not good at poker or who are choosing, for whatever reason, not to play well that evening.

From there, you can adjust your strategy to seek out opportunities to play against those individuals. Statistically, you’ll tend to have better starting hands than they do, and because they lack skill or discipline, you’re also more likely to outplay them as the hand progresses.

Max Chiswick: Yep.

Patrick McKenzie: This dynamic—targeting less skilled players—is a critical part of professional poker. Identifying those individuals and exploiting their weaknesses is the foundation of the game’s profitability for professionals.

One of the happy-sad truths about poker is that much of the edge and profit in the game comes from identifying people at the table who shouldn’t be playing poker—at least not at the stakes they’ve chosen—and aggressively capitalizing on their mistakes until they no longer have money to lose. [Patrick notes: Even the poker community, which plays negative-sum games for a living, recoils a bit from this conclusion, and so it is referred to as “table selection” as a bit of a euphemism.]

Then the hope is that the poker company will replace them as quickly as possible with another player who plays in a similar fashion.

Max Chiswick: Yes, this is very true. Most tables I played on—this was even over ten years ago—would typically have six players: five regulars, many of them professionals, and one less skilled player. If that player lost all their money, the table would often dissolve almost immediately. It was a very common situation.

Casino economics and gambling regulation

Patrick McKenzie: That reminds me of an interesting point about why poker is not a strategic priority for casinos, particularly in Vegas. I believe this was articulated by Steve Wynn, the entrepreneur behind a series of high-profile casinos.

His argument was that poker is one of the few games in a casino where someone can walk in, lose money, and the casino doesn’t directly benefit. Naturally, for casino operators—who are in the business of maximizing profits—that’s a deeply unappealing dynamic. Wynn reportedly found it almost morally offensive that someone could lose money in his establishment without it going into the casino’s coffers. He much preferred games where the house collects its edge directly, like slots or blackjack.

Despite this, poker persists in many casinos, but I think it’s maintained for a couple of reasons. First, it serves a marketing purpose. Poker rooms attract people who might not otherwise come to the casino, and the hope is they’ll spend money on more profitable activities, like slots, table games, dining, or entertainment. [Patrick notes: One of the best explanations I’ve ever heard of this dynamic was by the founder of an options brokerage, Tom Sosnoff at TastyTrade, explaining why he was willing to offer crypto at cost: he didn’t want the gamblers wandering across the street to another casino. I’d note that there is a continuum between options/futures brokers, crypto exchanges, and mobile sports books, and some firms benefit from pretending that they are simply tech-enabled regulated financial providers and are not meaningfully on that continuum.]

Second, poker has a cultural significance within the gambling world. It’s a game with a storied history, complete with its own rituals, heroes, and legends. Maintaining a poker room is, in some ways, an acknowledgment of that tradition—a way for casinos to say they value poker’s cultural importance, even if it’s far less profitable than slot machines.

Speaking of slots, those are the real breadwinners for U.S. casinos. They’re phenomenally profitable and, frankly, predatory. Slot machines are designed to maximize player losses and are deeply problematic in their methods—but that’s a digression for another time.

Max Chiswick: Yes, and I hear you can now get cash directly from the slot machines with your ATM card.

Patrick McKenzie: Oh boy. I could go on a long rant about the way gambling is regulated—or not regulated—in the United States. It’s such a patchwork system of what we consider morally odious versus what we allow, even when it’s clearly exploitative.

Allowing people to use ATM cards directly at slot machines is a prime example. Studies have shown that this maximizes the amount of money people lose. It’s a behavior we probably shouldn’t enable, yet here we are.

I used to attend a software industry conference in Vegas every year, though it’s since moved to a different city. That’s partly for good reasons, but it’s a little sad for me because it took away one of my rare opportunities to play poker.

One year, during that conference, I was negotiating an angel investment in a software company. As part of that process, I had to prove I was an accredited investor, which involved a phone call with my bank to verify everything and ensure I wasn’t being scammed for this $25,000 investment. [Patrick notes: Might have been $5,000, which was historically my usual check size, and matches my earlier retelling of the story.]

While I was on the phone, I was standing on the casino floor—next to a craps table and an ATM. That ATM would happily dispense $5,000 with no checks or balances, fully expecting you to take it to the craps table and lose it. Not only that, but there were people whose job was to serve you alcohol to impair your decision-making further.

It was a striking contrast. Here I was, going through a formal process with my bank to invest in a startup, while just steps away, the casino made it as easy as possible for someone to lose $5,000 on a whim. It’s an interesting microcosm of how the U.S. regulates—or doesn’t regulate—different forms of economic activity.

We heavily regulate capital formation for startups but have comparatively little oversight of gambling, even when it clearly preys on vulnerable individuals. It says a lot about our society’s priorities and moral intuitions.

Sorry for the rant—let’s get back to poker.

Max Chiswick: Okay, yeah. I think we’re seeing similar dynamics now in sports betting on mobile phones, but yes, back to poker. We were talking about multitabling.

PokerStars VIP program and professional incentives

PokerStars had an interesting setup with a very robust VIP program. It had multiple tiers, much like loyalty programs for airlines or hotels. If you reached their top tier, you could earn approximately $120,000 over the course of the year, just from the rewards.

This incentivized a lot of multitabling players—people who would play 12, 16, 20, or even 24 tables simultaneously. For some, the goal was simply to hit this VIP tier. They were satisfied even if they broke even or lost a little at the tables, as long as the rewards made up for it.

As a result, PokerStars became the site for quantity-over-quality players. But this also created an equilibrium where some very strong players, who focused more on quality, could exploit both recreational players and multitabling grinders like me. Since we couldn’t devote full attention to each decision, we were more vulnerable to their strategies.

Patrick McKenzie: That’s a fascinating dynamic. There’s an interesting economic relationship between poker professionals and the platforms or casinos hosting the games.

In a casino setting, the dealer is a direct employee of the casino. They get a salary, sometimes augmented by tips from players. But poker players themselves aren’t employees of the casino—they’re essentially independent contractors providing entertainment services.

This is critical for poker’s business model. If one person shows up to play poker on a Friday afternoon, no game happens. You need at least six people to start a game and get the rake flowing—the rake being the primary way the casino profits from poker, to the extent they profit directly.

For online platforms like PokerStars, the dynamics are slightly different. They subsidize liquidity by incentivizing professionals to sit at as many tables as possible, across all stakes and game types, at all hours. This ensures a wide variety of games are always running, which attracts recreational players and keeps the ecosystem healthy.

But the result of this system is that some players end up playing an absurd number of hands. In fact, they play more in a year than anyone likely played in their entire life before online poker existed. This volume has profound implications for the game itself.

Playing a million hands in a month

As I recall, you played a statistically improbable amount of poker at one point.

Max Chiswick: Yes. The first full year I played poker professionally was 2009, and I decided two things were important: first, reaching the top VIP status on PokerStars, and second, going beyond that to play a significant volume of hands.

I calculated that if I played consistent hours throughout the year, it would come out to a reasonable schedule—something like 30 hours a week. Now, poker is a mentally taxing activity, especially multitabling, so 30 hours feels more exhausting than 30 hours at most other jobs.

Unfortunately, by June, I had only played about 300,000 hands, roughly a tenth of what I needed. To motivate myself, I made a bet with two friends. I even gave them 3-to-1 odds because I figured it would increase my accountability. For every $10 they wagered, they could win $30 if I failed, and I could win $10 if I succeeded. The stakes of the bet were quite large because I needed strong incentives.

By the end of September, I had played about one million hands—still far behind schedule. I had recently moved to Chicago and initially created a well-balanced schedule on a spreadsheet that included gym time and recreational activities. But by October, all non-poker activities had been canceled.

At that point, I still needed to play two million hands before year-end. I calculated that to catch up, I’d need to play 12 hours a day in October, but by December, that number climbed to an unsustainable 18 hours a day. To make up for the shortfall, I started playing more than 24 tables at a time, adding another poker site to increase my volume to around 30 tables simultaneously.

This led to a lot of memorable and suboptimal moments—like folding incredibly strong hands, including the best possible hands, in large pots simply because I didn’t notice the table.

In December, I hired a friend to act as my assistant. Their job included waking me up and ensuring I didn’t hide in the bathroom to avoid playing. As the grind intensified, I brought in my mom to help keep me awake during those marathon sessions. Thanks to her—and an insane amount of caffeine—I finished my goal just before New Year’s.

In the final 10 days, I was playing more than 18 hours a day, averaging 17 hours a day for the entire month of December. I played roughly 990,000 hands that month, which, as far as I know, is the most hands ever played in a single month outside of someone playing microstakes like $0.01 games.

I did win the bet, but I was so sleep-deprived that I lost a significant amount of money in the final stretch, wiping out much of what I gained from the wager. Still, the experience pushed me to play an extraordinary volume of hands, which was a satisfying, albeit exhausting, accomplishment.
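
To put that December volume in perspective, here is a rough back-of-the-envelope sketch that uses only the figures quoted above. It assumes the "around 30 tables" count held for the whole month, which is a simplification, and the variable names are purely illustrative.

```python
# Figures quoted in the conversation.
hands_in_december = 990_000
avg_hours_per_day = 17
days_in_december = 31
tables = 30  # assumption: "around 30 tables" for the entire month

hours_played = avg_hours_per_day * days_in_december
hands_per_hour = hands_in_december / hours_played
hands_per_table_hour = hands_per_hour / tables
seconds_per_new_hand = 3600 / hands_per_hour

print(f"hours played:           {hours_played}")
print(f"hands per hour:         {hands_per_hour:.0f}")
print(f"hands per table-hour:   {hands_per_table_hour:.1f}")
print(f"seconds per new hand:   {seconds_per_new_hand:.2f}")
```

On these assumptions a new hand starts somewhere on screen roughly every two seconds, which helps explain how strong hands could be folded simply because a table went unnoticed.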

Patrick McKenzie: Yep. Let’s take a moment to unpack some takeaways from your experience.

First, for people who’ve never played live poker, my rough estimate is that at a well-run table in a place like Vegas, you’ll go through about 20 hands per hour. If you imagine a poker professional logging 2,000 hours a year over a 10- or 20-year career, you can calculate how many hands they’ve played in total. [Patrick notes: I avoid doing mental math when speaking, but on these assumptions, that is 400k-800k hands.]

But then you consider serious online players like yourself, who can play that volume in just a single month. That’s a staggering difference, and it’s one of the factors that has made the game significantly harder in both online and offline environments in recent years.

Another factor contributing to the steepening skill curve is the increasing use of solvers and theoretical advancements in poker strategy. We’ll dive into that in a moment.

On a different note, your December grind highlights another topic: the effects of sleep deprivation on decision-making. We used to have robust research on this, particularly in the context of military performance during World War II and the Korean War. But research has tapered off in recent decades, partly because the effects are so conclusively negative that it’s difficult to get experiments past institutional review boards.

I’d love to see someone creatively disaggregate the effects of poor sleep, chronic sleep deprivation, and other poker-specific stressors on decision-making. Even better, it’d be fascinating to chart these effects against players’ self-reported perceptions of their own performance. One key finding in the sleep research literature is that metacognition declines along with cognitive ability: people think they’re performing well when, in reality, they’re making significantly worse decisions.

Not throwing you under the bus here—I’ve spent plenty of my adult life in impaired states due to overwork and poor sleep habits. But I think it’s an important point to highlight for the broader audience.

Max Chiswick: Yep.

Patrick McKenzie: Definitely. Let’s move on to Poker AI. Can you give us a quick overview of what’s been happening in that space over the last 10 or so years?

AI poker history and counterfactual regret minimization

Max Chiswick: Sure. The current revolution in Poker AI started with a breakthrough algorithm developed in 2007 at the University of Alberta called counterfactual regret minimization (CFR).

CFR is a self-play algorithm. Essentially, the AI plays against itself repeatedly. It starts with a completely random strategy, but over many iterations, it converges toward an equilibrium strategy for one-on-one games.

While this algorithm works well for very small games, larger games—like Texas Hold’em—are far too complex to apply it directly to the full game space. From 2007 to about 2017, a lot of research focused on finding ways to abstract and simplify one-on-one poker games to make them tractable for CFR-based approaches.
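
As a concrete illustration of the self-play loop described above, here is a minimal sketch of vanilla CFR on Kuhn Poker, the three-card toy game that comes up later in the conversation. It is not the code from the Alberta papers. The class and function names are hypothetical, every deal is enumerated on each iteration rather than sampled, and the iteration count is an arbitrary choice.

```python
from itertools import permutations

ACTIONS = "pb"  # p = pass (check or fold), b = bet (bet or call)
DEALS = list(permutations(range(3), 2))  # (player 0 card, player 1 card)


class Node:
    """Regret and strategy totals for one information set (own card + history)."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.current = [0.5, 0.5]  # uniform random starting strategy

    def refresh(self):
        # Regret matching: weight each action by its positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.current = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def terminal_payoff(cards, history):
    """Payoff to the player who would act next, or None if the hand continues."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    if history[-2:] == "pp":  # check, check: showdown for the antes
        return 1 if higher else -1
    if history[-2:] == "bb":  # bet, call: showdown for ante plus bet
        return 2 if higher else -2
    if history[-1] == "p":    # bet, fold: the bettor collects the ante
        return 1
    return None


def cfr(nodes, cards, history, reach0, reach1):
    """Walk the tree, update regrets, return the value for the player to act."""
    payoff = terminal_payoff(cards, history)
    if payoff is not None:
        return payoff
    player = len(history) % 2
    node = nodes.setdefault(str(cards[player]) + history, Node())
    strat = node.current
    my_reach, opp_reach = (reach0, reach1) if player == 0 else (reach1, reach0)
    utils = []
    for action, prob in zip(ACTIONS, strat):
        if player == 0:
            utils.append(-cfr(nodes, cards, history + action, reach0 * prob, reach1))
        else:
            utils.append(-cfr(nodes, cards, history + action, reach0, reach1 * prob))
    node_util = sum(p * u for p, u in zip(strat, utils))
    for i in range(2):
        # Counterfactual regret is weighted by the opponent's reach probability.
        node.regret_sum[i] += opp_reach * (utils[i] - node_util)
        node.strategy_sum[i] += my_reach * strat[i]
    return node_util


def expected_value(nodes, cards, history):
    """Value for the player to act when both sides play their average strategy."""
    payoff = terminal_payoff(cards, history)
    if payoff is not None:
        return payoff
    strat = nodes[str(cards[len(history) % 2]) + history].average_strategy()
    return sum(p * -expected_value(nodes, cards, history + a)
               for a, p in zip(ACTIONS, strat))


def train(iterations=10_000):
    nodes = {}
    for _ in range(iterations):
        for cards in DEALS:
            cfr(nodes, cards, "", 1.0, 1.0)
        for node in nodes.values():
            node.refresh()
    return nodes


nodes = train()
game_value = sum(expected_value(nodes, c, "") for c in DEALS) / len(DEALS)
print(f"first player's value per hand: {game_value:.4f}")
```

The averaged strategy, not the final one, is what approaches equilibrium. For Kuhn Poker the known equilibrium value for the first player is -1/18 per hand, so the printed number should drift toward that as iterations grow. Kuhn Poker has only 12 information sets, which is why no abstraction is needed here.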

Patrick McKenzie: Just to jump in for those who might not have a background in AI, I have an increasingly out-of-date undergraduate concentration in the field. Without diving too deeply into the rabbit hole of AI and games, it’s important to understand that some games are mechanically far more complex than others.

For example, Go is orders of magnitude more complex than Chess in terms of the number of possible states the game can be in. This difference in complexity has significant implications for how AI algorithms are developed to play these games.

One common approach in AI research over the years has been to simplify games—essentially stripping out some of their features—to reduce the "state space," or the total number of possible game states. This simplification makes it more feasible to apply statistical or computational methods within the hardware and budget constraints typically available to researchers.

In poker, for instance, a simplified version might involve limiting the game to two players (heads-up poker) instead of the usual six or ten. Another common simplification might involve reducing the deck to just a subset of cards—for example, only the tens and face cards—making the game far less complex in terms of possible scenarios.

These adjustments allow researchers to run their algorithms on more affordable hardware while still advancing the state of AI in the game. These simplified variants serve as stepping stones, providing insights that can eventually be applied to more complex versions of the game.
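
To give a feel for how much a smaller deck shrinks the problem, here is a rough counting sketch. It counts only the ways the cards can fall in a heads-up hand (two hole cards each plus a five-card board) and ignores betting sequences entirely, so it understates the true state space. The function name is hypothetical, and the 16-card deck is a literal reading of "tens and face cards" in four suits.

```python
from math import comb


def heads_up_card_deals(deck_size: int) -> int:
    """Count (hero hole cards, villain hole cards, 5-card board) combinations."""
    hero = comb(deck_size, 2)
    villain = comb(deck_size - 2, 2)
    board = comb(deck_size - 4, 5)
    return hero * villain * board


full_deck = heads_up_card_deals(52)
reduced_deck = heads_up_card_deals(16)  # tens, jacks, queens, kings

print(f"52-card deck: {full_deck:,}")
print(f"16-card deck: {reduced_deck:,}")
print(f"reduction factor: {full_deck / reduced_deck:,.0f}x")
```

Even before betting is considered, the reduced deck cuts the number of card configurations by several orders of magnitude, which is the kind of saving that makes a variant fit on modest hardware.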

[Patrick notes: I’d note that the space of all possible games is infinite but certain culturally significant points in that space are much more valuable than other points. Texas Hold’em is seen as an attractive research target both because a) money and b) it’s the poker variant that people think of when you say “poker.” Chess is chess; chess without castling, while computationally slightly easier, just isn’t the same. You’ll get more citations for chess than you will for created-for-research-purposes variants, you’ll attract more and better grad students to your banner, etc.

This is probably more important than people realize. Sometimes beating world experts at simplified versions of Dota 2 is, in fact, not the final, defining accomplishment of an AI research organization.]

Max Chiswick: Yeah, one of the milestone papers in Poker AI came in 2015, again from the University of Alberta. They approximately solved the full game of heads-up (two-player) Limit Texas Hold’em. This is a variation of Texas Hold’em where betting is restricted to fixed increments, making it a smaller and less complex game compared to No Limit Texas Hold’em, which is the more popular variant you see in casinos.

Following that, there were some high-profile machine-versus-human competitions. Most of these featured AI programs developed at Carnegie Mellon University.

The first major competition was in 2015, where the humans, including top professionals like Doug Polk, actually defeated the bot, Claudico. But in 2017, a bot named Libratus played against some of the world’s best players and won decisively. That marked a significant milestone.

As with many AI breakthroughs, the consensus before this happened was that it would take much longer to achieve. One of the reasons poker was considered a harder problem than games like chess is the hidden information. In poker, each player has private cards, meaning decisions must account for incomplete knowledge. This makes it a more complex challenge than perfect-information games like chess or Go, where both players have full visibility of the game state.

Patrick McKenzie: There’s also a romanticization of the human element in poker, I think. When great players talk about their process, they often narrate how they get into their opponent’s head, construct mental models of them, and even “soul-read” their cards based on tells or patterns.

I won’t claim that computers can’t do this, but when you read descriptions of how these algorithms functioned back in the day, it’s clear they weren’t soul-reading. There wasn’t some subroutine dedicated to interpreting tells or reading minds. The algorithms were essentially performing vast amounts of rapid mathematical computations—crushing their opponents with sheer computational power rather than psychological insight.

That’s my understanding, at least from the papers I browsed at the time. Would you agree?

Max Chiswick: Yeah, for example, bots like Libratus were bluffing and even popularized concepts like overbetting—making a bet larger than the size of the current pot. When I played, it was extremely common to bet between 50% and 75% of the pot. Overbetting wasn’t part of the human repertoire at the time, but the bot showed that applying pressure with these large bets was effective.

And as you mentioned, this wasn’t some subroutine to intimidate the humans; it was simply what the algorithm learned was optimal through its calculations.
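
One way to see why large bets apply pressure is the standard textbook toy model of a river bet, sketched below. This is not how Libratus reached its strategy. It assumes the bettor holds either an unbeatable hand or a pure bluff, the caller holds a hand that only beats bluffs, and bet sizes are expressed as a fraction of the pot. The function names are hypothetical.

```python
def balanced_bluff_fraction(bet: float) -> float:
    """Share of the betting range that can be bluffs while the caller is indifferent.

    The caller risks `bet` to win `1 + bet`, so calling breaks even when
    bluffs make up bet / (1 + 2 * bet) of the bettor's range.
    """
    return bet / (1 + 2 * bet)


def minimum_defense_frequency(bet: float) -> float:
    """How often the caller must call so that a pure bluff does not profit.

    A bluff risks `bet` to win the pot of 1, so it breaks even when the
    caller continues 1 / (1 + bet) of the time.
    """
    return 1 / (1 + bet)


for bet in (0.5, 0.75, 1.0, 2.0):
    print(f"bet {bet:>4.2f}x pot: "
          f"bluffs {balanced_bluff_fraction(bet):.0%} of range, "
          f"caller defends {minimum_defense_frequency(bet):.0%}")
```

In this simplified model, moving from a half-pot bet to a two-times-pot overbet lets the bettor include more bluffs while the caller is pushed to fold more often, which is consistent with the pressure described above.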

Patrick McKenzie: That’s fascinating. Poker is inherently a social game with a community built around it—complete with rules of thumb, superstitions, and rituals. What’s interesting about the interaction between humans and AI in poker is how strategies developed by bots have leaked back into human play.

Overbetting is now a standard part of the toolbox for good players, even though it wasn’t before these algorithms demonstrated its effectiveness. There’s been a significant shift in poker strategy, particularly with the introduction of solvers. Can you talk a bit about the evolution from AI algorithms to solvers and how they’ve changed the way humans approach the game?

The impact of solvers on modern poker

Max Chiswick: Sure. Let me quickly step back and fill in some timeline context.

In 2011, we had Black Friday, when most major online poker sites left the U.S. market. This was a major inflection point because recreational American players—who weren’t generally very skilled—could no longer participate. Their departure made the remaining player pool tougher.

Then, over the next few years, the first commercially available solver programs appeared. These were initially tools for very serious players. Some would rent servers to run complex solver analyses, which allowed them to simulate optimal strategies for various situations. Players would then compile this data into spreadsheets and use it as a study resource.

More recently, tools like GTO Wizard have streamlined this process. Instead of relying on slow, local computations or rented servers, players can now access a website that quickly runs solves for a given situation.

When we say "solver," we’re referring to software that calculates the game theory optimal (GTO) strategy for a specific poker scenario. For instance, you can input details about a hand—your cards, the board, stack sizes, and so on—and the solver will output the mathematically optimal strategy for that situation.

These tools have become the primary way most serious players study poker today. The top players now aim to understand GTO strategies as closely as possible. While they sometimes deviate from those strategies to exploit specific weaknesses in their opponents, GTO provides the baseline for decision-making.

Understanding poker game theory and decision trees

Patrick McKenzie: For those who aren’t familiar with the flow of a poker hand, let me offer a brief background.

At the start of a hand, we’re in one of a very large number of possible world states. Over the course of the hand, as cards are revealed and players make decisions, we gain progressively more information about the likely state of the game. This shared pool of revealed cards narrows down the possibilities, and the decisions players make further reduce the set of probable world states.

For example, if you’re evaluating your position during the hand, you might conclude that in the commanding majority of possible world states, you’re currently behind. At first glance, this might suggest that the best strategy is to minimize your losses and exit the hand as cheaply as possible.

But poker is much richer than that. Even if you can’t win with your cards, you might still win the hand by convincing your opponent that they are behind. This can be done, for instance, by betting aggressively and representing a stronger hand than you actually have.

The key is understanding how your actions fit into the narrative of the game. In what proportion of possible world states are your actions consistent with you having a much stronger hand than you actually do? Additionally, given the limited number of future actions you can take in the hand, how many "bits" of additional information can you strategically deliver to reinforce that narrative?

The goal is to construct a story compelling enough that your opponent takes the action you want—folding, calling, or raising—based on their perception of the range of hands you might plausibly hold and their own evaluation of their position in those world states.

Hopefully, that explanation captures both the truth of the game and provides a sense of just how fascinatingly rich poker is as a technical, statistical, and game-theoretical problem. People aren’t drawn back to poker for nothing—it’s a complex puzzle that rewards deep thought and strategic ingenuity.
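
The idea of actions narrowing the set of probable world states can be sketched as a small Bayesian update. Everything numeric here is invented for illustration: the three hand categories, the prior weights, and the chance that each category makes a big bet are hypothetical, and real ranges are far more granular.

```python
def update_range(prior: dict, likelihood: dict) -> dict:
    """Bayes' rule: P(hand | action) is proportional to P(action | hand) * P(hand)."""
    unnormalized = {hand: prior[hand] * likelihood[hand] for hand in prior}
    total = sum(unnormalized.values())
    return {hand: weight / total for hand, weight in unnormalized.items()}


# Hypothetical beliefs about an opponent before they act.
prior = {"strong": 0.2, "medium": 0.5, "weak": 0.3}

# Hypothetical chance that each kind of hand makes a big bet here.
big_bet_likelihood = {"strong": 0.8, "medium": 0.1, "weak": 0.4}

posterior = update_range(prior, big_bet_likelihood)
for hand, probability in posterior.items():
    print(f"{hand:>6}: {prior[hand]:.0%} before the bet, {probability:.0%} after")
```

With these made-up numbers, a big bet shifts weight toward both strong hands and bluffs and away from medium hands, which is the "story" a bluffer is trying to tell.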

Max Chiswick: Yeah, I think that was a great explanation.

Patrick McKenzie: There’s another interesting angle worth discussing: the interaction between poker sites and poker bots.

In the late 2000s, some researchers, particularly grad students, explored the idea of creating bots capable of playing poker better than humans. The appeal was clear—such bots could theoretically generate substantial profits by operating on poker sites.

But from the perspective of a poker site, there’s a balancing act. Their business model depends on extracting money from recreational players while maintaining the illusion that even unskilled players have a chance to win. If recreational players believe they have no shot, fewer of them will return, and the ecosystem collapses.

This led to an arms race between bot developers and poker sites. The sites implemented restrictions to limit the technological support players could use, trying to ensure games remained fair—or at least felt fair. It’s a fascinating commentary on how technology influences the social dynamics of games and commerce.

Interestingly, the conversation around tech-enhanced play has shifted over time. Today, even in live games, players seem to expect a level of technological support—tools like solvers and pre-game preparation—but this normalization of tech use has evolved alongside the game itself. It’s a classic case of how we shape our tools, and then our tools shape us and our norms.

As a side note, since we’ve been discussing statistical approaches to poker AI, I’m curious: given the broader AI revolution happening right now, especially with large language models (LLMs), has there been any meaningful cross-pollination between LLMs and poker AI?

Recent developments in poker AI education

Max Chiswick: That’s a good question. From what I’ve seen, there hasn’t been much crossover between LLMs and poker AI so far.

As I mentioned, by 2017, poker bots had defeated humans in one-on-one games. By 2019, bots were also beating humans in multiplayer formats, such as six-handed games. After these milestones, interest in poker AI seemed to decline somewhat—at least in terms of high-profile research or breakthroughs.

Modern AI techniques, including LLMs, could offer interesting new directions for poker AI, but I haven’t seen much evidence of that happening yet. One area that remains underdeveloped is opponent modeling. Most applications so far have focused on simplified scenarios, like learning how a static opponent plays. The potential to use advanced AI for more dynamic, real-time opponent modeling is exciting, but it hasn’t been fully explored yet.

Teaching programmers to build poker bots

Patrick McKenzie: So, what did you learn from running a course for programmers attempting to build poker bots?

Max Chiswick: Right. This summer, we ran a course called AI Poker Lab. The premise was that each week, we focused on a progressively more complex simplified poker game. We started with Kuhn Poker—a very basic game often used in academic settings—and worked our way up to Royal Hold’em, which is a scaled-down version of Texas Hold’em.

Each week, participants submitted a bot for the specified game, and we ran a round-robin competition where all the bots played against one another. One key challenge we identified was that it wasn’t quite satisfying or engaging enough for students to jump directly into a live competition without intermediate steps.

To address this, we introduced a progression of bots for participants to beat before entering the main competition. These pre-built bots varied in strength, giving students a stepping stone to develop and refine their strategies. This structure made the experience more rewarding and provided clear milestones to achieve.

We also introduced some of our own flawed or suboptimal bots into the main competition. This added a new layer of complexity—students not only had to optimize their bots for general play but also strategize to exploit the weaknesses in these suboptimal bots.

If the competition had been purely student vs. student, the winning strategy would often default to building the best Nash equilibrium bot—essentially whoever implemented the algorithm most effectively. But by adding exploitable bots, we encouraged students to think critically about problem-solving, strategy, and adaptability, rather than just chasing mathematical perfection.
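
The effect of adding a flawed bot to a round robin can be sketched with a much simpler game. The example below uses rock-paper-scissors as a stand-in, which is an assumption made for brevity since the course used poker variants. The three bots and their names are hypothetical, and payoffs are exact expected values rather than simulated matches.

```python
from itertools import combinations

# PAYOFF[i][j] is the row player's payoff for move i against move j.
# Moves: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_payoff(p, q):
    """Expected payoff to a bot mixing with p against a bot mixing with q."""
    return sum(p[i] * q[j] * PAYOFF[i][j] for i in range(3) for j in range(3))


bots = {
    "equilibrium": [1 / 3, 1 / 3, 1 / 3],  # unexploitable, but never exploits
    "rock_heavy": [0.6, 0.2, 0.2],         # deliberately flawed bot
    "exploiter": [0.0, 1.0, 0.0],          # always paper, the best reply to rock_heavy
}

scores = {name: 0.0 for name in bots}
for a, b in combinations(bots, 2):
    ev = expected_payoff(bots[a], bots[b])
    scores[a] += ev
    scores[b] -= ev

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:>12}: {score:+.2f}")
```

The equilibrium bot breaks even against everyone, so it cannot top this field. The bot that targets the flawed entrant wins the round robin, although an always-paper bot would itself be punished if an opponent adapted to it.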

Patrick McKenzie: That’s a smart approach. We did something similar during the Starfighter experience. There’s a meta-layer to consider in these types of educational setups: what you want to incentivize students to focus on, and the social dynamics within the course.

Poker, as you said, is inherently a player-vs-player game. But if you turn it into a strictly student-vs-student experience, it can create negative dynamics—students might be less inclined to share ideas or collaborate, and the environment becomes hyper-competitive in ways that aren’t always pedagogically beneficial.

By introducing what gamers might call a player vs. environment (PVE) component into the player vs. player (PVP) structure, you give students the chance to collaborate. For example, they can “share their cards,” metaphorically speaking, by discussing their approaches, trading insights, or even helping each other debug issues.

This fosters a richer learning environment, where students are teaching each other and exchanging information freely. It’s a more collaborative and engaging experience than a purely adversarial one, while still preserving the competitive aspects of the course. Did you find that dynamic worked well in AI Poker Lab?

Max Chiswick: Yep, exactly.

Patrick McKenzie: Cool. So Max, this has been an excellent conversation on poker, AI, and some related digressions. You’ve written about this quite extensively on the internet.

Where can people find out more about this topic?

Max Chiswick: Thank you. Yeah, it was great to be here. I am now working with a co-founder on a couple of things. So we have built overbet.ai, which has a few experiments in modeling poker computationally, and more to come. Separately, we have the Expected Value Foundation. That site has information about the poker camp courses that we ran, future ones, and some other non-profit initiatives about teaching through games.

My personal website is maxchiswick.com.

Patrick McKenzie: Thanks very much.

Max Chiswick: Thank you.
