Variant No. 001 · Findings — preliminary Reviewed by Jesse

First numbers are in.

A first batch of simulated hands has now been logged. The heuristic-only runs (the four fixed-strategy archetypes plus the Adaptive agent, with no learned policy yet) already produce what looks to me like a clear picture of the game's structural pull toward Low. The trained agent is still in preparation; what follows is what the heuristic ecology alone tells me.

The findings below are my own interpretations of simulation output. The numbers themselves are mechanical and reproducible from seeds; the readings on top of them — what each finding "means," whether a Piggy declaration is structurally unprofitable, which behaviours a trained agent ought to learn — are mine. The other regulars at the table, who have been playing the game far longer than I have, may disagree.

Run conditions

Hands simulated: 2,000 at a 7-seat table.
Lineup: one Adaptive, two High-Seekers, two Low-Hoarders, one Piggy-Hunter, one Denialist.
Stack: 500 tokens per seat per hand (each hand starts fresh).
Seeds: fixed and reproducible.
Compute: ~3 hands/sec, single-machine Python.

These findings should be considered preliminary: the sample is large enough to see effects but small enough that we expect 5–10% wobble in EV numbers between seeds.
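The run conditions above can be written down as a small config object. This is a sketch only: `RunConfig` and its field names are hypothetical and are not taken from the simulator's code. It checks that the lineup fills the table and estimates wall-clock time from the quoted throughput.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the run conditions quoted in the text."""
    hands: int = 2_000
    seats: int = 7
    stack_tokens: int = 500          # per seat, reset every hand
    hands_per_second: float = 3.0    # approximate throughput from the text
    lineup: tuple = (
        "Adaptive",
        "High-Seeker", "High-Seeker",
        "Low-Hoarder", "Low-Hoarder",
        "Piggy-Hunter",
        "Denialist",
    )

    def validate(self) -> None:
        # The lineup must fill every seat exactly once.
        if len(self.lineup) != self.seats:
            raise ValueError(
                f"lineup has {len(self.lineup)} agents for {self.seats} seats"
            )

    def estimated_minutes(self) -> float:
        # Wall-clock estimate at the quoted throughput.
        return self.hands / self.hands_per_second / 60


config = RunConfig()
config.validate()
lineup_counts = Counter(config.lineup)
print(lineup_counts)
print(f"about {config.estimated_minutes():.1f} minutes per run")
```

At roughly 3 hands/sec, one 2,000-hand run takes about eleven minutes, so repeating it over a handful of seeds to measure the wobble directly is cheap.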


Finding 1. Low-Hoarder wins the table by a wide margin

Across both variants the Low-Hoarder is the only archetype with positive expected value. Everyone else loses on net — the Low-Hoarders are paying the bills for the rest of the table.

Archetype      EV per hand (cents)   Notes
Low-Hoarder    +106.45               Only positive style on the table
Denialist      −18.13                Defensive; least bleeding overall
High-Seeker    −26.11                Competes with itself for the high pot
Adaptive       −49.93                Hurt by occasional Piggy declarations
Piggy-Hunter   −96.22                Busts on 94% of Piggy declarations

The spread between best and worst archetype is roughly 200 cents per hand — meaningful at the table's actual stakes (5¢ antes, 10¢ betting increments, $5 buy-in). Six betting rounds let losers commit a lot of chips to losing hands, and a Low-Hoarder against a Piggy-Hunter is, mechanically, a generous arrangement for the Low-Hoarder.
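The arithmetic behind the spread can be reproduced from the table. The sketch below also computes a seat-weighted net across the lineup, which is a sanity check only under two assumptions the page does not state: that each EV figure is per seat, and that the pot is fully redistributed among the seven seats. Variable names are illustrative.

```python
# EV per hand in cents, copied from the Finding 1 table.
ev_cents = {
    "Low-Hoarder": 106.45,
    "Denialist": -18.13,
    "High-Seeker": -26.11,
    "Adaptive": -49.93,
    "Piggy-Hunter": -96.22,
}

# Seats per archetype, from the run conditions.
seats = {
    "Low-Hoarder": 2,
    "Denialist": 1,
    "High-Seeker": 2,
    "Adaptive": 1,
    "Piggy-Hunter": 1,
}

# Best archetype minus worst archetype.
spread = max(ev_cents.values()) - min(ev_cents.values())

# Seat-weighted sum; assumes the EV figures are per seat (not stated).
table_net = sum(ev_cents[name] * seats[name] for name in ev_cents)

print(round(spread, 2))     # 202.67
print(round(table_net, 2))  # -3.6
```

The net is small next to the individual EVs. Whether the residual reflects rounding, how the per-archetype averages were taken, or a rule not described on this page is not something the table alone can settle.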


Finding 2. The six-four is the modal low

Of all clean (no-pair) low hands that reach showdown, just over 80% are A–2–3–4–6 — the six-four. The next-best class — one off, meaning the hand contains exactly four of the five six-four ranks and one substitute — is another 17%. Everything else is rare.

Distance from the six-four          Frequency
0 (the six-four exactly)            80.6%
1 off                               16.8%
2 off                               2.6%
3+ off, or paired/straight/flush    <0.1%
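The distance measure in the table can be sketched as a set difference: count how many of the five six-four ranks are absent from a clean five-card low. The function name and the single-character rank encoding (`T` for ten) are assumptions for illustration, and wilds are assumed to have already been resolved to concrete ranks.

```python
SIX_FOUR = frozenset("A2346")


def distance_from_six_four(ranks: str) -> int:
    """Number of six-four ranks missing from a clean (no-pair) five-card low.

    `ranks` is a string of five distinct rank characters, e.g. "A2346".
    This helper does not check for straights or flushes.
    """
    rank_set = set(ranks)
    if len(ranks) != 5 or len(rank_set) != 5:
        raise ValueError("expected five distinct ranks (a clean low)")
    # Each missing target rank corresponds to exactly one substitute card.
    return len(SIX_FOUR - rank_set)


print(distance_from_six_four("A2346"))  # 0, the six-four itself
print(distance_from_six_four("A2356"))  # 1, the 5 substitutes for the 4
print(distance_from_six_four("A235T"))  # 2, both the 4 and the 6 are missing
```

Because a clean low has five distinct ranks, the number of missing target ranks always equals the number of substitute cards, which matches the "one off" definition given above.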

This is the empirical confirmation of the safe-Low player's central insight. With twelve native wilds in the deck and four passing rounds to acquire them, a player who actively builds toward the six-four has roughly an 80% chance of arriving exactly there by showdown. The implication is twofold:


How the six-four gets built — three traces

The aggregate frequencies above are easier to read once you see what a hand actually looks like as it walks through the four passes. Below are three Low-Hoarder hands — pulled directly from the simulator, with seeds — chosen to cover the spectrum from a weak starting position (only one of the six-four ranks at the deal) to a strong one (four of the five already in hand before the first pass).

Cards are colour-coded: red for hearts and diamonds, black for clubs and spades, and inverted for the native wild cards (7s, 8s, and 9s). The "X of 5 six-four ranks" counter is the number of distinct target ranks (A, 2, 3, 4, 6) currently in the hand. Wilds are counted separately because they can stand in for whichever six-four rank the hand is missing at showdown.
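The counter described above is simple enough to sketch. The helper names and the card encoding (rank character followed by suit symbol, as printed in the traces) are my own assumptions, not the simulator's API. The second helper encodes the observation that a seven-card hand can resolve to the six-four whenever it holds at least one wild per missing target rank.

```python
SIX_FOUR_RANKS = frozenset("A2346")
WILD_RANKS = frozenset("789")  # the native wilds: 7s, 8s, and 9s


def summarize(hand: str) -> tuple:
    """Return (distinct six-four ranks held, wild cards held)."""
    ranks = [card[0] for card in hand.split()]
    targets = {r for r in ranks if r in SIX_FOUR_RANKS}
    wilds = sum(1 for r in ranks if r in WILD_RANKS)
    return len(targets), wilds


def resolves_to_six_four(hand: str) -> bool:
    """True if the wilds can cover every missing six-four rank."""
    targets, wilds = summarize(hand)
    return wilds >= len(SIX_FOUR_RANKS) - targets


# Hand 1 at the deal and at showdown, copied from the trace.
print(summarize("5♣ 7♣ 9♠ Q♣ 7♦ 4♠ T♣"))             # (1, 3)
print(summarize("9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥"))             # (2, 3)
print(resolves_to_six_four("9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥"))  # True
```

Duplicate ranks count once toward the target total, which is why Hand 1 shows 2 of 5 at showdown despite holding two 4s.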

What to watch for

The strategic point of the passing phase is to never need more than one wild per missing six-four rank. The Low-Hoarder ships face cards in the first pass (when four cards move) and progressively narrows down toward the target set. By the end of pass 1 (the one-card swap), the hand has usually converged — even when, as in Hand 1 below, the starting position looked nothing like a low hand.

Hand 1. weak start — 1/5 six-four ranks + 3 wilds at the deal

Almost none of the target ranks present, but three native wilds (the two 7s and the 9) give the hand all the flexibility it needs. Wilds will eventually stand in for whatever six-four ranks the passes fail to deliver.

Dealt
5♣ 7♣ 9♠ Q♣ 7♦ 4♠ T♣
1 of 5 six-four ranks · 3 wilds
Pass 4 to seat 3 · from seat 2
Out 5♣ Q♣ T♣ 7♣
In 5♦ K♥ 6♠ 8♥
After
9♠ 7♦ 4♠ 5♦ K♥ 6♠ 8♥
2 of 5 six-four ranks · 3 wilds
Pass 3 to seat 2 · from seat 3
Out 5♦ K♥ 6♠
In 5♣ T♣ Q♣
After
9♠ 7♦ 4♠ 8♥ 5♣ T♣ Q♣
1 of 5 six-four ranks · 3 wilds
Pass 2 to seat 1 · from seat 4
Out 5♣ Q♣
In 3♦ 4♥
After
9♠ 7♦ 4♠ 8♥ T♣ 3♦ 4♥
2 of 5 six-four ranks · 3 wilds
Pass 1 to seat 0 · from seat 5
Out T♣
In 5♥
After
9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥
2 of 5 six-four ranks · 3 wilds
Final
9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥
best low resolves to A–2–3–4–6

Result: won (or split) the low pot at showdown.

Hand 2. mid start — 2/5 six-four ranks + 0 wilds at the deal

Two of the five six-four ranks already on board and zero wilds: a typical realistic starting position for a Low-Hoarder. The work of the four-pass sequence is to acquire the missing ranks and (ideally) some wilds, while shedding the face cards as fast as possible.

Dealt
4♦ Q♥ K♥ T♦ K♦ J♣ 3♠
2 of 5 six-four ranks · 0 wilds
Pass 4 to seat 3 · from seat 2
Out K♥ K♦ Q♥ J♣
In T♠ 8♠ 7♦ 8♦
After
4♦ T♦ 3♠ T♠ 8♠ 7♦ 8♦
2 of 5 six-four ranks · 3 wilds
Pass 3 to seat 2 · from seat 3
Out T♦ T♠ 8♠
In J♣ K♥ K♦
After
4♦ 3♠ 7♦ 8♦ J♣ K♥ K♦
2 of 5 six-four ranks · 2 wilds
Pass 2 to seat 1 · from seat 4
Out K♥ K♦
In 2♣ 3♣
After
4♦ 3♠ 7♦ 8♦ J♣ 2♣ 3♣
3 of 5 six-four ranks · 2 wilds
Pass 1 to seat 0 · from seat 5
Out J♣
In 5♥
After
4♦ 3♠ 7♦ 8♦ 2♣ 3♣ 5♥
3 of 5 six-four ranks · 2 wilds
Final
4♦ 3♠ 7♦ 8♦ 2♣ 3♣ 5♥
best low resolves to A–2–3–4–6

Result: won (or split) the low pot at showdown.

Hand 3. strong start — 4/5 six-four ranks + 2 wilds at the deal

An almost-finished hand at the deal: four of the five six-four ranks are present and there are two native wilds in hand already. Passing here is about consolidation and not giving the opponents any signal about what's coming.

Dealt
2♥ 7♥ A♠ T♣ 3♣ 9♥ 6♣
4 of 5 six-four ranks · 2 wilds
Pass 4 to seat 3 · from seat 2
Out T♣ 6♣ 7♥ 9♥
In 5♦ 5♠ 5♥ K♣
After
2♥ A♠ 3♣ 5♦ 5♠ 5♥ K♣
3 of 5 six-four ranks · 0 wilds
Pass 3 to seat 2 · from seat 3
Out 5♦ 5♠ 5♥
In T♣ Q♦ 6♣
After
2♥ A♠ 3♣ K♣ T♣ Q♦ 6♣
4 of 5 six-four ranks · 0 wilds
Pass 2 to seat 1 · from seat 4
Out K♣ Q♦
In 4♠ 4♦
After
2♥ A♠ 3♣ T♣ 6♣ 4♠ 4♦
5 of 5 six-four ranks · 0 wilds
Pass 1 to seat 0 · from seat 5
Out T♣
In 2♣
After
2♥ A♠ 3♣ 6♣ 4♠ 4♦ 2♣
5 of 5 six-four ranks · 0 wilds
Final
2♥ A♠ 3♣ 6♣ 4♠ 4♦ 2♣
best low resolves to A–2–3–4–6

Result: won (or split) the low pot at showdown.


Finding 3. Five of a Kind is the modal winning High

In a standard 52-card poker game, Five of a Kind is impossible. Here it is the most common High-hand category at showdown, accounting for roughly one in three hands.

Category           Frequency
Five of a Kind     32.8%
Four of a Kind     30.1%
Straight Flush     13.2%
Straight           10.8%
Full House         7.6%
Flush              2.4%
Three of a Kind    2.0%
Two Pair           0.9%
One Pair           0.2%

A High declaration of Four of a Kind, in this game, is the rough strategic equivalent of holding two pair in Texas hold'em: second-best and frequently in trouble. The practical implication is that a marginal High hand should fold to pressure unless it can plausibly reach Five of a Kind.


Finding 4. Piggy is structurally unprofitable for the heuristic agents

The Piggy-Hunter archetype declares Piggy whenever its hand has visible split potential — meaning a high category of three of a kind or better and a clean low. By construction it declares Piggy frequently; the empirical result is that 94% of those declarations bust.
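The declaration rule described above can be sketched as a one-line predicate. The enum and function names are hypothetical, and the category ordering is the conventional one with Five of a Kind on top, which is an assumption; only the boundary at three of a kind matters for this rule.

```python
from enum import IntEnum


class HighCategory(IntEnum):
    """Assumed ordering; the page does not list this game's exact ranking."""
    HIGH_CARD = 0
    ONE_PAIR = 1
    TWO_PAIR = 2
    THREE_OF_A_KIND = 3
    STRAIGHT = 4
    FLUSH = 5
    FULL_HOUSE = 6
    FOUR_OF_A_KIND = 7
    STRAIGHT_FLUSH = 8
    FIVE_OF_A_KIND = 9


def piggy_hunter_declares_piggy(high: HighCategory, has_clean_low: bool) -> bool:
    """Piggy-Hunter trigger: three of a kind or better, plus a clean low."""
    return has_clean_low and high >= HighCategory.THREE_OF_A_KIND


print(piggy_hunter_declares_piggy(HighCategory.THREE_OF_A_KIND, True))  # True
print(piggy_hunter_declares_piggy(HighCategory.TWO_PAIR, True))         # False
print(piggy_hunter_declares_piggy(HighCategory.FIVE_OF_A_KIND, False))  # False
```

Nothing in the predicate looks at the other seats, so the rule fires just as readily at a table full of Low-Hoarders as at one without any.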

Across the 2,000-hand sample, the Piggy-Hunter declared Piggy in roughly 95% of its hands and won outright in fewer than 6%. The rest were wiped out by a tie on one side or the other, usually the low side, where the prevalence of the six-four makes ties so common.

This is one of the cleanest "the trained agent should learn this" findings we have so far: even with a strong-looking hand, declaring Piggy against a table containing one or more Low-Hoarders is a money-burning move. The trained agent should learn to declare Piggy only when its read of the table makes a Low-Hoarder presence implausible.


Finding 5. The reveal rounds aren't helping the heuristic agents

Six progressive-reveal rounds give every player six discrete moments to update on visible cards. None of the four fixed archetypes, nor the Adaptive baseline, actually uses that information to change its declaration. Each commits to a direction at the deal (or, in the Adaptive's case, by pass 2) and sticks to it through every reveal.

The result is that the betting rounds function mostly as a chip-extraction mechanism, not a strategic surface. The Low-Hoarders accumulate; the Piggy-Hunters bleed; the rest of the table commits chips in proportion to how confident they are in a plan they will not change. The progressive reveals only start paying for themselves when an agent capable of updating its direction mid-reveal sits at the table. We expect the trained agent to exploit this; the current archetypes do not.


Status of the trained agent

The Adaptive agent is the baseline pre-RL deliverable. It is intentionally simpler than the eventual trained agent — its direction-pivoting heuristic is hand-tuned, not learned, and its Piggy threshold is too generous (which is why its EV underperforms the simpler High-Seeker). It exists to provide a fifth characteristic-but-non-trivial style for the training loop to train against.

The next deliverable is the PPO-trained policy, with the Adaptive plus the four fixed archetypes serving as its fixed-mix opponents. Once that training run completes, we will publish a sixth-archetype addition to this findings page, and ideally a more interesting set of results — the kind that follow from an agent that can actually update its declaration mid-hand based on the reveal stream.


Methodology

Every hand is logged at the per-(hand, seat) level: agent label, declaration, hand category, low class and distance, win/loss direction, contribution, payout, delta. Random seeds are fixed per run and rotated across runs. The simulator and analysis code are open source and will be linked here when we finalise the first batch of formal findings. Until then, the numbers on this page can be considered honest but not yet peer-reviewed.
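A per-(hand, seat) log of this shape is enough to recompute the EV table. The sketch below is illustrative only: `SeatRecord`, its field names, and the definition of delta as payout minus contribution are assumptions, several logged fields are omitted for brevity, and the three sample rows are made-up values, not simulator output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatRecord:
    """Hypothetical subset of one per-(hand, seat) log row."""
    hand_id: int
    seat: int
    agent: str
    declaration: str
    contribution: int  # cents put into the pot
    payout: int        # cents taken out of the pot

    @property
    def delta(self) -> int:
        # Assumed definition: net result for this seat on this hand.
        return self.payout - self.contribution


def ev_per_hand(records) -> dict:
    """Mean delta per logged (hand, seat) row, grouped by agent label."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for record in records:
        totals[record.agent] += record.delta
        counts[record.agent] += 1
    return {agent: totals[agent] / counts[agent] for agent in totals}


# Made-up rows purely to exercise the aggregation.
sample = [
    SeatRecord(0, 0, "Low-Hoarder", "low", contribution=40, payout=90),
    SeatRecord(0, 1, "Piggy-Hunter", "piggy", contribution=50, payout=0),
    SeatRecord(1, 0, "Low-Hoarder", "low", contribution=30, payout=30),
]
print(ev_per_hand(sample))  # {'Low-Hoarder': 25.0, 'Piggy-Hunter': -50.0}
```

Grouping by agent label rather than by seat means the two Low-Hoarder seats pool into one average, which is one plausible reading of how a per-archetype EV would be produced.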

For methodology principles — what counts as a finding, how we treat uncertainty, the criteria for promoting an empirical observation to a stated result — see the About page.