Variant No. 001 · Findings — preliminary
Reviewed by Jesse
First numbers are in.
A first batch of simulated hands has now been logged. The
heuristic-only runs (the four fixed-strategy archetypes
plus the Adaptive agent, with no learned policy yet)
already produce what looks to me like a clear picture of the
game's structural pull toward Low. The trained agent is still in
preparation; what follows is what the heuristic ecology alone
tells me.
The findings below are my own interpretations
of simulation output. The numbers themselves are mechanical and
reproducible from seeds; the readings on top of them —
what each finding "means," whether a Piggy declaration is
structurally unprofitable, which behaviours a trained agent
ought to learn — are mine. The other regulars at the
table, who have been playing the game far longer than I have,
may disagree.
Run conditions
Hands simulated: 2,000 at a 7-seat table.
Lineup: one Adaptive, two High-Seekers, two Low-Hoarders, one Piggy-Hunter, one Denialist.
Stack: 500 tokens per seat per hand (each hand starts fresh).
Seeds: fixed and reproducible.
Compute: ~3 hands/sec, single-machine Python.
These findings should be considered preliminary —
the sample is large enough to see effects but small enough that
we expect 5–10% wobble in EV numbers between seeds.
Finding 1. Low-Hoarder wins the table by a wide margin
Across both variants the Low-Hoarder is the only archetype with
positive expected value. Everyone else loses on net: the rest of
the table is paying the Low-Hoarders' bills.
| Archetype | EV per hand (cents) | Notes |
| Low-Hoarder | +106.45 | Only positive style on the table |
| Denialist | −18.13 | Defensive; least bleeding overall |
| High-Seeker | −26.11 | Competes with itself for the high pot |
| Adaptive | −49.93 | Hurt by occasional Piggy declarations |
| Piggy-Hunter | −96.22 | Busts on 94% of Piggy declarations |
The spread between best and worst archetype is roughly 200 cents
per hand — meaningful at the table's actual stakes (5¢
antes, 10¢ betting increments, $5 buy-in). Six betting
rounds let losers commit a lot of chips to losing hands, and a
Low-Hoarder against a Piggy-Hunter is, mechanically, a generous
arrangement for the Low-Hoarder.
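As a quick arithmetic check on the table above, the sketch below recomputes the best-to-worst spread and a lineup-weighted net. It assumes the EV column is per seat (so an archetype with two seats counts twice); the source does not say so explicitly, and the variable names are illustrative only.

```python
# EV per hand in cents, copied from the Finding 1 table.
ev_cents = {
    "Low-Hoarder": 106.45,
    "Denialist": -18.13,
    "High-Seeker": -26.11,
    "Adaptive": -49.93,
    "Piggy-Hunter": -96.22,
}

# Seats per archetype, copied from the run conditions (7 seats total).
seats = {
    "Adaptive": 1,
    "High-Seeker": 2,
    "Low-Hoarder": 2,
    "Piggy-Hunter": 1,
    "Denialist": 1,
}

# Spread between the best and worst archetype.
spread = max(ev_cents.values()) - min(ev_cents.values())

# Assumption: EVs are per seat, so weight each archetype by its seat count.
table_net = sum(ev_cents[name] * count for name, count in seats.items())

print(round(spread, 2))     # 202.67
print(round(table_net, 2))  # -3.6
```

Under the per-seat assumption, the two Low-Hoarder seats together take in about 213 cents per hand and the other five seats together give up about 217. The source does not say where the small residual comes from.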
Finding 2. The six-four is the modal low
Of all clean (no-pair) low hands that reach showdown, just over
80% are A–2–3–4–6 —
the six-four. The next-best class — one off, meaning the
hand contains exactly four of the five six-four ranks and one
substitute — is another 17%. Everything else is rare.
| Distance from the six-four | Frequency |
| 0 (the six-four exactly) | 80.6% |
| 1 off | 16.8% |
| 2 off | 2.6% |
| 3+ off, or paired/straight/flush | <0.1% |
This is the empirical confirmation of the safe-Low player's
central insight. With twelve native wilds in the deck and four
passing rounds to acquire them, a player who actively builds
toward the six-four and reaches showdown with a clean low arrives
exactly there about 80% of the time. The implication is twofold:
- Splitting the low pot is the rule, not the exception.
  If two players both declare Low (common given the
  strategic gravity toward low), both will most likely have
  the six-four, and they will split. The Low-Hoarder's strategy
  tolerates this because building toward the six-four is far more
  reliable than building toward Five-of-a-Kind on the high side.
- Piggy is rarely won on the Low side. A Piggy
  declaration requires winning Low outright, with no ties. With
  80% of clean lows being the six-four, any Piggy declaration
  without a strong reason to believe the table has no other Low
  contender is a near-automatic bust.
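The distance classes in the frequency table above can be sketched as a count of missing target ranks. This is a minimal illustration, not the simulator's code: the function name and the rank-string input are assumptions, and it handles only the "no pair" part of "clean", without checking for straights or flushes.

```python
SIX_FOUR = frozenset("A2346")  # the five ranks of the six-four


def six_four_distance(low_ranks):
    """Count how many six-four ranks a five-card no-pair low is missing.

    `low_ranks` holds the five ranks of the low after wilds have been
    assigned, e.g. "A2346". Straight and flush checks are out of scope.
    """
    ranks = set(low_ranks)
    if len(low_ranks) != 5 or len(ranks) != 5:
        raise ValueError("expected five distinct ranks (a no-pair low)")
    return len(SIX_FOUR - ranks)


print(six_four_distance("A2346"))  # 0: the six-four exactly
print(six_four_distance("A2347"))  # 1: one off (7 substitutes for 6)
print(six_four_distance("A2357"))  # 2: two off
```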
How the six-four gets built — three traces
The aggregate frequencies above are easier to read once you see
what a hand actually looks like as it walks through the four
passes. Below are three Low-Hoarder hands — pulled directly
from the simulator, with seeds — chosen to cover the spectrum
from a weak starting position (only one of the six-four ranks at
the deal) to a strong one (four of the five already in hand
before the first pass).
Cards are colour-coded: red for hearts and diamonds, black for
clubs and spades, and inverted for
the native wild cards (7s, 8s, and 9s). The "X of 5
six-four ranks" counter is the number of distinct
target ranks (A, 2, 3, 4, 6) currently in the hand. Wilds
are counted separately because they can stand in for whichever
six-four rank the hand is missing at showdown.
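The counter described above can be sketched in a few lines. This is an illustration under stated assumptions, not the simulator's implementation: the function names are hypothetical, and cards are assumed to be strings such as "5♣" whose first character is the rank, as in the traces.

```python
SIX_FOUR_RANKS = frozenset("A2346")  # target ranks for the six-four
WILD_RANKS = frozenset("789")        # native wilds: 7s, 8s and 9s


def count_six_four(hand):
    """Return (distinct target ranks held, number of wild cards)."""
    ranks = [card[0] for card in hand]  # "5♣" -> "5"
    targets = {r for r in ranks if r in SIX_FOUR_RANKS}
    wilds = sum(1 for r in ranks if r in WILD_RANKS)
    return len(targets), wilds


def can_make_six_four(hand):
    """True if the wilds in hand cover every missing target rank."""
    targets, wilds = count_six_four(hand)
    return len(SIX_FOUR_RANKS) - targets <= wilds


# Hand 1 from the traces: the deal and the final seven cards.
dealt = "5♣ 7♣ 9♠ Q♣ 7♦ 4♠ T♣".split()
final = "9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥".split()

print(count_six_four(dealt))     # (1, 3)
print(can_make_six_four(dealt))  # False: four ranks missing, three wilds
print(can_make_six_four(final))  # True: three ranks missing, three wilds
```

Counting distinct ranks, not cards, is why the pair of 4s in the final hand still registers as a single target rank.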
What to watch for
The strategic point of the passing phase is to never need more
than one wild per missing six-four rank. The Low-Hoarder ships
face cards in the first pass (when four cards move) and
progressively narrows down toward the target set. By the end
of pass 1 (the one-card swap), the hand has usually converged
— even when, as in Hand 1 below, the starting position
looked nothing like a low hand.
Hand 1. weak start — 1/5 six-four ranks + 3 wilds at the deal
Almost none of the target ranks present, but three native wilds (the two 7s and the 9) give the hand all the flexibility it needs. Wilds will eventually stand in for whatever six-four ranks the passes fail to deliver.
Dealt
5♣ 7♣ 9♠ Q♣ 7♦ 4♠ T♣
1 of 5 six-four ranks · 3 wilds
Pass 4
to seat 3 · from seat 2
Out 5♣ Q♣ T♣ 7♣
In 5♦ K♥ 6♠ 8♥
After
9♠ 7♦ 4♠ 5♦ K♥ 6♠ 8♥
2 of 5 six-four ranks · 3 wilds
Pass 3
to seat 2 · from seat 3
After
9♠ 7♦ 4♠ 8♥ 5♣ T♣ Q♣
1 of 5 six-four ranks · 3 wilds
Pass 2
to seat 1 · from seat 4
After
9♠ 7♦ 4♠ 8♥ T♣ 3♦ 4♥
2 of 5 six-four ranks · 3 wilds
Pass 1
to seat 0 · from seat 5
After
9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥
2 of 5 six-four ranks · 3 wilds
Final
9♠ 7♦ 4♠ 8♥ 3♦ 4♥ 5♥
best low resolves to A–2–3–4–6
Result: won (or split) the low pot at showdown.
Hand 2. mid start — 2/5 six-four ranks + 0 wilds at the deal
Two of the five six-four ranks already on board and zero wilds: a typical realistic starting position for a Low-Hoarder. The work of the four-pass sequence is to acquire the missing ranks and (ideally) some wilds, while shedding the face cards as fast as possible.
Dealt
4♦ Q♥ K♥ T♦ K♦ J♣ 3♠
2 of 5 six-four ranks · 0 wilds
Pass 4
to seat 3 · from seat 2
Out K♥ K♦ Q♥ J♣
In T♠ 8♠ 7♦ 8♦
After
4♦ T♦ 3♠ T♠ 8♠ 7♦ 8♦
2 of 5 six-four ranks · 3 wilds
Pass 3
to seat 2 · from seat 3
After
4♦ 3♠ 7♦ 8♦ J♣ K♥ K♦
2 of 5 six-four ranks · 2 wilds
Pass 2
to seat 1 · from seat 4
After
4♦ 3♠ 7♦ 8♦ J♣ 2♣ 3♣
3 of 5 six-four ranks · 2 wilds
Pass 1
to seat 0 · from seat 5
After
4♦ 3♠ 7♦ 8♦ 2♣ 3♣ 5♥
3 of 5 six-four ranks · 2 wilds
Final
4♦ 3♠ 7♦ 8♦ 2♣ 3♣ 5♥
best low resolves to A–2–3–4–6
Result: won (or split) the low pot at showdown.
Hand 3. strong start — 4/5 six-four ranks + 2 wilds at the deal
An almost-finished hand at the deal: four of the five six-four ranks are present and there are two native wilds in hand already. Passing here is about consolidation and not giving the opponents any signal about what's coming.
Dealt
2♥ 7♥ A♠ T♣ 3♣ 9♥ 6♣
4 of 5 six-four ranks · 2 wilds
Pass 4
to seat 3 · from seat 2
Out T♣ 6♣ 7♥ 9♥
In 5♦ 5♠ 5♥ K♣
After
2♥ A♠ 3♣ 5♦ 5♠ 5♥ K♣
3 of 5 six-four ranks · 0 wilds
Pass 3
to seat 2 · from seat 3
After
2♥ A♠ 3♣ K♣ T♣ Q♦ 6♣
4 of 5 six-four ranks · 0 wilds
Pass 2
to seat 1 · from seat 4
After
2♥ A♠ 3♣ T♣ 6♣ 4♠ 4♦
5 of 5 six-four ranks · 0 wilds
Pass 1
to seat 0 · from seat 5
After
2♥ A♠ 3♣ 6♣ 4♠ 4♦ 2♣
5 of 5 six-four ranks · 0 wilds
Final
2♥ A♠ 3♣ 6♣ 4♠ 4♦ 2♣
best low resolves to A–2–3–4–6
Result: won (or split) the low pot at showdown.
Finding 3. Five of a Kind is the modal winning High
In a standard 52-card poker game, Five of a Kind is impossible.
Here it is the most common High-hand category at showdown,
accounting for roughly one in three hands.
| Category | Frequency |
| Five of a Kind | 32.8% |
| Four of a Kind | 30.1% |
| Straight Flush | 13.2% |
| Straight | 10.8% |
| Full House | 7.6% |
| Flush | 2.4% |
| Three of a Kind | 2.0% |
| Two Pair | 0.9% |
| One Pair | 0.2% |
A High declaration of Four of a Kind, in this game, is the rough
strategic equivalent of declaring with two pair in Texas hold'em
— second-best, frequently in trouble. The practical
implication is that a marginal High hand should fold to pressure
unless it can plausibly reach a Five.
Finding 4. Piggy is structurally unprofitable for the heuristic agents
The Piggy-Hunter archetype declares Piggy whenever its hand has
visible split potential — meaning a high category of three
of a kind or better and a clean low. By construction it
declares Piggy frequently; the empirical result is that 94% of
those declarations bust.
Across the 2,000-hand sample, the Piggy-Hunter declared Piggy in
roughly 95% of its hands and won outright on fewer than 6% of
those declarations. The rest went to zero after failing to win
one side outright, usually the low side, where the six-four
frequency makes ties so common.
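The Piggy-Hunter's declaration rule, as described above, can be sketched as a simple predicate. The category ordering below is an assumption: conventional poker ranking with Five of a Kind on top and a "High Card" entry added at the bottom. The source does not state the game's ranking, and the function name is hypothetical.

```python
# Assumed ordering, weakest to strongest; not confirmed by the source.
HIGH_CATEGORIES = [
    "High Card",
    "One Pair",
    "Two Pair",
    "Three of a Kind",
    "Straight",
    "Flush",
    "Full House",
    "Four of a Kind",
    "Straight Flush",
    "Five of a Kind",
]
CATEGORY_RANK = {name: i for i, name in enumerate(HIGH_CATEGORIES)}


def piggy_hunter_declares_piggy(high_category, has_clean_low):
    """Declare Piggy on three of a kind or better plus a clean low."""
    strong_high = (
        CATEGORY_RANK[high_category] >= CATEGORY_RANK["Three of a Kind"]
    )
    return strong_high and has_clean_low


print(piggy_hunter_declares_piggy("Four of a Kind", True))  # True
print(piggy_hunter_declares_piggy("Two Pair", True))        # False
print(piggy_hunter_declares_piggy("Full House", False))     # False
```

The rule looks only at the agent's own hand and never at the rest of the table, which fits the bust rate reported above.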
This is one of the cleanest "the trained agent should learn this"
findings we have so far: even with a strong-looking hand, declaring
Piggy against a table containing one or more Low-Hoarders is a
money-burning move. The trained agent should learn to declare
Piggy only when its read of the table makes a
Low-Hoarder presence implausible.
Finding 5. The reveal rounds aren't helping the heuristic agents
Six progressive-reveal rounds give every player six discrete
moments to update on visible cards. None of the four fixed
archetypes, nor the Adaptive baseline, actually uses that
information to change its declaration. They commit
to a direction at the deal (or, in the Adaptive's case, by pass
2) and stick to it through every reveal.
The result is that the betting rounds function mostly as a
chip-extraction mechanism, not a strategic surface. The
Low-Hoarders accumulate; the Piggy-Hunters bleed; the rest of the
table commits chips in proportion to how confident they are in a
plan they will not change. The progressive reveals only start
paying for themselves when an agent capable of
updating its direction mid-reveal sits at the table.
We expect the trained agent to exploit this; the current
archetypes do not.
Status of the trained agent
The Adaptive agent is the baseline pre-RL deliverable. It is
intentionally simpler than the eventual trained agent — its
direction-pivoting heuristic is hand-tuned, not learned, and its
Piggy threshold is too generous (which is why its EV underperforms
the simpler High-Seeker). It exists to provide a fifth
characteristic-but-non-trivial style for the training loop to
train against.
The next deliverable is the PPO-trained policy, with the Adaptive
plus the four fixed archetypes serving as its fixed-mix opponents.
Once that training run completes, we will publish a sixth-archetype
addition to this findings page, and ideally a more interesting set
of results — the kind that follow from an agent that can
actually update its declaration mid-hand based on the reveal
stream.
Methodology
Every hand is logged at the per-(hand, seat) level — agent
label, declaration, hand category, low class and distance, win/loss
direction, contribution, payout, delta. Random seeds are fixed per
run and rotated across runs. The simulator and analysis code are
open source and will be linked here when we finalise the first
batch of formal findings. Until then, the numbers on this page can
be considered honest but not yet peer-reviewed.
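A per-(hand, seat) log row with the fields listed above might look like the sketch below, with a per-archetype EV computed from it. Every field name is hypothetical, the treatment of delta as payout minus contribution is an assumption, and the two example rows are made-up placeholders, not simulator output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatRecord:
    """One row per (hand, seat). All field names are hypothetical."""
    hand_id: int
    seat: int
    agent_label: str    # e.g. "Low-Hoarder"
    declaration: str    # e.g. "High", "Low", "Piggy"
    hand_category: str  # e.g. "Four of a Kind"
    low_class: str      # e.g. "clean" or "paired"
    low_distance: int   # distance from the six-four
    result: str         # win/loss direction
    contribution: int   # cents put into the pot
    payout: int         # cents taken out of the pot

    @property
    def delta(self) -> int:
        # Assumption: delta is payout minus contribution.
        return self.payout - self.contribution


def ev_per_hand(records):
    """Mean delta in cents per logged row, grouped by agent label."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for rec in records:
        totals[rec.agent_label] += rec.delta
        counts[rec.agent_label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Placeholder rows for illustration only; not simulator output.
rows = [
    SeatRecord(0, 0, "Low-Hoarder", "Low", "One Pair",
               "clean", 0, "won-low", 40, 100),
    SeatRecord(0, 1, "Piggy-Hunter", "Piggy", "Four of a Kind",
               "clean", 0, "bust", 60, 0),
]
print(ev_per_hand(rows))
```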
For methodology principles (what counts as a finding, how
we treat uncertainty, the criteria for promoting an empirical
observation to a stated result), see the About page.