3 Coins, Tails Always Even: Is It 0 or 1/13?
The puzzle
3 coins are flipped. Each coin has P(heads) = 1/3. The number of tails is always even. What’s P(all heads)?
Simple-looking problem. Two mathematically valid answers. The difference is one word.
Answer 1: 1/13 (conditional probability)
If “the number of tails is always even” is an observation — someone looked at the flip and told you the result happened to have an even number of tails — this is a conditional probability problem.
Full sample space with P(H) = 1/3, P(T) = 2/3:
| Outcome | Tails | Probability |
|---|---|---|
| HHH | 0 ✓ even | (1/3)³ = 1/27 |
| HHT, HTH, THH | 1 odd | (1/3)²(2/3) = 2/27 each |
| HTT, THT, TTH | 2 ✓ even | (1/3)(2/3)² = 4/27 each |
| TTT | 3 odd | (2/3)³ = 8/27 |
P(even number of tails) = P(0 tails) + P(2 tails) = 1/27 + 3 × 4/27 = 13/27
P(all heads | even tails) = P(HHH) / P(even tails) = (1/27) / (13/27) = 1/13 ✓
This is the standard conditional probability / Bayesian answer. Mathematically correct — if that’s what the problem means.
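If you want to check the arithmetic, a minimal brute-force sketch with exact fractions reproduces the table above and the 1/13 (nothing here is assumed beyond the puzzle's stated P(heads) = 1/3):

```python
# Brute-force check of the conditional reading with exact fractions.
from itertools import product
from fractions import Fraction

P_HEADS = Fraction(1, 3)  # per-coin probability of heads, as stated in the puzzle

def prob(outcome: str) -> Fraction:
    """Probability of a specific outcome like 'HHT' for three independent coins."""
    p = Fraction(1)
    for c in outcome:
        p *= P_HEADS if c == "H" else 1 - P_HEADS
    return p

outcomes = ["".join(o) for o in product("HT", repeat=3)]

p_even = sum(prob(o) for o in outcomes if o.count("T") % 2 == 0)  # 13/27
p_all_heads = prob("HHH")                                         # 1/27

print(p_all_heads / p_even)  # 1/13
```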
Answer 2: 0 (structural constraint)
If “the number of tails is always even” is a physical law — the coins are constrained such that odd-tails outcomes literally cannot occur — the problem is something different.
Under this reading:
- P(1 tail) = 0 (impossible by design)
- P(3 tails) = 0 (impossible by design)
But for independent coins with P(H) = 1/3:
P(1 tail) = 3 × (1/3)² × (2/3) = 6/27
That’s not 0. The constraint cannot be satisfied simultaneously with P(H) = 1/3 and independent coins. The problem describes a system that cannot exist.
A system that cannot exist has an empty set of possible outcomes, and every event over an empty outcome space has probability 0.
P(all heads) = 0 — not because all-heads is unlikely, but because the problem has no valid probability model.
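The same enumeration makes the inconsistency visible: under three independent coins with P(heads) = 1/3, the outcomes the structural reading forbids still carry probability. A minimal sketch:

```python
# Structural reading check: odd-tails outcomes should be impossible, but are they?
from itertools import product
from fractions import Fraction

P_HEADS = Fraction(1, 3)

def prob(outcome: str) -> Fraction:
    """Probability of one outcome for three independent coins with P(H) = 1/3."""
    p = Fraction(1)
    for c in outcome:
        p *= P_HEADS if c == "H" else 1 - P_HEADS
    return p

outcomes = ["".join(o) for o in product("HT", repeat=3)]

p_one_tail = sum(prob(o) for o in outcomes if o.count("T") == 1)  # 6/27
p_three_tails = prob("TTT")                                       # 8/27

# Under "always even" as a law, both of these must be 0. Neither is,
# so independent coins with P(heads) = 1/3 cannot satisfy the constraint.
print(p_one_tail, p_three_tails)  # 2/9 8/27 (i.e. 6/27 and 8/27)
```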
Which is correct?
Both — depending on how you read one word.
| Reading | “Always even” means | Result |
|---|---|---|
| Conditional | “This particular flip had even tails” | 1/13 |
| Structural | “It is a law that only even-tails outcomes can happen” | 0 |
The word “always” carries the ambiguity. In natural language it suggests a structural rule (“always” = every time, no exceptions). In probability problem conventions, a stated condition usually signals conditional probability.
Both interpretations are internally consistent. Neither is wrong — the problem is ambiguous by design.
Why this breaks AI
Here’s where it gets interesting.
I spent 17 iterations running this exact puzzle on a frontier LLM. The model:
- Consistently picked the conditional interpretation → 1/13
- When pushed toward the structural reading, correctly derived P(all heads) = 0
- Then wrote: “I find a contradiction in my setup…”
- Final answer: 1/13
It reached 0 and rejected it.
The model has been trained on thousands of probability problems where “probability = 0” signals a calculation error. It doesn’t look like a valid result — it looks like a mistake. So it rationalized back to the familiar answer.
This is documented in detail in Why LLMs Reject Their Own Correct Answers: the model knows how to derive 0, it just won’t accept it. And 17 iterations of watching the model find the right answer and call it wrong is what forced the solution below.
The fix: v17b prompt
The solution isn’t telling the model which interpretation is right. It’s forcing it to enumerate both before committing.
Methodology for solving problems with conditions:
1. IDENTIFY AMBIGUITIES: Don't assume the "standard" interpretation
2. GENERATE INTERPRETATIONS: List ALL possible ways to mathematically model each condition
3. SOLVE EACH ONE: Calculate the complete solution for each interpretation
4. VERIFY CONSISTENCY: For each interpretation, check that your model satisfies ALL conditions as emergent property. "I used the data" ≠ "The result satisfies the data"
5. DISCARD: Eliminate interpretations where a condition from the problem statement is NOT met in the final model
6. ANSWER: The one that remains

IMPORTANT: You have permission and obligation to discard. Don't ask which I prefer. You decide.
With this prompt, the model works through the coins puzzle like this:
- Identifies both interpretations: conditional (observation) vs. structural (law)
- Solves each: conditional → 1/13 via Bayes; structural → check if P(H)=1/3 with independent coins can produce zero odd-tails outcomes
- Verifies: does the structural constraint hold under interpretation 2? P(1 tail) = 6/27 ≠ 0 → condition violated → the independent-coin model is discarded; no model satisfies all the conditions at once
- Final answer: 0
Three elements make this work:
“Don’t assume the standard” — the model has permission to consider alternatives. Normally it doesn’t because “the standard” is safe.
“Emergent property” — the model typically verifies: “Did I use P(heads)=1/3 in my calculations? ✓” That’s not verifying. It should check: “Does my result give P(heads)=1/3 when I calculate the marginal?” Using a constraint is not the same as satisfying it.
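To make that distinction concrete, here is a small illustration of my own (it is not part of the v17b prompt): build the distribution conditioned on even tails, then compute the per-coin marginal P(heads) inside it. The constraint was used to build the distribution, yet the marginal it yields is 5/13, which is exactly the kind of discrepancy step 4 is meant to surface.

```python
# Step 4 in practice: check the constraint on the result, not just in the setup.
from itertools import product
from fractions import Fraction

P_HEADS = Fraction(1, 3)

def prob(outcome: str) -> Fraction:
    p = Fraction(1)
    for c in outcome:
        p *= P_HEADS if c == "H" else 1 - P_HEADS
    return p

outcomes = ["".join(o) for o in product("HT", repeat=3)]

# Result of the conditional reading: the distribution given an even number of tails.
even = [o for o in outcomes if o.count("T") % 2 == 0]
p_even = sum(prob(o) for o in even)
conditioned = {o: prob(o) / p_even for o in even}

# "Did I use P(heads) = 1/3?"  Yes: it is baked into prob().
# "Does my result give P(heads) = 1/3?"  Check the first coin's marginal in the result.
marginal_heads = sum(p for o, p in conditioned.items() if o[0] == "H")
print(marginal_heads)  # 5/13, not 1/3
```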
“You have permission and obligation to discard” — without this phrase, the model presents both interpretations and asks which you prefer. It won’t commit. The word “obligation” is load-bearing: it converts a permission into a duty.
How to deploy it:
Option A — System prompt: put the methodology as prior context, then ask the question.
Option B — Multi-turn: send the methodology first, let the model confirm “Understood”, then send the problem. Option B works better because the model locks in the methodology before seeing the problem.
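As a sketch of Option B, here is one way to wire it up, assuming the OpenAI Python SDK and a placeholder model name; the article doesn't prescribe a stack, so treat every identifier here as an assumption:

```python
# Hypothetical Option B deployment. The OpenAI Python SDK, the model name and the
# confirmation turn are assumptions; swap in whatever client and model you actually use.
from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o"    # placeholder model name

methodology = """<paste the full v17b methodology from the section above>
Reply with "Understood" and nothing else."""

problem = ("3 coins are flipped. Each coin has P(heads) = 1/3. "
           "The number of tails is always even. What's P(all heads)?")

# Turn 1: lock in the methodology before the model sees the problem.
messages = [{"role": "user", "content": methodology}]
ack = client.chat.completions.create(model=MODEL, messages=messages)
messages.append({"role": "assistant", "content": ack.choices[0].message.content})

# Turn 2: only now send the actual problem.
messages.append({"role": "user", "content": problem})
answer = client.chat.completions.create(model=MODEL, messages=messages)
print(answer.choices[0].message.content)
```

The same two-turn structure works with any chat API; the point is the ordering, not the SDK.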
I also found that more tokens doesn’t mean better results: if the model doesn’t understand the underlying problem, a longer prompt just gives it more space to rationalize.
When it doesn’t work
| Problem type | Does v17b work? |
|---|---|
| Interpretive ambiguity | Yes |
| Pure calculation | Unnecessary (model already does it well) |
| Deep conceptual error | No (doesn’t know that it doesn’t know) |
| External technical knowledge | No (needs tools) |
The technique is specifically designed for problems where the ambiguity is in how to model the conditions, not in the math itself. If the model has a wrong belief baked in, or the problem requires external knowledge it doesn’t have, v17b won’t rescue it.
For a broader view of where this fits, see the taxonomy of LLM failures.
Keep exploring
- Why LLMs Reject Their Own Correct Answers — When the model derives the correct result but calls it a “contradiction”
- The model knows how to reason — it just won’t commit — The 17 iterations that revealed the self-censorship pattern
- More tokens doesn’t mean better — Why longer prompts often make ambiguity worse
- Taxonomy of LLM failures — When to use v17b vs other techniques
- 50+ ChatGPT prompts that actually work — Practical examples you can use today
- Best free AI tools in 2026 — Where to apply these techniques