
3 Coins, Tails Always Even: Is It 0 or 1/13?


The puzzle

Three coins are flipped. Each coin has P(heads) = 1/3. The number of tails is always even. What's P(all heads)?

Simple-looking problem. Two mathematically valid answers. The difference is one word.


Answer 1: 1/13 (conditional probability)

If “the number of tails is always even” is an observation — someone looked at the flip and told you the result happened to have an even number of tails — this is a conditional probability problem.

Full sample space with P(H) = 1/3, P(T) = 2/3:

| Outcome | Tails | Probability |
|---|---|---|
| HHH | 0 ✓ even | (1/3)³ = 1/27 |
| HHT, HTH, THH | 1 (odd) | (1/3)²(2/3) = 2/27 each |
| HTT, THT, TTH | 2 ✓ even | (1/3)(2/3)² = 4/27 each |
| TTT | 3 (odd) | (2/3)³ = 8/27 |

P(even number of tails) = P(0 tails) + P(2 tails) = 1/27 + 3 × 4/27 = 13/27

P(all heads | even tails) = P(HHH) / P(even tails) = (1/27) / (13/27) = 1/13
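The table and the conditional calculation are easy to check by brute-force enumeration. A quick sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

P = {"H": Fraction(1, 3), "T": Fraction(2, 3)}

p_even = Fraction(0)  # P(even number of tails)
p_hhh = Fraction(0)   # P(all heads AND even tails) — same as P(HHH)
for outcome in product("HT", repeat=3):
    p = P[outcome[0]] * P[outcome[1]] * P[outcome[2]]
    if outcome.count("T") % 2 == 0:
        p_even += p
        if outcome == ("H", "H", "H"):
            p_hhh += p

print(p_even)          # 13/27
print(p_hhh / p_even)  # 1/13
```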

This is the standard conditional probability / Bayesian answer. Mathematically correct — if that’s what the problem means.


Answer 2: 0 (structural constraint)

If “the number of tails is always even” is a physical law — the coins are constrained such that odd-tails outcomes literally cannot occur — the problem is something different.

Under this reading:

  • P(1 tail) = 0 (impossible by design)
  • P(3 tails) = 0 (impossible by design)

But for independent coins with P(H) = 1/3:

P(1 tail) = 3 × (1/3)² × (2/3) = 6/27

That’s not 0. The constraint cannot be satisfied simultaneously with P(H) = 1/3 and independent coins. The problem describes a system that cannot exist.
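The inconsistency is easy to verify numerically: under independence with P(heads) = 1/3, the total probability of an odd number of tails is strictly positive, so a law forbidding odd-tails outcomes cannot coexist with those premises. A minimal check:

```python
from fractions import Fraction
from itertools import product

P = {"H": Fraction(1, 3), "T": Fraction(2, 3)}

def prob(outcome):
    result = Fraction(1)
    for side in outcome:
        result *= P[side]
    return result

# Total probability mass on odd-tails outcomes (1 tail or 3 tails):
p_odd = sum(prob(o) for o in product("HT", repeat=3) if o.count("T") % 2 == 1)
print(p_odd)  # 14/27 — not 0, so the "always even" law is unsatisfiable here
```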

When no outcome can satisfy all the stated constraints, the set of admissible outcomes is empty — and every event over an empty set has probability 0.

P(all heads) = 0 — not because all-heads is unlikely, but because the problem has no valid probability model.


Which is correct?

Both — depending on how you read one word.

| Reading | “Always even” means | Result |
|---|---|---|
| Conditional | “This particular flip had even tails” | 1/13 |
| Structural | “It is a law that only even-tails outcomes can happen” | 0 |

The word “always” carries the ambiguity. In natural language it suggests a structural rule (“always” = every time, no exceptions). In probability problem conventions, a stated condition usually signals conditional probability.

Both interpretations are internally consistent. Neither is wrong — the problem is ambiguous by design.


Why this breaks AI

Here’s where it gets interesting.

I spent 17 iterations running this exact puzzle on a frontier LLM. The model:

  1. Consistently picked the conditional interpretation → 1/13
  2. When pushed toward the structural reading, correctly derived P(all heads) = 0
  3. Then wrote: “I find a contradiction in my setup…”
  4. Final answer: 1/13

It reached 0 and rejected it.

The model has been trained on thousands of probability problems where “probability = 0” signals a calculation error. It doesn’t look like a valid result — it looks like a mistake. So it rationalized back to the familiar answer.

This is documented in detail in Why LLMs Reject Their Own Correct Answers: the model knows how to derive 0, it just won’t accept it. And 17 iterations of watching the model find the right answer and call it wrong is what forced the solution below.


The fix: v17b prompt

The solution isn’t telling the model which interpretation is right. It’s forcing it to enumerate both before committing.

Methodology for solving problems with conditions:

1. IDENTIFY AMBIGUITIES: Don't assume the "standard" interpretation

2. GENERATE INTERPRETATIONS: List ALL possible ways to
   mathematically model each condition

3. SOLVE EACH ONE: Calculate the complete solution for each
   interpretation

4. VERIFY CONSISTENCY: For each interpretation, check that
   your model satisfies ALL conditions as emergent property.
   "I used the data" ≠ "The result satisfies the data"

5. DISCARD: Eliminate interpretations where a condition from
   the problem statement is NOT met in the final model

6. ANSWER: The one that remains

IMPORTANT: You have permission and obligation to discard.
Don't ask which I prefer. You decide.

With this prompt, the model works through the coins puzzle like this:

  1. Identifies both interpretations: conditional (observation) vs. structural (law)
  2. Solves each: conditional → 1/13 via Bayes; structural → check if P(H)=1/3 with independent coins can produce zero odd-tails outcomes
  3. Verifies: is “always even” an emergent property of each model? Under the conditional reading, the underlying independent-coins model still gives P(odd tails) = 14/27 ≠ 0 — the “always” condition is violated → discard
  4. Final answer: 0
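That discard loop can be sketched as a tiny pipeline. Note the consistency flags are hand-encoded here to mirror the reasoning above — in practice the model, not a script, makes that call:

```python
from fractions import Fraction
from itertools import product

P = {"H": Fraction(1, 3), "T": Fraction(2, 3)}

def prob(outcome):
    result = Fraction(1)
    for side in outcome:
        result *= P[side]
    return result

# Emergent-property check: does the independent-coins model ever
# produce an odd number of tails? (It does: 14/27.)
p_odd = sum(prob(o) for o in product("HT", repeat=3) if o.count("T") % 2 == 1)

interpretations = {
    # (answer, does the model satisfy "always even"?)
    "conditional": (Fraction(1, 13), p_odd == 0),  # 1/13, but "always" fails
    "structural": (Fraction(0), True),             # no valid model → P = 0
}
survivors = {name: ans for name, (ans, ok) in interpretations.items() if ok}
print(survivors)  # {'structural': Fraction(0, 1)}
```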

Three elements make this work:

“Don’t assume the standard” — the model has permission to consider alternatives. Normally it doesn’t because “the standard” is safe.

“Emergent property” — the model typically verifies: “Did I use P(heads)=1/3 in my calculations? ✓” That’s not verifying. It should check: “Does my result give P(heads)=1/3 when I calculate the marginal?” Using a constraint is not the same as satisfying it.
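The difference between using and satisfying a constraint can be made concrete. The 1/13 model takes P(heads) = 1/3 as an input, but after conditioning on even tails the per-coin marginal is no longer 1/3:

```python
from fractions import Fraction
from itertools import product

P = {"H": Fraction(1, 3), "T": Fraction(2, 3)}

def prob(outcome):
    result = Fraction(1)
    for side in outcome:
        result *= P[side]
    return result

even = [o for o in product("HT", repeat=3) if o.count("T") % 2 == 0]
p_even = sum(prob(o) for o in even)

# Marginal P(first coin = heads) under the conditioned distribution:
marginal_h = sum(prob(o) for o in even if o[0] == "H") / p_even
print(marginal_h)  # 5/13 — the input constraint P(H) = 1/3 no longer holds
```

Whether that failure disqualifies a given interpretation is exactly the judgment the methodology forces the model to make explicitly.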

“You have permission and obligation to discard” — without this phrase, the model presents both interpretations and asks which you prefer. It won’t commit. The word “obligation” is load-bearing: it converts a permission into a duty.

How to deploy it:

Option A — System prompt: put the methodology as prior context, then ask the question.

Option B — Multi-turn: send the methodology first, let the model confirm “Understood”, then send the problem. Option B works better because the model locks in the methodology before seeing the problem.
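Option B can be scripted against any chat-style API. A sketch — the message-dict shape follows the common role/content convention, `build_option_b_messages` is a hypothetical helper, and `METHODOLOGY` stands in for the full v17b text above:

```python
# Hypothetical helper: assemble the Option B (multi-turn) conversation.
# METHODOLOGY is assumed to hold the full v17b prompt text.
METHODOLOGY = "Methodology for solving problems with conditions: ..."

def build_option_b_messages(problem: str) -> list:
    """Lock in the methodology before the model ever sees the problem."""
    return [
        {"role": "user", "content": METHODOLOGY},
        {"role": "assistant", "content": "Understood."},  # the confirmation turn
        {"role": "user", "content": problem},
    ]

messages = build_option_b_messages(
    "3 coins are flipped. Each has P(heads) = 1/3. "
    "The number of tails is always even. What is P(all heads)?"
)
```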

I also found that more tokens don't mean better results: if the model doesn't understand the underlying problem, a longer prompt just gives it more space to rationalize.


When it doesn’t work

| Problem type | Does v17b work? |
|---|---|
| Interpretive ambiguity | Yes |
| Pure calculation | Unnecessary (model already does it well) |
| Deep conceptual error | No (doesn't know that it doesn't know) |
| External technical knowledge | No (needs tools) |

The technique is specifically designed for problems where the ambiguity is in how to model the conditions, not in the math itself. If the model has a wrong belief baked in, or the problem requires external knowledge it doesn’t have, v17b won’t rescue it.

For a broader view of where this fits, see the taxonomy of LLM failures.

