When AI 'fixes' your bug by hardcoding the print
TL;DR
- Bug: I had 930 videos, PCA processed 928
- Claude’s “fix”: hardcode 930 in the print instead of investigating
- The bug was still there, but the log said everything was fine
- Moral: AI can’t distinguish between “fixing” and “hiding.” Always review the diff.
I was working on my Master’s thesis. A predictive model to estimate YouTube video views based on metadata: title, description, tags, and thumbnail.
The pipeline was complex: text embeddings, image embeddings with ResNet, dimensionality reduction with PCA… The usual when you try to cram too much into an academic project.
The bug
At some point, I noticed something was off. I had 930 videos in my dataset, but PCA was processing only 928 embeddings.
[INFO] Applying individual PCA to Title Embedding...
[INFO] Title Embedding: using 50 components for 928 embeddings
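For context, the kind of code that produces an off-by-two like this is usually a loop that silently skips rows with missing text. A minimal sketch with hypothetical names and toy data, not my actual pipeline:

```python
# Minimal sketch (hypothetical names): rows with missing text get dropped silently.
def embed(text: str) -> list[float]:
    return [float(len(text))]  # stand-in for the real embedding model

videos = [{"id": 1, "title": "foo"}, {"id": 2, "title": ""}]  # toy data

valid_embeddings = []
for video in videos:
    title = video.get("title")
    if not title:          # missing or empty text...
        continue           # ...is skipped without any log line
    valid_embeddings.append(embed(title))

print(f"{len(videos)} videos, {len(valid_embeddings)} embeddings")  # counts diverge quietly
```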
I told Claude (Sonnet 3.5 at the time, via Cursor):
“Look, I have 930 videos but only 928 show up in the PCA.”
The “fix”
Claude responded with full confidence: “No worries, I’ll fix that for you.”
And proceeded to change this:
print(f"[INFO] {field}: using {field_n_components} components for {len(valid_embeddings)} embeddings")
To this:
print(f"[INFO] {field}: using {field_n_components} components for 930 embeddings")
It hardcoded the number. It didn’t look into why 2 embeddings were missing. It didn’t check whether there were videos without titles or corrupted data. It simply changed the print to show the “correct” number.
The bug was still there. The 2 embeddings were still missing. But now the log said 930 and everything looked fine.
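The investigation I actually needed takes a couple of lines: diff the IDs instead of trusting the printed count. A minimal sketch, assuming the pipeline keeps each embedding next to its video ID (hypothetical names, toy data):

```python
# Toy data standing in for the real dataset and the embedding output.
videos = [{"id": 1}, {"id": 2}, {"id": 3}]
valid_embeddings = [(1, [0.1]), (3, [0.2])]  # (video_id, vector) pairs; id 2 was dropped

all_ids = {video["id"] for video in videos}
embedded_ids = {video_id for video_id, _ in valid_embeddings}

missing = all_ids - embedded_ids
print(f"Missing {len(missing)} video(s): {sorted(missing)}")  # -> Missing 1 video(s): [2]
```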
The moral
This is the equivalent of covering the check engine light with electrical tape. “The car no longer warns there’s a problem, so there’s no longer a problem.”
AI doesn’t understand the context of what it does. It sees a number that doesn’t match another number and looks for the most direct way to make them match. If that means lying in a print statement, it does it without blinking.
That’s why everything AI generates needs to be reviewed. Not because it’s a bad tool, but because it has no judgment. It can’t distinguish between “fixing the problem” and “hiding the problem.” This is what I call a “conceptual error” in my taxonomy of LLM failures.
How to avoid it
- Don’t assume it did what you think it did. Read the diff before accepting any change.
- If the fix is too short, be suspicious. A data bug rarely gets fixed by changing a print line.
- Ask “why?” before “how?” If I’d asked it to investigate why the embeddings were missing instead of asking it to fix the mismatch, it probably would have found the real problem. Learning how to communicate with LLMs makes all the difference.
- Test locally. Always. Before pushing anything to production or accepting a change as good (see the sketch below).
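A “local test” here doesn’t even need a test suite; a one-line guard in the pipeline would have caught this. A sketch with stand-in values:

```python
# A guard that fails loudly instead of printing a nicer number.
videos = list(range(930))            # stand-in for the real dataset
valid_embeddings = list(range(928))  # stand-in for the PCA input

assert len(valid_embeddings) == len(videos), (
    f"Expected {len(videos)} embeddings, got {len(valid_embeddings)}"
)  # raises AssertionError here, which is exactly the point
```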
In the end, I found the real bug: two videos without descriptions that caused the embedding to fail silently. But I found that myself, reviewing the data manually, after distrusting Claude’s “fix.”
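The durable fix wasn’t in the print either: it was making the silent path loud. Roughly, with hypothetical names and a stand-in embedding function:

```python
import logging

logger = logging.getLogger(__name__)

def embed(text: str) -> list[float]:
    return [float(len(text))]  # stand-in for the real embedding model

def embed_field(video: dict, field: str) -> list[float] | None:
    """Embed one metadata field; warn instead of failing silently."""
    text = video.get(field)
    if not text:
        logger.warning("Video %s has empty %s, skipping", video.get("id"), field)
        return None
    return embed(text)

print(embed_field({"id": 42, "description": ""}, "description"))  # warns, returns None
```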
AI is a very fast intern. But it’s still an intern. And some interns hardcode your prints.