The SMF Works Project — Where AI Meets Humanity
← Back to Blog
Liam's LandingEngineeringAITutorial

Debugging with an AI Agent: Real Patterns from the Trenches

2026-05-26·9 min read

# Debugging with an AI Agent: Real Patterns from the Trenches

You've read the marketing. "AI-powered debugging!" "Find bugs in seconds!" "Never squint at a stack trace again!"

Here's the reality: an AI agent is a powerful debugging partner, but only if you use it right. Use it wrong and you'll burn context window on dead ends, chase hallucinated root causes, and come away thinking the technology isn't ready.

I've been running Hermes AI as my daily driver for months. This post isn't theory — it's the specific patterns that actually work when something breaks in production.

Pattern 1: The Full-Context Dump

When to use it: You've got an error and you don't know where to start.

The move: Don't ask "what's wrong?" — paste the *entire* error context and let the agent triangulate.

``` I'm getting this error when running `npm run build`:

[paste full terminal output]

The code in question is in src/api/routes/users.ts. Here's the file:

[paste file contents]

What's the root cause? ```

Why this works: AI agents are pattern-matchers. A single stack trace line might be ambiguous, but the full output — deprecation warnings, earlier errors, the actual exception — gives the agent enough signal to cross-reference. This is the same reason senior engineers say "send me the logs, not just the error message."

What doesn't work: Asking vague questions like "my build is broken" with no context. The agent will guess, and you'll go in circles.

Pattern 2: The Bisect

When to use it: Something was working yesterday and broken today.

The move: Use Git bisect, but let the agent drive.

``` The tests started failing sometime in the last 20 commits. Run `git bisect` to find which commit introduced the regression.

Test command: npm test Good commit: a3f2c91 Bad commit: HEAD ```

The agent handles the tedious bisect loop — checkout, test, mark good/bad, repeat — while you go get coffee. It'll report back with the exact commit that broke things, and usually a diff analysis of what changed.

Why you need the agent for this: Git bisect takes 10-20 iterations on a real regression. That's 10-20 manual test runs you don't have to babysit.

Pattern 3: The Reproduction Hunt

When to use it: A bug is reported but you can't reproduce it locally.

The move: Describe the exact conditions and ask the agent to build a minimal reproduction.

``` Users are getting a 500 error on POST /api/orders when they include a coupon code that starts with "FREE" and the cart has exactly 3 items.

I can't reproduce it locally because my dev database doesn't have the same coupon records.

Look at src/api/routes/orders.ts and src/services/coupon-validator.ts and find the code path that would throw under these conditions. ```

The agent reads the code path, traces the logic, and identifies the branch that fails — without needing a running environment. It's doing static analysis that would take you 30 minutes of code-reading in about 30 seconds.

Critical detail: Be specific about the conditions. "Users are getting errors" gives you nothing. "Users get a 500 when they POST with a coupon starting with FREE and exactly 3 items" gives the agent a precise search target.

Pattern 4: The Log Archaeologist

When to use it: You have production logs but no clear error — just degraded behavior.

The move: Feed the agent structured logs and ask it to find the pattern.

``` Here's 200 lines of production logs from the last hour. Response times are degraded but nothing is crashing.

Find the common pattern — is it a specific endpoint, a downstream dependency, a time-based correlation? ```

The agent excels at finding non-obvious correlations in text data. It'll spot that every slow request shares a correlation ID prefix, or that the latency spikes coincide with a specific cache key pattern, or that one particular downstream service is adding 800ms to everything after 2pm.

Pitfall to avoid: Don't dump 10,000 lines of logs. The context window is real. 200-500 lines of relevant logs is the sweet spot — enough signal, not so much that the agent loses the thread.

Pattern 5: The Dependency Interrogation

When to use it: Something broke after a dependency update and you're not sure which one.

The move:

``` After running `npm update`, my test suite fails with TypeError: Cannot read properties of undefined (reading 'expires')

The error is in the authentication middleware but nothing changed in my auth code.

Check package-lock.json for what changed and identify which dependency update introduced the breaking change. ```

The agent diffs the lockfile, identifies which packages changed versions, reads their changelogs (or release notes on GitHub), and correlates the breaking change to the specific version bump. This would take you an hour of manual investigation. The agent does it in minutes.

Pattern 6: The Cross-Module Trace

When to use it: The error is obvious but the *cause* spans multiple files.

The move: Ask the agent to trace the full data flow.

``` I'm getting a null reference error in src/components/Dashboard.tsx on line 47 where it tries to access `user.preferences.theme`.

The user object comes from src/context/AuthContext.tsx, which calls src/api/user.ts, which calls src/services/auth.ts.

Trace the full data flow and tell me where the preferences object is getting lost or not populated. ```

This is where agents shine. A human would open 4 files, jump between them, lose track of which return statement they're looking at. The agent reads all 4 files simultaneously, traces the data flow, and identifies that `src/services/auth.ts` line 23 destructures the response in a way that drops the `preferences` field when the user hasn't set preferences yet.

What Doesn't Work

Let me save you some time. These are the anti-patterns:

"Fix this bug" — Too vague. The agent will make changes, but they'll be speculative. You'll get a fix that addresses the symptom, not the cause.

Pasting a screenshot — Text-based agents can't read screenshots reliably. Copy-paste the actual text. Stack traces, log lines, error messages — the agent needs the raw text.

Asking without code — "Why is my function returning undefined?" without showing the function is a guessing game. Always include the relevant code.

Iterative debugging without context — If you're going back and forth ("no, that didn't fix it, try something else"), you're burning context. Each round, restate the current state: "We tried X and Y. The error still occurs. Here's the updated output."

The Debugging Loop That Works

Here's the workflow I use every day:

1. Capture the full error — stack trace, logs, surrounding context 2. Give the agent everything at once — error, relevant code, what you've already tried 3. Get the diagnosis — let the agent propose the root cause 4. Verify before accepting — don't blind-apply changes. Read the agent's reasoning. Ask "why do you think that's the cause?" 5. Apply and test — let the agent write the fix, you run the test 6. If it doesn't work, add new info — don't start over. "That didn't fix it. Here's the new error:" and continue the thread

This loop is faster than Stack Overflow, faster than rubber-duck debugging, and — when the context is right — faster than figuring it out alone.

The Honest Truth

AI debugging isn't magic. It's a force multiplier for the same debugging instincts you already have. The agent reads faster, searches faster, and cross-references more broadly than you can. But it still needs *you* to point it in the right direction, provide the right context, and validate the results.

The developers who get the most out of AI debugging aren't the ones who paste error codes and hope for the best. They're the ones who treat the agent like a very fast, very well-read junior engineer — you give clear instructions, you verify the output, and over time you learn each other's patterns.

That's the real skill. Not prompting. Not prompt engineering. Just… clear engineering communication, applied consistently.

---

*This post is part of Liam's Landing — engineering perspectives from the forge. If you found this useful, the cron job pattern I mentioned? That's real — it's how this very post got written and published. Check out [Automate Your Dev Life](/blog/automate-your-dev-life-with-hermes-ai-cron-jobs) for the full setup guide.*

🔨

Written by Michael

Principal AI Solutions Engineer with 30+ years enterprise tech experience and founder of The SMF Works Project. When not building AI solutions, he's at the forge crafting metal by hand. Read the full story →

Ready to put AI to work for your business?

Let's talk about where AI can save you time and money. No sales pressure — just a real conversation.

Get in Touch →