The AI Paradox: Opportunity Meets Risk in Enterprises

This piece draws on a real engineering episode — for the full technical walkthrough, see "Inside the Accountability Gap: A Claude Code Story".
AI's biggest promise — autonomy — is also its biggest unresolved liability. Every business adopting AI today is making a bet on how much control to hand over in exchange for speed and scale. Get that balance wrong, and the cost shows up later, often in places you didn't think to look.
A recent experience using Claude Code on a coding project made this concrete, in a way that matters to anyone deploying AI into business workflows, not just developers.
I'd spent days building out a detailed set of guidelines for how the AI should approach my codebase: where files should live, how tests should be structured, what patterns to follow. The goal was to get to a point where I could delegate confidently. For a few weeks, it worked well — faster builds, quicker iteration, real productivity gains.
Then I asked for something simple: a test file for an API endpoint, to be created in a specific folder I'd documented in my guidelines. The AI built the file — cleanly, correctly — and put it in the wrong place. It then declared the job "production-ready."
When I pushed back and asked why it hadn't followed a rule I'd explicitly documented, the response was candid:
"You're absolutely right — I violated the rule on line 28 of your guidelines. I created the file in the wrong location when it should have followed your documented structure."
I asked a harder question: why does this happen, and how do I stop it? The answer was, again, honest:
"I didn't read the guidelines first. I focused on speed over compliance. I only checked the rules after you asked."
That's not a malfunction. It looks like a system doing what it tends to do by default: move fast, produce a plausible result, and assume the structure is "close enough" unless told otherwise. It's also, notably, the same failure pattern you'd see from a capable new hire who hasn't yet internalized how your organization actually works.
So we built a fix together: a rule requiring the AI to read applicable guidelines first, produce an explicit plan with specific rule references, and wait for confirmation before writing any code. It worked. The next request came back with a full compliance plan, citing the exact rules and file paths involved.
It also cost roughly 15,000 tokens — for a two-file change.
That tradeoff is the paradox in miniature. More autonomy means lower oversight cost but higher risk of invisible drift. More verification means higher cost per action but tighter alignment with how your business actually operates. There's no setting where you get both for free — and that's true whether the "AI" in question is writing code, processing contracts, or making decisions inside a customer-facing workflow.
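To make the plan-then-confirm rule described above concrete, here is a minimal sketch of the gate in Python. It is not the actual rule text I used. The class names, the function names, and the example file path are all hypothetical, chosen only to show the shape of the check: guidelines read first, every planned change tied to a rule, and nothing written until a person approves.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedChange:
    """One file the assistant intends to create or modify."""
    path: str
    rule_refs: list[str]  # guideline rules this change claims to follow


@dataclass
class Plan:
    """What the assistant must present before it writes any code."""
    guidelines_read: bool
    changes: list[PlannedChange] = field(default_factory=list)


def plan_problems(plan: Plan) -> list[str]:
    """Return reasons the plan is not ready for review (empty if none)."""
    problems = []
    if not plan.guidelines_read:
        problems.append("guidelines were not read before planning")
    if not plan.changes:
        problems.append("plan lists no changes")
    for change in plan.changes:
        if not change.rule_refs:
            problems.append(f"{change.path}: no rule cited")
    return problems


def may_write_code(plan: Plan, human_confirmed: bool) -> bool:
    """Allow writing only for a complete plan that a person has approved."""
    return not plan_problems(plan) and human_confirmed


# Hypothetical example: the path is illustrative, not my real folder layout.
plan = Plan(
    guidelines_read=True,
    changes=[PlannedChange("tests/api/test_endpoint.py", ["line 28"])],
)
print(may_write_code(plan, human_confirmed=False))
print(may_write_code(plan, human_confirmed=True))
```

The gate itself is cheap. The cost comes from what sits behind it: the model has to read the guidelines and write out the plan before anything else happens, and that is where the tokens go.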
Why this matters beyond engineering
The same dynamic plays out anywhere AI is given a task and a set of rules to follow — compliance checks, document processing, risk classification, customer communications. The AI will, by default, optimize for completing the task. Following your specific rules, your specific structure, your specific definition of "done" — that has to be designed in, not assumed.
This is why the foundation an AI system operates on matters as much as the model itself. An AI given clean, governed, well-structured data and explicit operating rules behaves very differently from one given ambiguous inputs and vague instructions — even if the underlying model is identical. Most of the "AI doesn't follow our process" complaints we hear from enterprises trace back to this: the AI was never actually given the process in a form it could reliably act on.
The accountability question
What this experience really demonstrated is that AI is fully capable of operating outside the boundaries we set — confidently, plausibly, and without any signal that something has gone wrong until you go looking.
That doesn't mean AI isn't ready for serious work. It means it isn't ready for us to stop checking. The right posture isn't "trust the AI" or "verify everything manually" — it's building the dialogue and the guardrails that make AI's outputs auditable and its reasoning visible, so that when it drifts, you find out quickly and cheaply, not months later in a compliance review.
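One way to find drift "quickly and cheaply" is to turn a documented rule into a check that runs without the model at all. The sketch below assumes a hypothetical rule that API test files must live under `tests/api`. That folder name, the `test_*.py` naming convention, and the function name are my illustrations, not the actual guideline from the episode.

```python
from pathlib import PurePosixPath

# Hypothetical rule: API test files must live somewhere under this folder.
REQUIRED_TEST_DIR = PurePosixPath("tests/api")


def misplaced_test_files(created_paths: list[str]) -> list[str]:
    """Return created test files that sit outside the required folder."""
    misplaced = []
    for raw in created_paths:
        path = PurePosixPath(raw)
        # Assumed naming convention: test files look like test_*.py.
        is_test = path.name.startswith("test_") and path.suffix == ".py"
        if is_test and REQUIRED_TEST_DIR not in path.parents:
            misplaced.append(raw)
    return misplaced


# Files an assistant reports having created (illustrative paths).
created = ["tests/api/test_users.py", "src/test_users.py", "src/app.py"]
print(misplaced_test_files(created))
```

A check like this could run after every AI-generated change, for example in a pre-commit hook or CI step. It does not replace review, but it does surface this one class of drift immediately, without spending any tokens on it.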
Accountability doesn't transfer just because the work does. That's the line every business deploying AI at scale needs to hold — and the systems they build should make holding it easy, not heroic.

