Intent Engineering: What Klarna's $60 Million Mistake Teaches Us About AI Leadership
Klarna saved $60M with AI, then rehired humans. Google defines 5 autonomy levels for AI agents. Both miss the same thing.

In February 2025, Klarna CEO Sebastian Siemiatkowski told Bloomberg: "I believe AI can already do all the jobs that we as humans do." Three months later, in May 2025, the same CEO told the same outlet: "Cost was unfortunately too dominant a factor in our evaluation. The result is lower quality" (1).
Between those two statements: 2,000 laid-off employees, 2.3 million automated customer conversations per month, resolution time cut from 11 minutes to 2 minutes — and a company now hiring humans again. At the same time, IBM published a survey of 2,000 CEOs: only one in four AI projects delivers the promised return on investment (2).
This is not a story about bad technology. It is a story about a missing layer in how organizations deploy AI. And that layer has a name.
The Five Levels Google Defines
In November 2025, Google Cloud published a 54-page white paper titled "Introduction to Agents" (3). The document establishes a taxonomy for AI agent systems in five levels:
- Level 0: an isolated language model with no external tools or memory.
- Level 1: a connected problem-solver that can access external tools.
- Level 2: a strategic problem-solver that can plan and execute complex goals.
- Level 3: a collaborative multi-agent system in which specialized agents work together.
- Level 4: a self-evolving system that autonomously creates new tools and agents.
Google describes three core components of every agent: the model (the "brain"), the tools (the "hands"), and the orchestration layer (the "nervous system") (3). That is the technical architecture. And that is precisely where the problem lies.
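Those three components map naturally onto a control loop. A minimal sketch, in which every name (the stubbed model, the TOOLS table, the refund tool, orchestrate) is an illustrative assumption and not anything from Google's paper:

```python
# Minimal sketch of the three agent components (hypothetical names, not a
# Google Cloud API): the model decides, the tools act, and the
# orchestration layer runs the loop between them.

def model(state):
    # The "brain": decides the next action from the current state.
    # Stubbed here; a real agent would call an LLM.
    if any(key.startswith("refund_issued") for key in state):
        return ("finish", None)
    return ("issue_refund", {"order_id": state["order_id"]})

TOOLS = {
    # The "hands": callable side effects the agent may invoke.
    "issue_refund": lambda args: f"refund_issued:{args['order_id']}",
}

def orchestrate(state, max_steps=5):
    # The "nervous system": loops model -> tool -> updated state.
    for _ in range(max_steps):
        action, args = model(state)
        if action == "finish":
            return state
        state[TOOLS[action](args)] = True
    return state

result = orchestrate({"order_id": "A-123"})
```

Note what the loop contains: context (the state), tools, and orchestration. Nowhere in it is there a definition of what a good outcome is.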
What's Missing Between Context and Outcome
Google describes WHAT an agent should know (context) and HOW it should act (tools, orchestration). What is missing is WHY it acts.
Klarna illustrates this blind spot perfectly. The AI chatbot had context (customer inquiries, product data, FAQ databases). It had tools (ticketing system, payment processing, refunds). It had orchestration (workflow rules, escalation paths). What it did not have: any understanding of whether the goal was "close tickets fast" or "build lasting customer relationships."
A human service representative with five years of experience knows this difference intuitively. They have absorbed the organization's real values through hundreds of conversations, through feedback, through observing which behaviors get rewarded and which do not. The AI agent had none of that. It had metrics — and metrics are not intent.
The result: Klarna's chatbot resolved issues in 2 minutes instead of 11. And customers complained about generic, robotic responses. The metric looked good. The outcome was poor. CEO Siemiatkowski had to admit: "Really investing in the quality of human support is the way of the future for us" (1).
Intent Engineering: The Missing Layer
The term "intent engineering" describes a layer that neither Google's taxonomy nor Klarna's implementation addresses: the machine-readable formulation of organizational intent.
Context engineering tells an agent what it should know. Prompt engineering tells it how to respond. Intent engineering tells it what it should want. What its values are. Which trade-offs it may make and which it may not. Where the hard limits are. Which signals mean "satisfied customer" and which mean "closed fast, resolved wrong."
Without intent engineering, every agent optimizes for the easiest available metric. At Klarna, that was cost. At most organizations, it is speed, volume, or error rate. These metrics are measurable. They are also dangerous, because they do not reflect the organization's actual intent.
Google's own "AI Agent Trends 2026" report describes the shift from "instruction-based computing" to "intent-based computing" (4). What the report does not answer: how an organization formulates its intent when it has never explicitly clarified that intent itself.
Why Organizations Don't Clarify Their Intent
This is where it gets psychological.
In 1996, Steven Hayes, the founder of Acceptance and Commitment Therapy (ACT), defined a phenomenon he called "experiential avoidance": the attempt to alter the form, frequency, or intensity of unwanted internal experiences such as thoughts, feelings, and bodily sensations, even when that avoidance behavior is costly, ineffective, or unnecessary (5).
Bond and Flaxman showed in 2006, in a longitudinal study of 448 call-center employees, that psychological flexibility (the opposite of experiential avoidance) predicts job performance, the ability to learn new systems, and mental health at work (6). Bond, Flaxman, and Bunce replicated the finding in 2008 in an intervention study: employees with higher psychological flexibility benefited more from organizational change, showed better mental health, and had lower absenteeism (7).
The pattern transfers directly to the C-suite. A CEO who deploys AI without clarifying organizational intent is typically avoiding an uncomfortable question: "What is the actual purpose of this department, beyond the metrics we've measured so far?"
That question is uncomfortable because it produces answers that are more expensive than cost reduction. "Building lasting customer relationships" costs more than "closing tickets fast." "Empowering employees to lead AI as a tool" takes longer than "replacing employees with AI."
Avoiding this question is experiential avoidance at the organizational level. And it has a measurable price.
The Measurable Cost of Avoidance
In a 2019 prospective study of 7,985 employees followed over two years, Fløvik, Knardahl, and Christensen examined the psychological effects of organizational change. The results: downsizing increased the risk of clinically relevant psychological distress by 51% (OR 1.51), and repeated organizational changes increased it by 84% (OR 1.84), measured with the HSCL-10 at a cut-off of 1.85 for clinically relevant distress (8).
AI deployment is the fastest, most invasive, and most frequently repeated organizational change the working world has ever seen. Backhaus and colleagues confirmed in 2024 that the introduction of new technologies produces industry-specific psychological effects. In manufacturing, fear of job loss dominates. In service sectors, competence anxiety and technostress dominate (9).
The good news: Archer and colleagues showed in 2024, in an organization-wide field study, that ACT-based training increases psychological flexibility, reduces burnout, and enhances stress resilience. The most exhausted employees benefited most (10).
The evidence is clear: Organizational change without psychological support produces clinical diagnoses. And AI deployment without formulated intent is the sharpest form of that change.
What Formulated Intent Changes
Imagine Klarna's chatbot had, instead of a resolution-time metric, a document defining: "If a customer hangs up frustrated, we have failed — regardless of how quickly the ticket was closed." One clear sentence. No room for interpretation. The system now knows that speed is not an end in itself.
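That one sentence can be made machine-checkable. A minimal sketch, with all field names as illustrative assumptions: the intent rule is evaluated before the efficiency metric, so a fast resolution cannot mask a failed interaction.

```python
# Sketch of an intent check that overrides a speed metric (hypothetical
# fields): a 2-minute resolution still counts as failure if the customer
# left frustrated.

def evaluate(ticket):
    # The intent rule comes first: a frustrated customer means failure,
    # regardless of resolution time.
    if ticket["customer_frustrated"]:
        return "failed"
    # Only then does the efficiency metric matter at all.
    return "success" if ticket["resolution_minutes"] <= 11 else "review"

fast_but_frustrated = {"resolution_minutes": 2, "customer_frustrated": True}
slow_but_resolved = {"resolution_minutes": 11, "customer_frustrated": False}
```

The design choice is the ordering: intent is a constraint, not one more weighted metric that speed can outvote.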
In modern agent systems, a convention for this is emerging: some developers keep a file, often called a "SOUL.md," that defines an AI system's values, purpose, and boundaries. It contains no prompts and no technical instructions. It contains what a human employee learns in five years of tenure through osmosis: What do we stand for? Which trade-offs do we accept and which do we not? What matters more when efficiency and quality come into conflict?
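What such a file might contain, as an illustrative sketch (the structure and wording here are assumptions, not a standard):

```markdown
# SOUL.md (customer service, illustrative sketch)

## What we stand for
We build lasting customer relationships. A closed ticket is a means,
not the goal.

## Hard limits
- If a customer hangs up frustrated, we have failed, regardless of how
  quickly the ticket was closed.
- Never trade a correct answer for a faster one.

## Trade-offs we accept
- Longer resolution time in exchange for a resolved root cause.

## Trade-offs we do not accept
- Volume over relationship. Speed over trust.
```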
Imagine every agent also had a functional profile: a finance agent prioritizes risk and directness. An HR agent prioritizes empathy and relationship. Not as personality simulation, but as a definition that determines which information is prioritized and which trade-offs are acceptable. No agent decides alone. Each produces structured decision briefs. The human decides.
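The decision-brief idea can be sketched in a few lines, with every name and every fact below an illustrative assumption: each profile reorders the same facts by its own priorities and produces a brief; none of the agents decides.

```python
# Sketch of functional agent profiles producing decision briefs
# (hypothetical names and example data): each profile ranks the same
# facts by its own priorities; the human, not the agent, decides.
from dataclasses import dataclass

@dataclass
class Profile:
    name: str
    priorities: tuple  # tags this profile surfaces first

def brief(profile, facts):
    # facts: list of (tag, statement) pairs. The profile reorders;
    # it does not filter and it does not decide.
    ranked = sorted(
        facts,
        key=lambda fact: profile.priorities.index(fact[0])
        if fact[0] in profile.priorities else len(profile.priorities),
    )
    return {"agent": profile.name,
            "for_human_decision": [statement for _, statement in ranked]}

finance = Profile("finance", ("risk", "cost"))
hr = Profile("hr", ("relationship", "risk"))
facts = [("cost", "Chatbot saves $10M/year"),
         ("risk", "Churn rose 4% in pilot"),
         ("relationship", "NPS dropped among long-term customers")]
```

Given the same three facts, the finance profile surfaces the churn risk first and the HR profile surfaces the relationship damage first; the full list always reaches the human.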
And here lies the real problem. Writing a SOUL.md requires someone who knows the organization's intent. Who writes that document in most companies? The IT department. Who knows the organization's intent? At best, the C-suite. And the C-suite usually doesn't know it explicitly enough to write it down.
That is the gap. IT deploys agents with technically sound architecture — Google's levels 1 through 4, perfectly executed. But IT has no access to what belongs in the SOUL.md. And the C-suite doesn't know it needs a SOUL.md. The result: agents without intent. Klarna's result.
Applied to the organization: an AI strategy based on avoidance (cut costs, without clarifying what for) produces Klarna's result. An AI strategy based on values clarification (formulate what the purpose is, and align the AI to it) produces systems that carry the organization's intent.
The Practical Question for You
Google's taxonomy gives you the technical levels. The research on experiential avoidance gives you the psychological explanation. Klarna's story gives you the warning.
The question that remains: who in your organization writes the SOUL.md?
If the answer is "nobody" or "IT," your AI agents are optimizing for what is measurable. And what is measurable is rarely what counts.
Bond, Hayes, and Barnes-Holmes put it this way in 2006: psychological flexibility in an organizational context means staying in contact with one's own values even when the situation produces uncomfortable feelings, and aligning behavior with those values rather than with the avoidance of discomfort (12).
The technology for intent engineering exists. The psychological research explaining why organizations don't clarify their intent has existed for 30 years. What is missing is the willingness to ask the uncomfortable question: What do we stand for when cost and quality come into conflict?
Further Reading
- Work Slop: Why Your Employees Are Producing AI Garbage — and What It Reveals About Your Leadership
- The Trust Paradox: Why Your Employees Install AI Tools But Never Use Them
- AI Readiness Program
Sources:
1. Fortune (2025). As Klarna flips from AI-first to hiring people again, a new landmark survey reveals most AI projects fail to deliver. Fortune, May 2025. https://fortune.com/2025/05/09/klarna-ai-humans-return-on-investment/
2. IBM (2025). IBM CEO Survey: Only 1 in 4 AI projects delivers promised ROI. Cited in Fortune, May 2025. https://fortune.com/2025/05/09/klarna-ai-humans-return-on-investment/
3. Google Cloud AI Team (2025). Introduction to Agents. 54-page white paper, November 2025. Authors: Alan Blount, Antonio Gulli, Shubham Saboo, Michael Zimmermann, Vladimir Vuskovic. https://www.hkdca.com/wp-content/uploads/2025/11/introduction-to-agents-google.pdf
4. Google Cloud (2026). AI Agent Trends 2026 Report. January 2026. https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/ai-business-trends-report-2026/
5. Hayes, S. C., Wilson, K. G., Gifford, E. V., Follette, V. M. & Strosahl, K. (1996). Experiential Avoidance and Behavioral Disorders: A Functional Dimensional Approach to Diagnosis and Treatment. Journal of Consulting and Clinical Psychology, 64(6), 1152–1168. https://doi.org/10.1037/0022-006X.64.6.1152
6. Bond, F. W. & Flaxman, P. E. (2006). The Ability of Psychological Flexibility and Job Control to Predict Learning, Job Performance, and Mental Health. Journal of Organizational Behavior Management, 26(1–2), 113–130. https://www.researchgate.net/publication/228652600
7. Bond, F. W., Flaxman, P. E. & Bunce, D. (2008). The Influence of Psychological Flexibility on Work Redesign: Mediated Moderation of a Work Reorganization Intervention. Journal of Applied Psychology, 93(3), 645–654. https://doi.org/10.1037/0021-9010.93.3.645
8. Fløvik, L., Knardahl, S. & Christensen, J. O. (2019). Organizational Change and Employee Mental Health: A Prospective Multilevel Study of the Associations Between Organizational Changes and Clinically Relevant Mental Distress. Scandinavian Journal of Work, Environment & Health, 45(2), 134–145. https://doi.org/10.5271/sjweh.3777
9. Backhaus, I., Lohmann-Haislah, A., Burr, H., Nielsen, K., di Tecco, C. & Dragano, N. (2024). Organizational Change: Challenges for Workplace Psychosocial Risks and Employee Mental Health. BMC Public Health, 24, 2477. https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-024-19815-w
10. Archer, R., Lewis, R., Yarker, J., Zernerova, L. & Flaxman, P. E. (2024). Increasing Workforce Psychological Flexibility Through Organization-Wide Training: Influence on Stress Resilience, Job Burnout, and Performance. Journal of Contextual Behavioral Science, 33, 100796. https://www.sciencedirect.com/science/article/pii/S2212144724000796
11. Hayes, S. C., Luoma, J. B., Bond, F. W., Masuda, A. & Lillis, J. (2006). Acceptance and Commitment Therapy: Model, Processes and Outcomes. Behaviour Research and Therapy, 44(1), 1–25. https://doi.org/10.1016/j.brat.2005.06.006
12. Bond, F. W., Hayes, S. C. & Barnes-Holmes, D. (2006). Psychological Flexibility, ACT, and Organizational Behavior. Journal of Organizational Behavior Management, 26(1–2), 25–54. https://www.researchgate.net/publication/27224123