Thinking · AI · Process Design

Designing with AI

A framework for human-led, agent-assisted product design.
On process architecture, genuine checkpoints, drift, and what AI augmentation actually demands.

Design Operations · IKEA / Ingka Group · 2026 · 18 min read

Preface

The question dominating design teams right now is the wrong one. "Which AI tool should we use?" is not a design question; it's a procurement question. Answering it before the more fundamental question produces what many teams are already experiencing: impressive demos, inconsistent outputs, and workflows that are harder to audit than the ones they replaced.

The more interesting question is this: if you were designing a workflow from scratch, knowing that AI agents would be part of it, how would you structure the process? Where would the humans be? What would they actually do? What would agents handle? And how would you know if it was working?

That is the question this framework tries to answer. It grew out of real work inside IKEA's digital experience design team, and was extended afterward into something that can operate independently of any specific toolset. What follows is the full argument — including the parts that failed before they worked.

Process design

The reframe

Most teams treat AI as a feature — something bolted onto an existing workflow. The Figma plugin that generates copy variants. The ChatGPT tab that summarises research notes. The assistant that writes the first draft. There is nothing wrong with any of these uses. But they share a common limitation: they inherit whatever structural problems already exist in the workflow.

If the process is ad hoc, AI makes it faster and more ad hoc. If decisions are poorly documented, AI-generated documentation will be confidently wrong. If the team lacks shared understanding of what they're trying to achieve, AI will optimise for something — but not necessarily for that.

Adding intelligence to an unclear process produces something worse than a slow, unclear process. It produces a fast, unclear one that looks credible.

The reframe: treat AI the same way you'd treat any other systems design problem. Define the inputs. Define the outputs. Define the handoffs. Define who decides what and when. Design the workflow first, then design AI into it.

This shifts the question from "what can AI do?" to "where does AI fit in this process, and how do we know it's working?" The second question is harder. It is also the one worth asking.

Systems architecture

The architecture

The framework organises AI-assisted design work into three tiers of agents operating across five phases.

The three tiers

Tier 1 — the Orchestrator sits at the top. Its job is to manage context across the entire project: maintaining the project record, summarising phase outputs, coordinating handoffs between phases, and escalating to human decision-makers when something is genuinely ambiguous or consequential. The Orchestrator does not make design decisions. It manages the conditions under which design decisions are made.

Tier 2 — Phase Agents each own a single phase: Discovery, Shaping, Betting, Building, or Retrospective. They receive inputs from the Orchestrator, coordinate the Task Agents within their phase, and produce structured outputs for human review at the phase exit.

Tier 3 — Task Agents are narrow specialists. A constraints analysis agent. A user needs synthesis agent. A decision logging agent. A scope monitoring agent. Each does one thing within its phase and produces a structured output that feeds directly into the Phase Agent's synthesis. Their narrowness is a feature, not a limitation — narrow scope means clear accountability and containable failure.

The five phases

The phases build on the Shape Up methodology's core rhythm of shaping, betting, and building, bracketed by discovery and a retrospective: discovery frames the problem, shaping produces a pitch, betting commits the scope, building executes it, and the retrospective evolves the framework itself. Each phase has explicit exit criteria: conditions that must be met before the framework advances.

This matters because it prevents the most common failure in AI-assisted workflows: phases that technically complete but don't actually resolve anything. Exit criteria force resolution. They prevent the appearance of progress masking the absence of it.
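
One way to picture exit criteria as a hard condition rather than a status label is the minimal sketch below. The phase names come from the framework; the `Phase` class, the `next_phase` function, and the example criteria are hypothetical illustrations, not part of the original tooling.

```python
from dataclasses import dataclass, field
from typing import Optional

# Phase order as described in the framework.
PHASES = ["Discovery", "Shaping", "Betting", "Building", "Retrospective"]


@dataclass
class Phase:
    name: str
    # Maps each exit criterion (hypothetical wording) to whether it has been met.
    exit_criteria: dict[str, bool] = field(default_factory=dict)

    def unmet(self) -> list[str]:
        """Return the criteria that still block this phase from closing."""
        return [criterion for criterion, met in self.exit_criteria.items() if not met]


def next_phase(current: Phase) -> Optional[str]:
    """Return the following phase name, or raise if any exit criterion is unmet."""
    blockers = current.unmet()
    if blockers:
        raise RuntimeError(f"{current.name} cannot exit; unmet criteria: {blockers}")
    index = PHASES.index(current.name)
    # The last phase has no successor within the same cycle.
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


# Usage: a phase that "technically completed" still cannot advance.
discovery = Phase(
    "Discovery",
    {"problem statement agreed": True, "deferred decisions resolved": False},
)
print(discovery.unmet())
```

The point of the sketch is that advancing is a function of the criteria, not of the calendar: a phase with an unresolved item raises instead of quietly moving on.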

Why the structure matters

The three-tier hierarchy is not bureaucracy. It is containment. An agent that operates with too broad a mandate will drift. An agent that receives poorly scoped inputs will hallucinate plausibly. The hierarchy creates the conditions under which each agent can be reliable: clear scope, well-formed inputs, structured outputs, and a human review layer at every meaningful transition.

The AI is a layer. The process is the design.

Trust & governance

Human-in-the-loop, for real

"Human-in-the-loop" has become a phrase that means almost nothing. It describes everything from a pop-up confirmation dialog to a genuine decision authority structure. Most AI-assisted workflows claim human oversight and deliver a rubber stamp.

The framework takes a different position: human checkpoints must be hard gates — moments where the workflow literally cannot advance without a named human decision. Not a notification. Not a summary email. A decision, with a record of who made it and why.

What a genuine checkpoint looks like

Every meaningful phase transition requires a Confirmation Block — a structured AI output that captures the agent's current understanding of the problem, the assumptions it's carrying forward, and the decisions that remain open. The human reviewer's job is not to approve this as a courtesy. It is to read it adversarially: what is wrong here? What is missing? What has shifted?

Confirmation Block · Discovery Phase · Awaiting human review

Problem statement: Users in the onboarding flow abandon at the document upload step due to perceived complexity and uncertainty about data handling. Primary signal: drop-off analytics + usability session recordings (n=12).
Assumptions carried forward: Primary persona: first-time users, low technical confidence. Success metric: abandonment reduction at upload step. Scope: native mobile only. Desktop parity not in this cycle.
Decisions deferred: Whether to redesign the upload UI or add contextual guidance only. Requires human resolution before Shaping phase begins.
Open questions: Legal sign-off on revised data handling copy. Dependency on backend team for file validation timeline.

If the human reads this and finds nothing to correct, they should be suspicious. Either the agent is accurately capturing a well-structured phase output, or the reviewer is not reading carefully enough. Both are possible. Develop the habit of assuming the second.
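
As a rough sketch of how a Confirmation Block and a hard gate could be represented, assuming the four fields shown in the example above: the class names, the `pass_gate` function, and the rule that open deferred decisions block the gate are assumptions made for illustration, not a description of the team's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfirmationBlock:
    phase: str
    problem_statement: str
    assumptions: list[str]
    decisions_deferred: list[str]
    open_questions: list[str]


@dataclass
class HumanDecision:
    reviewer: str   # a named person, not a team alias
    approved: bool
    rationale: str  # the "why" that goes on record


def pass_gate(block: ConfirmationBlock, decision: Optional[HumanDecision]) -> bool:
    """Return True only when a named human has approved with a recorded reason."""
    if decision is None:
        return False  # a notification or summary email is not a decision
    if not decision.reviewer.strip() or not decision.rationale.strip():
        return False  # an anonymous or unexplained sign-off is a rubber stamp
    if block.decisions_deferred:
        return False  # assumption: deferred decisions must be resolved first
    return decision.approved
```

Under these assumptions, the workflow has no path forward that skips the reviewer: the absence of a decision, a missing name, or a missing rationale all leave the gate closed.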

The drift problem

Drift is the failure mode that gets the least attention, because it doesn't look like failure. It looks like progress.

Drift is the gradual divergence between the AI's working model and the team's actual intent. It's different from a single error — which is visible and correctable. Drift is cumulative: small misalignments that compound across phases until the gap between "what we're building" and "what we said we were building" becomes significant.

It shows up as framing drift (the problem statement shifts subtly), persona drift (the target user becomes different from the research), or priority drift (trade-offs resolved early quietly get re-weighted). By the time it's visible, several phases of work may be built on the wrong foundation.

The defences against drift are structural: a project context document updated and reviewed at each phase transition; original intent literally re-read at every phase gate, from a document, not from memory; and confirmation blocks that explicitly surface the agent's current problem framing for human comparison.
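
The comparison step can be supported by something as simple as the sketch below, which lists every field where the agent's current framing no longer matches the recorded intent. The function name and field names are hypothetical, and the check only surfaces wording changes; deciding whether a change is drift or a legitimate update remains a human call.

```python
def framing_changes(
    original: dict[str, str], current: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Return each field whose wording differs from the original intent.

    Case and whitespace differences are ignored; everything else is
    reported as a (before, after) pair for a human to compare.
    """
    changes = {}
    for key in sorted(original.keys() | current.keys()):
        before = " ".join(original.get(key, "").split()).lower()
        after = " ".join(current.get(key, "").split()).lower()
        if before != after:
            changes[key] = (original.get(key, ""), current.get(key, ""))
    return changes


# Usage: the original intent is read from the project document, not from memory.
original_intent = {
    "persona": "First-time users, low technical confidence",
    "scope": "Native mobile only",
}
agent_framing = {
    "persona": "first-time users, low technical confidence",
    "scope": "Mobile and desktop",
}
for field_name, (before, after) in framing_changes(original_intent, agent_framing).items():
    print(f"{field_name}: '{before}' -> '{after}'")
```

A textual check like this will miss paraphrased drift, which is why it complements the human re-reading rather than replacing it.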

The most important defence is cultural: a team environment where questioning the AI's framing is expected and rewarded, not treated as friction.

Context as a design discipline

Working with language models in a multi-phase workflow means accepting a real constraint: agents don't have memory in the way human team members do. Each operates within a context window — a bounded space of information it can actively process. When that window fills up, context is summarised, compressed, or lost.

This is not a bug. It is a characteristic of the technology, and designing around it is a genuine skill. The framework addresses it through four mechanisms:

Context Initialisation — a living project document maintained across all phases, re-injected at each phase start
Incremental Summarisation — structured compression of phase outputs that preserves decisions and open questions
Selective Injection — task-specific context packages that give each agent only what it needs for its scope
Context Auditing — a human-led review of context accuracy at each phase boundary
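
Selective Injection, in particular, lends itself to a small illustration. The sketch below assumes the project context is stored as named sections and that each Task Agent declares which sections it needs; the section names and the `context_package` function are hypothetical.

```python
def context_package(project_context: dict[str, str], needed: list[str]) -> dict[str, str]:
    """Build a task-specific context package containing only the requested sections."""
    missing = [name for name in needed if name not in project_context]
    if missing:
        # Fail loudly rather than let an agent run on incomplete context.
        raise KeyError(f"project context has no section(s): {missing}")
    return {name: project_context[name] for name in needed}


# Usage: a hypothetical project record and a scope monitoring agent's needs.
project_context = {
    "problem_statement": "Users abandon onboarding at the document upload step.",
    "persona": "First-time users, low technical confidence.",
    "scope": "Native mobile only. Desktop parity not in this cycle.",
    "open_questions": "Legal sign-off on revised data handling copy.",
}
scope_agent_context = context_package(project_context, ["problem_statement", "scope"])
print(sorted(scope_agent_context))
```

Raising on a missing section is a deliberate choice in this sketch: an agent that silently receives less context than it asked for is one of the ways plausible but wrong output gets produced.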

The skill of context architecture — knowing what information to carry forward, how to compress it without losing signal, when the context has drifted from reality — is one of the most underrated skills in AI-augmented design work. It is also one of the most teachable.

Leadership & practice

Facilitation

When AI is embedded in a design workflow, the facilitator's job doesn't get easier. It gets different.

The most visible change: the facilitator is no longer the primary processor of group information. AI handles much of the transcription, tagging, and preliminary synthesis. This frees attention — but it also creates new demands. The facilitator must now monitor not only the room but the system, tracking what the AI is doing with the group's input and preparing participants to engage with those outputs critically rather than passively.

There is a specific risk here. When AI summarises a two-hour workshop into a crisp set of insight clusters, participants tend to accept that summary as authoritative — because it looks organised and confident. The facilitator must actively resist this: prompting the group to interrogate the synthesis, identify what was lost in compression, and surface disagreements the system may have smoothed over.

The facilitator must monitor not only the room but the system — and when the two contradict each other, the room wins.

What doesn't change

Holding purpose — AI can track topics but cannot hold intent. The facilitator owns why the session is happening.
Reading the room — no model can sense when a team is losing confidence, when a voice is being systematically deferred to, or when the group is about to commit to something they don't actually understand.
Judgment calls — when the AI's synthesis contradicts the facilitator's read of the room, the facilitator's judgment takes precedence. Always.

The norm worth protecting

The most important condition for AI-augmented facilitation is one that cannot be designed into the system: the normalisation of skepticism toward AI output.

In teams that work well with AI, questioning the system's output is not a sign of distrust or inefficiency. It is a sign of good professional judgment. "I don't think this captures what we actually said" should be a celebrated contribution, not an awkward one.

Building this norm requires explicit leadership. The facilitator must model skepticism — asking critical questions about AI synthesis in front of the group, praising pushback, and making it clear that the AI is a collaborator with a defined and bounded role, not a source of truth.

Skills & calibration

What this demands

AI augmentation doesn't simplify the designer's role. It changes it.

What decreases: time spent on certain production work — first-draft generation, structured synthesis, repetitive pattern application. These shift substantially to agents. What increases: time and attention required for direction-setting, critical evaluation, integration across phases, and the judgment calls that sit at the intersection of user need, business constraint, and design quality. What stays the same: the responsibility.

AI assistance does not transfer accountability. It changes the form of the work required to fulfil it.

The skills that design education hasn't caught up with

Prompt craft — setting direction for an agent with enough specificity to guide useful output, enough openness to allow genuine contribution. This is closer to brief-writing than specification. The skill is not in writing a longer, more detailed prompt. It is in writing a precise one.
Output evaluation — assessing AI output critically and quickly: what is right, what is wrong, what needs investigation. This requires both domain knowledge and a clear mental model of what the work is supposed to achieve.
Context architecture — deciding what information must carry forward across phases, how it should be compressed, and what can be safely left behind. A new kind of information design that senior designers are well-positioned to develop.
Drift detection — noticing when the AI's framing has shifted subtly from original intent. This requires keeping original intent visible and referring to it regularly — from a document, not from memory.
Calibrated trust — knowing when AI output is likely reliable and when it requires deeper scrutiny. Calibration develops with practice and should be made explicit across the team.

The framework as a living practice

A framework that is treated as finished stops improving. The retrospective phase is specifically designed to prevent this: at the end of each cycle, the team reviews not just the design work but the framework itself. What worked. What generated overhead without value. What needs to change.

The practices described here are the result of real experimentation — what has worked, what has failed, and what has been learned. They represent a current state, not a final one. Every team that adopts this framework will encounter situations it doesn't fully address. The right response is to adapt and document, not to follow mechanically.

To close

The framework is an attempt to make human judgment more effective, not to replace it. The AI layer exists to serve the process. The process exists to serve the design. The design exists to serve the people who use it.

That chain of accountability is the point of all of this. Not the agents, not the workflow, not the tooling. The chain.
