Part One

The Missing Argument

The question every agentic development framework answers second — and should answer first

Chapter 1

What the Agentic Discourse Gets Wrong

Every framework for agentic development — BMAD, Attractor, the twelve-factor agents approach, the RPI workflow — shares a silent assumption. It assumes you already know what to build. It assumes the domain is understood, the concepts are named, the boundaries are clear. It goes straight to the question of how to execute, without asking whether you have the clarity to execute correctly.

The Confident Wrong Answer

Here is the failure mode nobody talks about. An organisation adopts an agentic development framework. The agents are capable. The tooling works. The process runs smoothly. Specifications are written, stories are generated, code is produced at impressive velocity. And then — three months in — the business realises that the system being built reflects the development team's understanding of the domain, not the business's actual domain. The customer object in the ordering system is not the same concept as the customer object in the billing system. The agent built both — confidently, consistently, precisely — based on a model that was never challenged.

This is not a prompt engineering failure. It is not a framework failure. It is a domain modelling failure that happened upstream of the first prompt, before the first spec was written, before the first story was generated. The agent amplified a misunderstanding rather than a correct understanding. Speed made it worse, not better.

The Core Problem

Every agent framework, every harness engineering approach, every context engineering discipline — strip away the names and the goal is identical: give the LLM precise context, and structure the process precisely enough that the agent does not have to fill gaps with its own judgment. DDD is the discipline for achieving that precision at the domain level before you write a single prompt or spec. It is the missing first step in every agentic transformation discussion. Not because the frameworks are wrong — they're not. Because they start one step too late.

The Sales Order Customer and the Complaint Ticket Customer

Consider a business with a sales team and a customer service team. Both use the word "customer." In the sales context, a customer is a person or organisation who has signed a contract and is potentially due for renewal. In the customer service context, a customer is any person who has raised a support ticket — which may include contacts at a client organisation who were never part of the sales relationship at all.

Same word. Two different things. If your specification doesn't distinguish them, your agent will pick one interpretation and proceed confidently in the wrong direction. It will build a customer service system that queries the sales database, leaving the team to wonder why it can't find the end users who are calling in. Or it will build a sales renewal system that surfaces support contacts as renewal targets and generates nonsense outreach.

That is not an agent problem. That is a domain problem. The agent did exactly what it was told. It was told the wrong thing — not because anyone lied, but because nobody had done the work of making the distinction explicit before the specification was written.
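One way to make the distinction explicit is to give each context its own type, so the two meanings of "customer" cannot be mixed silently. The sketch below is illustrative only: the names SalesCustomer, ServiceCustomer, and renewal_targets, and every field on them, are assumptions made for this example rather than part of any framework discussed here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SalesCustomer:
    """Sales context: a person or organisation that has signed a contract."""
    account_id: str
    legal_name: str
    contract_renewal: date


@dataclass(frozen=True)
class ServiceCustomer:
    """Customer service context: anyone who has raised a support ticket."""
    contact_id: str
    display_name: str
    # Optional link to a sales account; many support contacts have none.
    account_id: Optional[str] = None


def renewal_targets(customers: list, before: date) -> list:
    """Return the sales customers whose contract renews before the given date."""
    for customer in customers:
        # Refuse service-context contacts instead of guessing what they mean here.
        if not isinstance(customer, SalesCustomer):
            raise TypeError(f"{type(customer).__name__} is not a sales-context customer")
    return [c for c in customers if c.contract_renewal < before]
```

With the types separated, handing a support contact to the renewal logic fails loudly at the boundary instead of producing nonsense outreach later.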

What the Discourse Assumes You Have Already Done

Read any agentic development framework documentation and you will find the same implicit starting point: a clear problem to solve, a defined scope, a shared vocabulary, an understanding of what the system should do. BMAD's Brief session assumes the human can articulate the project clearly enough for the Analyst agent to structure it. Attractor's NLSpec assumes someone can write a specification that is complete enough for an agent to work from without guessing.

These assumptions are reasonable — if the prerequisite work has been done. The problem is that most organisations approach agentic development without having done it. They have years of accumulated misunderstanding about their own domain, encoded in legacy systems that nobody fully comprehends, expressed in terminology that means different things to different teams. They hand this confusion to a capable agent framework and expect it to produce clarity. It produces confident confusion instead.

Next: the four disciplines as a deliberate sequence — and what breaks when you skip any step.

Chapter 2

The Sequence, Not the Menu — Why Order Matters

DDD, Event Storming, BMAD, and Attractor are not four tools you pick from based on preference. They are four layers of a single discipline, each building on the one before it. Understanding them as a sequence rather than a menu changes how you use each one — and makes clear why so many agentic transformation efforts stall at the same predictable points.

The Four Layers

LAYER 1

Domain-Driven Design — The Conceptual Foundation

Establishes the vocabulary, the boundaries, and the model of the domain. Answers the question: what does this business actually consist of, and how do the parts relate? Without this, every downstream layer operates on an unexamined model that may or may not reflect reality.

LAYER 2

Event Storming — The Discovery Method

The practical workshop technique for surfacing the domain model from the people who know it. Translates DDD's concepts from theory into a specific business's reality. Without this, the domain model is the architect's assumption rather than the business's actual knowledge.

LAYER 3

BMAD — Structured Agentic Execution

The framework for turning a well-understood domain into working software through a structured multi-agent workflow with human oversight. Without the domain clarity from Layers 1 and 2, BMAD's artefact chain produces precise documents about an imprecise model.

LAYER 4

Attractor — The Lights-Out Factory

Spec-driven development at full autonomy. Only viable when the specification is so precise and complete that an agent can work from it without guessing. That precision is the output of Layers 1, 2, and 3 working in sequence. Without them, the factory produces at speed what nobody fully wanted.

What Breaks When You Skip a Layer

Skip Layer 1 and Layer 2, and jump straight to BMAD. The Brief session produces a project brief that reflects the human's unexamined mental model. The PRD encodes assumptions nobody challenged. The Architecture locks in boundaries that don't reflect how the business actually works. The resulting code is internally consistent and externally wrong. The framework worked perfectly. The input was wrong.

Skip Layer 1 and Layer 2, and jump to Attractor. The NLSpec is written with the same unexamined vocabulary that caused the problem in the first place. The same word appears in three sections meaning three different things. The agent picks one interpretation and implements it consistently at speed. The factory produces the wrong system very efficiently.

Skip Layer 2 only — go from DDD concepts directly to BMAD without the Event Storming discovery session. The DDD model is an architect's hypothesis rather than a model grounded in the actual business. It may be conceptually elegant and practically wrong — elegant because it was designed in the abstract, wrong because the real business has edge cases and exceptions and historical decisions that only surface when the people who live in the domain are in the room.

The Sequence Is the Architecture

Each layer produces something the next layer needs. DDD produces the conceptual framework. Event Storming produces the grounded domain model. BMAD produces the structured specification and artefact chain. Attractor produces the working software. Each layer's output is the next layer's prerequisite. This is not a stylistic preference; it is a dependency graph. You cannot produce the output of Layer 3 without the input from Layer 2, and you cannot produce that without the foundation from Layer 1.
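The dependency-graph claim can be written down literally. The following is a minimal sketch using Python's standard graphlib module; the LAYER_DEPENDENCIES mapping and both helper functions are hypothetical names introduced for illustration.

```python
from graphlib import TopologicalSorter

# Each layer maps to the set of layers whose output it needs directly.
LAYER_DEPENDENCIES = {
    "DDD": set(),
    "Event Storming": {"DDD"},
    "BMAD": {"Event Storming"},
    "Attractor": {"BMAD"},
}


def execution_order() -> list:
    """Return the layers in an order that respects every prerequisite."""
    return list(TopologicalSorter(LAYER_DEPENDENCIES).static_order())


def missing_prerequisites(layer: str, completed: set) -> set:
    """Return every direct or transitive prerequisite of `layer` not yet completed."""
    missing = set()
    stack = list(LAYER_DEPENDENCIES[layer])
    while stack:
        dependency = stack.pop()
        if dependency not in completed:
            missing.add(dependency)
        # Keep walking so that skipped foundations further down are reported too.
        stack.extend(LAYER_DEPENDENCIES[dependency])
    return missing
```

Asking for the missing prerequisites of "BMAD" with nothing completed returns both "DDD" and "Event Storming", which is the skip-two-layers scenario described earlier in this chapter.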

Harness Engineering — Not Spec-Driven Development

The term "Spec-Driven Development" has gained traction in the agentic development community as a label for the practice of writing specifications before implementation. It is not wrong, but it is incomplete — and in some uses it is actively misleading. Writing a specification is one component of a harness. It is not the harness.

A harness is the full collection of specifications, domain context, quality checks, workflow guidance, and transition conditions that controls the agent's how loop. The domain-ctx.txt is harness. The BMAD artefact chain is harness. The NLSpec is harness. The CLAUDE.md is harness. The holdout scenario suite is harness. "Spec-Driven Development" names one input to the harness and treats it as the whole. Harness Engineering names the complete activity — building, maintaining, and improving everything that makes the agent's how loop reliable.

The distinction matters because it changes what you do when output quality falls short. In the SDD frame, the instinct is to improve the specification. Sometimes that is right. But the specification may be fine — the gap may be in the quality checks, or the transition conditions, or the domain context. Harness Engineering asks: which component of the harness failed? SDD only asks: is the spec good enough?
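The "which component failed?" question can be made mechanical. Here is a hedged sketch in which each harness component carries its own check over a record of one agent run. The component names mirror the list above, but the run-record fields (spec_ambiguities, undefined_terms, and so on) are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HarnessComponent:
    name: str
    # Returns True when the component did its job for the run being reviewed.
    check: Callable[[dict], bool]


def failed_components(harness: list, run: dict) -> list:
    """Return the names of the harness components whose check fails for this run."""
    return [component.name for component in harness if not component.check(run)]


# Hypothetical checks over a hypothetical run record.
HARNESS = [
    HarnessComponent("specification", lambda run: run["spec_ambiguities"] == 0),
    HarnessComponent("domain context", lambda run: run["undefined_terms"] == 0),
    HarnessComponent("quality checks", lambda run: run["tests_executed"] > 0),
    HarnessComponent("transition conditions", lambda run: run["phase_gates_defined"]),
]
```

For a run with no specification ambiguities but two undefined domain terms, failed_components reports only "domain context". Rewriting the specification would not have helped.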

The Progression Is Not "Trust the Agent More" — It Is "Build the Harness Better"

This is the observation that changes how you think about the five-level maturity framework — and it is one that the current discourse has not articulated clearly.

The common framing of the progression from Level 2 to Level 5 is increasing agent autonomy: you start by controlling every step and gradually step back as you trust the agent more. That framing is wrong in a way that matters. It makes the progression feel like a leap of faith. It implies that moving to Level 4 requires trusting the agent with things you used to verify yourself. Enterprise architects and risk-conscious engineering leaders hear this framing and stop. Rightly so.

The correct framing is harness maturity. Each level of harness maturity removes one class of human bottleneck — not because you trust the agent more, but because the harness now carries the knowledge and the checks that the human was previously providing manually.

The Harness Progression — Five Levels Reframed

Level 2 — No harness. Human is the harness.
The human carries all context, reviews every output, triggers every transition. The human is the bottleneck because there is nothing else to carry the load.

Level 3 — Harness emerging.
BMAD artefact chain, domain-ctx.txt, CLAUDE.md. Harness partially defined. Human resolves the gaps the harness does not yet cover. Bottleneck moves from every output to transition points between agents.

Level 4 — Harness mature.
NLSpec discipline, explicit phase control, shared context packages. Harness complete enough that the human reviews outcomes rather than steps. Bottleneck moves from transitions to outcome evaluation.

Level 5 — Harness complete. Dark factory.
Holdout scenarios handle evaluation. The human owns the why loop. The harness owns the how loop entirely. No human bottleneck remains — not because the agent is trusted blindly, but because every class of human verification has been encoded in the harness.

The progression is not "trust the agent more." It is "build the harness better." Each level of harness maturity removes one class of human bottleneck. The dark factory is not a leap of faith — it is the endpoint of a measurable engineering progression.

This reframing has a practical consequence for enterprise organisations. You do not need to decide how much to trust the agent. You need to decide how much harness you have built. If the harness covers the decision, the agent can make it reliably. If the harness does not cover the decision, a human needs to make it — not because agents are untrustworthy in the abstract, but because the specific knowledge required to make that decision has not yet been encoded in the harness.
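Expressed as code, the rule is a lookup rather than a judgment about trust. A minimal sketch follows, assuming the harness can be summarised as a set of decision types it already covers; Decider, route_decision, and extend_harness are hypothetical names.

```python
from enum import Enum


class Decider(Enum):
    AGENT = "agent"
    HUMAN = "human"


def route_decision(decision_type: str, harness_coverage: set) -> Decider:
    """Send a decision to the agent only if the harness encodes how to make it."""
    return Decider.AGENT if decision_type in harness_coverage else Decider.HUMAN


def extend_harness(harness_coverage: set, decision_type: str) -> set:
    """Return a new coverage set once the knowledge for one decision type is encoded."""
    return harness_coverage | {decision_type}
```

Nothing in route_decision refers to how capable the agent is. The only input that changes the outcome is what the harness covers.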

Every experiment with agentic development is, in this framing, a harness engineering exercise. What knowledge did the agent need that wasn't in the harness? Add it. What check failed that should have been automatic? Build it. What transition condition was ambiguous? Specify it. The harness improves with every cycle. The dark factory is not the starting point — it is what the harness becomes when the improvement cycles are complete.

Next: Layer 1 in depth — what DDD contributes and why it cannot be skipped.

Part Two

The Four Layers Connected

What each layer contributes, what it needs from the layer before it, and what breaks without it

Chapter 3

Layer 1 — DDD: Domain Clarity Before Anything Else

Domain-Driven Design is more than twenty years old: Eric Evans's book of that name was published in 2003, long before agentic development existed. The discourse around AI-driven software has largely ignored it, which is precisely why teams adopting agentic frameworks are hitting the same wall that DDD was invented to address.

What DDD Contributes to the Sequence

DDD's contribution to agentic development is not primarily its tactical patterns, such as the Repositories and Factories associated with early-2000s Java codebases. Those are implementation patterns with limited relevance to the current moment. (Aggregates are the exception: they return in Layer 2 as the consistency boundaries that Design Level Event Storming identifies.) DDD's contribution is its strategic patterns, and specifically three ideas that are prerequisites for any agentic approach to work correctly.

The first is Ubiquitous Language: the discipline of building a shared vocabulary between business people and developers, and enforcing it in the code. In an agentic context, the vocabulary must also be enforced in the specification. An agent working from a specification where the same concept goes by three different names across three sections will treat them as three different concepts. Ubiquitous Language is not optional in an NLSpec. It is the mechanism that makes the specification internally consistent.
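Enforcing the vocabulary in a specification can be partly automated. The sketch below assumes a glossary that maps each canonical term to the aliases that must not appear; the glossary contents and the function name find_alias_violations are invented for illustration.

```python
import re

# Hypothetical glossary: canonical term -> aliases that must not appear in the spec.
GLOSSARY = {
    "return request": ["refund ticket", "RMA case"],
}


def find_alias_violations(spec_sections: dict, glossary: dict) -> list:
    """Return (section, alias, canonical term) for every banned alias found."""
    violations = []
    for section, text in spec_sections.items():
        for canonical, aliases in glossary.items():
            for alias in aliases:
                # Whole-phrase, case-insensitive match.
                pattern = rf"\b{re.escape(alias)}\b"
                if re.search(pattern, text, flags=re.IGNORECASE):
                    violations.append((section, alias, canonical))
    return violations
```

A check like this only catches aliases someone has already listed. Discovering that two words name the same concept, or that one word names two, still takes a conversation with the people who use them.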

The second is Bounded Contexts: the discipline of drawing explicit boundaries around the parts of the domain where a specific model and a specific vocabulary apply. The sales customer and the service customer are different concepts in different Bounded Contexts. An agent that crosses that boundary without knowing it exists will corrupt both models. Bounded Context boundaries must be visible in the specification. They cannot be left implicit for the agent to infer.

The third is Subdomain classification: the discipline of identifying which parts of the domain are the organisation's competitive differentiator (Core Domain), which are necessary but not differentiating (Supporting), and which are commodity problems with off-the-shelf solutions (Generic). This classification determines where to invest agentic development effort and where to buy or use existing solutions. A team that builds a bespoke agent-driven authentication system has spent significant effort on a Generic Subdomain it could have bought. A team that leaves its Core Domain on legacy code while automating the Generic work has optimised in the wrong direction.
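The two misallocations described above can be checked against a portfolio of capabilities. This is a small sketch in which the Capability record and its bespoke_build flag are assumptions made for the example; a real classification needs more nuance than a boolean.

```python
from dataclasses import dataclass
from enum import Enum


class Subdomain(Enum):
    CORE = "core"
    SUPPORTING = "supporting"
    GENERIC = "generic"


@dataclass(frozen=True)
class Capability:
    name: str
    subdomain: Subdomain
    bespoke_build: bool  # True if the team is investing build effort in it


def misplaced_investment(capabilities: list) -> list:
    """Return findings where effort and classification point in opposite directions."""
    findings = []
    for capability in capabilities:
        if capability.subdomain is Subdomain.GENERIC and capability.bespoke_build:
            findings.append(f"{capability.name}: Generic subdomain built bespoke")
        if capability.subdomain is Subdomain.CORE and not capability.bespoke_build:
            findings.append(f"{capability.name}: Core Domain not receiving build effort")
    return findings
```

Supporting subdomains never appear in the findings, because building them and buying them can both be defensible.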

The DDD Connection That Nobody Has Made

Every agent framework requires precise context. DDD is the 20-year-old discipline for building that precision at the domain level. The sales order customer and the complaint ticket customer are the same word pointing at two different things. If the specification doesn't distinguish them, the agent picks one interpretation and proceeds confidently in the wrong direction. That is not a prompt engineering failure. That is a domain modelling failure upstream of the prompt. This is the connection the current agentic development discourse has not made — and it is the reason this guide series starts with DDD.

What DDD Produces That the Next Layer Needs

Layer 1 produces three things that Layer 2 depends on. A conceptual framework — the vocabulary of domains, models, contexts, and events — that gives the Event Storming session its structure. A set of questions — where are the context boundaries, what is the Ubiquitous Language of each area, what is the Core Domain — that the session is designed to answer. And the discipline of domain thinking itself — the habit of asking "what does this mean, precisely, in this context?" before assuming shared understanding.

Without Layer 1, an Event Storming session produces a wall of sticky notes that the team can't structure into a coherent model. The events are real. The boundaries are invisible because nobody has the framework to see them. The session produces energy without architecture.

Read the DDD Guide for the full treatment of Ubiquitous Language, Bounded Contexts, Context Maps, and Subdomain classification

Next: Layer 2 — how Event Storming turns DDD's framework into a specific business's grounded domain model.

Chapter 4

Layer 2 — Event Storming: Surfacing the Model in a Room

DDD gives you the framework for thinking about a domain. Event Storming gives you the method for applying that framework to a specific business, with the people who actually know it. The output is not a theoretical model — it is a grounded model that reflects real business complexity, real edge cases, and real boundary decisions made by the people who live in the domain.

The Problem Event Storming Solves in the Sequence

DDD's framework is powerful and abstract. The danger is that an experienced architect applies it to a domain they think they understand — drawing Bounded Contexts on a whiteboard, naming the Ubiquitous Language from memory, classifying subdomains based on their own judgment. The resulting model is intellectually sound and organisationally wrong. It reflects the architect's mental model of the business, not the business's actual reality.

Event Storming is the correction mechanism. It puts the people who know the domain — the operations manager, the customer service lead, the finance director, the warehouse manager — in the same room as the people building the software. The model that emerges from that room is not the architect's hypothesis. It is the combined knowledge of everyone who works in the domain, surfaced through the discipline of naming events and debating sequences.

What Event Storming Produces for the Sequence

A well-run Event Storming session produces three outputs that Layer 3 depends on directly.

First, a grounded Ubiquitous Language — not the vocabulary the architect assumed, but the vocabulary the business actually uses, tested against the disagreements and clarifications that surface in the session. When two people put up stickies for the same event using different words, the conversation about whether these are the same thing or different things produces a more precise language than any top-down glossary exercise.

Second, candidate Bounded Context boundaries: the seams where the language shifts, where team and responsibility change, and where pivotal events mark major business transitions. These boundaries are not imposed by the architect. They emerge from where the domain experts naturally cluster, where the vocabulary naturally changes, and where the Hotspot stickies accumulate most densely.

Third, a visible model of what the business actually does — the full sequence of Domain Events from end to end, with the parallel tracks, the exception paths, the policies that encode hidden business rules, and the hotspots that mark genuine complexity. This is the raw material that the BMAD Brief, PRD, and Architecture sessions need to produce specifications that reflect reality.

Event Storming as Domain Validation

Event Storming doesn't just surface the domain — it validates the domain model. When the operations manager and the developer put the same event in different positions on the timeline, and the resulting conversation reveals that these are actually two different events that have been collapsed into one concept, the model becomes more accurate. This validation happens before any specification is written, before any agent begins work. The cost of this discovery is a conversation. The cost of discovering it after implementation is a rework cycle.

The Bridge Between Layer 2 and Layer 3

The Design Level variant of Event Storming — the most detailed of the three variants — produces output that maps directly onto BMAD's artefact chain. The Aggregates identified in the Design Level session become the architectural foundation of the BMAD Architecture Document. The Commands and Events become the vocabulary of the Story Files. The Policies surface the business rules that must be encoded in the implementation. The Read Models define what information must be available at each decision point.

This is not a coincidence of terminology. Event Storming's Design Level and BMAD's Architecture session are addressing the same question from different directions: what does this software need to do, and what model should it express? Running Event Storming's Design Level before BMAD's Architecture session means the Architecture Document is grounded in validated domain knowledge rather than architectural assumption.

Read the Event Storming Guide for the full workshop process — Big Picture, Process Level, and Design Level — including facilitation, remote sessions, and the DDD mapping

Next: Layer 3 — how BMAD turns validated domain knowledge into structured agentic execution.

Chapter 5

Layer 3 — BMAD: Structured Execution at Levels 3 and 4

With a validated domain model from Layers 1 and 2, the team now has what BMAD's planning phase actually needs — a clear understanding of what the software should do, expressed in precise shared vocabulary, with boundaries that reflect how the business actually works. At this point, BMAD can function as designed rather than compensating for domain ambiguity it was never built to resolve.

What Changes When Layers 1 and 2 Are in Place

The BMAD Brief session changes character entirely when the domain model has been validated through Event Storming. Instead of the Analyst agent spending the session surfacing basic questions about what the project is for and who it serves, the human comes in with those questions already answered. The Brief can focus on scope, constraints, and the specific capabilities being built — not on untangling domain confusion that should have been resolved before the session began.

The PRD session benefits from the Ubiquitous Language. The PM agent and the human reviewer build the requirements document using the terms the business actually uses, tested against real domain expert knowledge. The requirements are grounded in the same model that the Event Storming session produced. When the business analyst reads the PRD, they recognise the terminology as their own, not as a developer's translation of their concepts into technical language.

The Architecture session benefits most dramatically. The Architect agent is working from a model where the Bounded Context boundaries are already known, where the Aggregates have been identified in the Design Level Event Storming session, where the integration patterns between contexts have been discussed and named. The Architecture Document is not building the model from scratch — it is translating a validated model into a technical specification.

The Story File as Context Package

BMAD's Story File — the atomic unit of development work — is structurally identical to the context package concept in advanced agentic development. It concentrates everything the Developer agent needs for one specific task: the relevant portion of the architecture, the acceptance criteria, the domain constraints, the integration requirements. When the domain model is clear, story files can be precise. When it isn't, story files carry the same ambiguity that caused the problem upstream.
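The four ingredients named above suggest a simple completeness check. The sketch below is not BMAD's actual Story File format: the StoryFile class and its field names are assumptions that mirror the list in the preceding paragraph.

```python
from dataclasses import dataclass, field, fields


@dataclass
class StoryFile:
    title: str
    architecture_excerpt: str = ""
    acceptance_criteria: list = field(default_factory=list)
    domain_constraints: list = field(default_factory=list)
    integration_requirements: list = field(default_factory=list)

    def missing_sections(self) -> list:
        """Return the names of the sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A check like this shows that a section is empty, not that it is wrong. A story file can be complete in form and still carry upstream ambiguity, which is why the domain model has to be settled first.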

This is why the RPI (Research-Plan-Implement) workflow, often discussed alongside the twelve-factor agents approach, connects naturally here. The Research phase is where domain understanding is built before any code is written. DDD provides the framework for that research: knowing which Bounded Contexts are relevant, which entities have domain significance, and where the consistency boundaries sit. RPI without DDD thinking produces research that is technically accurate but semantically shallow. With DDD thinking, the research phase produces the precise domain understanding that makes the plan phase concrete and the implementation phase reliable.

What Layer 3 Produces for Layer 4

A team that has run BMAD successfully across several projects has built two things that Layer 4 requires. The first is a mature, validated domain model: the accumulated output of multiple planning cycles that have been grounded in Event Storming and refined through implementation. The second is specification-writing discipline: the habit of writing artefacts precise enough for agents to work from, tested against the real consequences of imprecision as they show up in Story File quality and implementation output.

These two things together are what makes NLSpec possible. You cannot write a 7,000-line specification for an agent to work from without both. The domain model tells you what to specify. The specification-writing discipline tells you how to make it precise enough to work.

Read the BMAD Guide for the full agent team, artefact chain, planning workflow, and scaling from solo to enterprise

Next: Layer 4 — when the factory becomes viable, and what it requires from the layers beneath it.

Chapter 6

Layer 4 — Attractor: When the Factory Is the Answer

The lights-out software factory is the endpoint of the sequence, not an entry point. StrongDM's three-person team built it after years of deep domain expertise in infrastructure security — a domain they know so precisely that they can write 7,000 lines of specification without guessing about what the business requires. That precision is not a coincidence. It is the output of accumulated domain clarity that Layers 1, 2, and 3 are designed to build.

Why Layer 4 Requires the Layers Beneath It

McCarthy's factory manifesto states the prerequisite plainly: "The bottleneck has shifted from implementation speed to spec quality. And spec quality is a function of how deeply you understand your system, your customers, and the problem." That deep understanding is not an assumption the factory makes. It is a requirement it enforces. An NLSpec written without domain clarity produces an agent that implements the ambiguity precisely and consistently — which is worse than an agent that asks clarifying questions, because the errors are harder to detect.

The holdout scenario suite — Attractor's mechanism for preventing specification gaming — requires the same domain clarity. Writing behavioural specifications the agent cannot see requires knowing what the system should do from the outside, in terms of observable business behaviour. That knowledge comes from the domain model built in Layers 1 and 2. Without it, the scenarios describe what the developer thinks the system should do — which may not match what the business actually needs.
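The idea of scenarios the agent cannot see can be sketched in a few lines. This is not Attractor's mechanism or API. It is a generic illustration built around a hypothetical business rule (orders above 500 need senior review), with Scenario, HOLDOUT_SCENARIOS, and evaluate all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    description: str
    order_total: int  # whole currency units, for simplicity
    expect_senior_review: bool


# Kept outside anything the implementing agent can read.
HOLDOUT_SCENARIOS = [
    Scenario("order just over the threshold", 501, True),
    Scenario("order exactly at the threshold", 500, False),
    Scenario("low-value order", 40, False),
]


def evaluate(candidate: Callable[[int], bool], scenarios: list) -> list:
    """Return the descriptions of the scenarios the candidate implementation fails."""
    return [
        scenario.description
        for scenario in scenarios
        if candidate(scenario.order_total) != scenario.expect_senior_review
    ]
```

The boundary scenario is the one that needs domain knowledge. Whether "over 500" includes exactly 500 is a business question, and a scenario author who guesses at the answer encodes the guess.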

The NLSpec as Culmination

Seen through the lens of the full sequence, an NLSpec is the culmination of the domain modelling work, not its replacement. The Ubiquitous Language from Layer 1 becomes the terminology that makes the specification internally consistent: the same word meaning the same thing in every section. The Bounded Context boundaries from Layers 1 and 2 become the structural boundaries of the specification: what falls inside this spec and what is explicitly out of scope. The Domain Events from the Event Storming session become the behavioural anchors: the things that happen in the business that the factory must produce and respond to.

When all of that is in place, the specification can be complete. Not complete in the sense of capturing every possible state — no specification does that. Complete in the sense that the agent never has to choose between two plausible interpretations of what was wanted, because only one interpretation is consistent with the model.

Layer 4 Is Not the Goal for Most Organisations

This is worth stating clearly. Layer 4 is the horizon, not the immediate target. The majority of enterprise organisations are at maturity Level 2 today, where the human is still the harness. The path to Layer 4 runs through Layers 1, 2, and 3. Attempting to build an NLSpec for Attractor without the domain clarity that Layers 1 and 2 produce is attempting the hardest part of the sequence first, without the prerequisites. It produces an expensive and carefully maintained record of everything the organisation doesn't yet know about its own domain.

The Factory as Proof of Understanding

StrongDM's factory is significant not primarily because of the technology it uses. It is significant because three people understand their domain — infrastructure security — precisely enough to write a specification that an agent can implement correctly. The factory is proof of understanding, not a shortcut around it. Every organisation that wants to reach Layer 4 needs to build that understanding first. Layers 1, 2, and 3 are how you build it.

Read the Attractor Guide for the four architecture patterns, the NLSpec approach, the holdout scenario methodology, and the practical brownfield assessment

Next: a worked example — Meridian Retail moving through all four layers end to end.

Part Three

The Complete Picture

A worked example, an honest entry point guide, and the discipline that ties everything together

Chapter 7

A Worked Example — Meridian Retail End to End

Meridian Retail is a mid-size omnichannel retailer with online, in-store, and wholesale channels. They've decided to build a new returns management system — a process identified as one of their biggest operational pain points. This is how they move through all four layers.

Layer 1 — Applying DDD Thinking

Before the first workshop, the architecture team applies DDD's strategic framework to the returns domain. They identify that "customer" means different things to the customer service team (which handles the return) and the finance team (which processes the refund). They recognise that the returns process spans at least two Bounded Contexts: a Customer Service context where the return is initiated and approved, and a Finance context where the refund is processed and the revenue adjustment is recorded.

They classify the returns management capability as a Supporting Subdomain — it's important and complex, but it's not how Meridian competes in the market. Their Core Domain is demand forecasting and personalisation. This classification tells them that returns management deserves careful design but not the full Core Domain treatment — they should build it well, not brilliantly.

Most importantly, they identify the key question the Event Storming session needs to answer: does the returns approval process live in the Customer Service context or does it span both contexts? That seam is where the most complexity is likely to hide.

Layer 2 — The Event Storming Session

A Big Picture session brings together the customer service lead, the warehouse manager, the finance director, two developers, and the senior architect. Within ninety minutes of chaotic exploration, the wall reveals something the architecture team didn't know: the returns process has four distinct paths depending on the product category and the return reason. The customer service team refers to these as "the four streams" — but nobody had documented them, and the developers had never heard the term.

A Process Level session on the most complex of the four streams — high-value items with potential fraud flags — surfaces three policies that were entirely implicit: "Whenever a return request arrives for an order over £500, always flag for senior agent review." "Whenever a return is approved for a fraudulently used payment method, trigger a security alert." "Whenever a refund is issued within 24 hours of a new order by the same customer, hold the refund for manual review." These policies were business rules that lived in the heads of two senior customer service agents. They had never been written down.
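Once written down, the three policies are concrete enough to express as code. The following is a sketch under stated assumptions: the ReturnCase fields and action names are invented, "over £500" is read as strictly greater than 500, and "within 24 hours of a new order" is read as a gap of at most 24 hours in either direction. The workshop would need to confirm each of those readings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal
from typing import Optional

HIGH_VALUE_THRESHOLD = Decimal("500")     # pounds, from the first policy
REFUND_HOLD_WINDOW = timedelta(hours=24)  # from the third policy


@dataclass
class ReturnCase:
    order_total: Decimal
    payment_method_flagged_fraudulent: bool
    approved: bool
    refund_issued_at: Optional[datetime] = None
    customer_latest_order_at: Optional[datetime] = None


def triggered_actions(case: ReturnCase) -> list:
    """Return the actions the three policies require for this case, in policy order."""
    actions = []
    # Policy 1: returns for orders over the threshold go to senior agent review.
    if case.order_total > HIGH_VALUE_THRESHOLD:
        actions.append("flag_for_senior_agent_review")
    # Policy 2: approved returns on a fraudulently used payment method raise an alert.
    if case.approved and case.payment_method_flagged_fraudulent:
        actions.append("trigger_security_alert")
    # Policy 3: refunds close in time to a new order by the same customer are held.
    if case.refund_issued_at and case.customer_latest_order_at:
        gap = abs(case.refund_issued_at - case.customer_latest_order_at)
        if gap <= REFUND_HOLD_WINDOW:
            actions.append("hold_refund_for_manual_review")
    return actions
```

Writing the rules this way surfaces questions the spoken versions leave open, such as whether an order of exactly £500 counts as high-value. Those are the questions an agent would otherwise answer on its own.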

The Design Level session maps the process to Aggregates: a ReturnRequest Aggregate in the Customer Service context that owns the approval workflow, and a Refund Aggregate in the Finance context that owns the payment processing. The integration between them is a Domain Event — ReturnApproved — published by the Customer Service context and consumed by Finance.
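A minimal sketch of that design, assuming simple in-memory objects, might look like the following. The Aggregate and event names come from the session output; the fields, status values, and the from_return_approved helper are hypothetical and stand in for whatever the real model would carry.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class ReturnApproved:
    """Domain Event published by the Customer Service context."""
    return_request_id: str
    order_id: str
    refund_amount_gbp: Decimal


@dataclass
class ReturnRequest:
    """Aggregate in the Customer Service context; owns the approval workflow."""
    return_request_id: str
    order_id: str
    refund_amount_gbp: Decimal
    status: str = "requested"
    pending_events: list[ReturnApproved] = field(default_factory=list)

    def approve(self) -> None:
        if self.status != "requested":
            raise ValueError(f"cannot approve a return in status {self.status!r}")
        self.status = "approved"
        # Record the event for publication; no Finance code is called here.
        self.pending_events.append(
            ReturnApproved(self.return_request_id, self.order_id, self.refund_amount_gbp)
        )


@dataclass
class Refund:
    """Aggregate in the Finance context; owns the payment processing."""
    return_request_id: str
    amount_gbp: Decimal
    status: str = "pending"

    @classmethod
    def from_return_approved(cls, event: ReturnApproved) -> "Refund":
        # Finance reads only the event, never the ReturnRequest itself.
        return cls(return_request_id=event.return_request_id, amount_gbp=event.refund_amount_gbp)
```

The point of the shape is the boundary: Refund is built from the event alone, so the Finance context never needs to know how Customer Service reached its decision.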

Layer 3 — BMAD Execution

The BMAD Brief session starts with the team in genuine alignment — they know the four streams, they have named the key Aggregates, they understand the integration pattern. The Brief is tight: this project builds the Customer Service context's half of returns management, with a defined integration contract to the Finance context. The PRD captures the four streams explicitly, including the three implicit policies that the Event Storming session surfaced. The Architecture Document reflects the Bounded Context boundary: Customer Service owns the ReturnRequest Aggregate; Finance owns the Refund Aggregate; the integration is event-driven.

The Story Files are precise because the model is precise. Story 4 — "Implement the high-value return fraud flag policy" — can embed the exact policy rule, the exact threshold (£500), the exact trigger conditions, and the exact acceptance criteria, because all of those were named and agreed in the Event Storming session. The Developer agent implementing Story 4 is not guessing about what "high-value" means. It's in the story file. The story file reflects the domain model. The domain model was validated by the people who actually work in the domain.

Layer 4 — The Factory Horizon

Meridian is not ready for Layer 4 today. But running Layers 1 through 3 across their returns management system has produced two things that move them toward it: a precise, validated domain model for one of their most complex processes, and a team that has practised writing specifications precise enough for agents to work from without guessing. When they've repeated this across their Customer Service context, their Ordering context, and their Finance context — when the model is mature and the specification discipline is embedded — the NLSpec for a factory run becomes possible. Not because the technology has changed. Because the domain understanding has been built.

Meridian Retail's journey through the four layers. The discoveries in Layer 2 changed what Layers 3 and 4 needed to build.

Next: the realistic entry points — where different organisations should start in the sequence.

Chapter 8

Where to Start — Realistic Entry Points

The sequence is a dependency graph. You cannot start at Layer 4 without the prerequisites that Layers 1, 2, and 3 provide. But most organisations don't start at Layer 1 either: they start wherever they are, which is usually somewhere in the middle. This chapter is a realistic guide to finding your starting point and moving forward from there.
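The dependency idea can be made concrete with a small sketch. It assumes the simplest reading of the sequence, in which each layer depends on the output of the one before it; the layer labels and the function name are illustrative. "In place" here means the layer's output exists, however it was obtained, which is why a team with years of validated domain knowledge can count the early layers as done.

```python
# Each layer and the layer whose output it depends on, following the
# sequence described in this guide. The short labels are illustrative.
REQUIRES = {
    "layer1_ddd": None,
    "layer2_event_storming": "layer1_ddd",
    "layer3_bmad": "layer2_event_storming",
    "layer4_attractor": "layer3_bmad",
}


def missing_prerequisites(target: str, in_place: set[str]) -> list[str]:
    """List the layers whose output is still missing before `target` can
    start, ordered from the earliest layer to the latest."""
    missing = []
    layer = REQUIRES[target]
    # Walk back through the chain, collecting every gap along the way.
    while layer is not None:
        if layer not in in_place:
            missing.append(layer)
        layer = REQUIRES[layer]
    return list(reversed(missing))
```

Calling missing_prerequisites("layer4_attractor", set()) returns all three earlier layers, which is the formal version of the warning that follows in this chapter.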

If You Are at Level 1 or 2 — Start at Layer 1

If your organisation is using AI tools for autocomplete and occasional pair programming but hasn't made any structural changes to how software is developed, start with DDD. Not with a big design exercise — with a focused conversation about one domain you're planning to build software for. Pick a process. Map the concepts. Name the boundaries. Ask whether "customer" means the same thing to your sales team and your service team. The answers will be instructive.

Then run a Big Picture Event Storming session on that domain. Don't try to cover the whole organisation. Pick the Core Domain — the thing you actually compete on — and map it with the people who know it. What you learn will shape every technical decision that follows.

If You Have Domain Clarity But No Agentic Structure — Start at Layer 3

If your team already has strong domain knowledge — if the boundaries are understood, the vocabulary is shared, the model is validated through years of work in the domain — you can start BMAD immediately. The prerequisite work is already done. Your entry point is the Brief session. Run a pilot on a low-risk greenfield project. Get one complete cycle through Brief, PRD, Architecture, Stories, Implementation, and QA. Document what worked and what didn't. Then expand.

If You Have Legacy Systems With No Specification — Start at Layer 2

If your domain is brownfield — existing systems, accumulated behaviour, implicit business rules that nobody has documented — start with Event Storming as a documentation exercise before it's a design exercise. Run a Big Picture session on the existing system with the people who work in it. What you surface is not what you're going to build. It's what you need to understand before you can build anything. The output feeds a specification exercise that generates the domain model the brownfield system encodes but has never expressed.

If You Are Running BMAD Successfully — Think About Layer 4 Prerequisites

If your team has run BMAD across multiple projects and the artefact chain is mature, start asking the Layer 4 questions: Is your domain model stable and validated? Is your specification writing precise enough that agents rarely need to make interpretive decisions? Have you built the holdout scenario infrastructure that Attractor requires? Do you have digital twins for your key external integrations? If the answers are mostly yes, Layer 4 is on the horizon. If the answers are mostly no, you know what to build next.

The Most Dangerous Starting Point

Starting at Layer 4 without Layers 1, 2, and 3. Writing an NLSpec for a domain you haven't modelled. Running a factory on a specification that encodes unexamined assumptions. The factory will produce precisely and at speed. The output will reflect what you assumed, not what the business needed. The speed makes it harder to catch, not easier.

Next: the discipline that ties all four layers together — specification clarity as the single thread.

Chapter 9

The Discipline That Ties It All Together

Four layers. Four different tools. Four different communities of practice that mostly don't talk to each other. The DDD community rarely mentions BMAD. The agentic development community rarely mentions Event Storming. The harness engineering community rarely cites Evans. But they are all working on the same problem from different angles — and the thread that connects them is a single discipline: making intent explicit before asking anything to execute it.

Specification Clarity — The Single Thread

Evans called it knowledge crunching — the continuous, collaborative process of making domain understanding explicit. Brandolini called it the chaotic exploration — the act of getting implicit knowledge out of people's heads and onto a wall where it can be examined and challenged. BMAD calls it the artefact chain — the progressive refinement of intent from project brief to story file. McCarthy calls it spec quality — the depth of understanding encoded in natural language precise enough for an agent to work from without guessing.

Different names. The same discipline. The act of making what you know — about the business, about the domain, about what the software should do — explicit enough that something else can act on it reliably. That something else was once a human developer who could ask clarifying questions. It is now an agent that cannot.

Why This Discipline Is Harder Now, Not Easier

A human developer encountering an ambiguous requirement has options. They can ask the product manager. They can look at how similar cases were handled before. They can make a reasonable judgment and flag it for review. Their judgment is domain-informed — not perfect, but shaped by context, convention, and the ability to recognise when something doesn't feel right.

An agent has none of these options. It picks the most plausible interpretation and implements it confidently and completely. The output looks finished. The code compiles. The tests pass — because the agent wrote the tests against the interpretation it chose. The misunderstanding is encoded in working code that is hard to distinguish from correct code on superficial review.

This is why the specification discipline matters more in the agentic era, not less. Every assumption left implicit is a decision the agent makes without accountability. The blast radius of an underspecified context is larger and quieter than an underspecified requirement handed to a human team. The human team would have asked questions. The agent produced answers.

The 35-Year Perspective

The specification discipline has always been the scarcest resource in software engineering. Every experienced architect knows this. The hard part has never been writing code — it has been knowing precisely what code to write. The requirements gathering exercise, the design review, the architecture decision — these were always the leverage points, the places where a good decision prevented ten bad implementations and a bad decision caused ten good implementations of the wrong thing.

What the agentic era changes is the consequence of getting it wrong. A bad requirement given to a human team produces a misaligned implementation that takes weeks to build and days to identify and fix. A bad requirement given to a factory produces a misaligned implementation that takes hours to build, looks finished, and may take weeks to identify because everything about it is internally consistent. The speed multiplier works in both directions.

The Unchanged Truth

The discipline that Evans was teaching in 2003 is the discipline that McCarthy is requiring in 2025. Know your domain. Name your concepts precisely. Draw your boundaries deliberately. Validate your model against the people who live in it. Make your intent explicit before asking anything — human or agent — to execute it. The dark factory doesn't change this discipline. It makes the consequences of skipping it visible faster and at greater scale.

What the Series Has Built

This guide series is five documents, but it is one argument. Domain-Driven Design is the conceptual foundation — the framework for thinking clearly about a business domain before building software to serve it. Event Storming is the practical method — the workshop technique for surfacing that clear thinking from the people who have it. BMAD is the structured execution layer — the framework for turning validated domain knowledge into working software with appropriate human oversight. Attractor is the horizon — the lights-out factory that becomes viable when the domain understanding is precise enough and the specification discipline is mature enough. And this document is the argument for why these four are a sequence rather than a menu — why the order matters, why each layer enables the next, and why the discipline that ties them together is not a new idea but a very old one that the agentic era has made newly urgent.

On the Loop — Where the Human Belongs

Kief Morris at Thoughtworks (March 2026) introduced a vocabulary that maps precisely onto the discipline this series has been building toward. It is worth naming explicitly because it gives enterprise practitioners a clean way to explain their own role in an agentic development environment.

Three positions are possible. Outside the loop — the human owns the outcome, the agent owns everything in between. This is vibe coding at its extreme. The appeal is obvious. The failure mode is equally obvious: agents working without a harness spiral on messy codebases, compound errors, and produce technically correct output that is wrong about the domain. Outside the loop works for throwaway scripts and simple prototypes. It does not work for systems that need to be maintained.

In the loop — the human acts as gatekeeper at every agent step, inspecting each artefact, triggering each transition. This is the eight-hour BMAD session from the comparison video. The human is the bottleneck. Agents generate faster than humans can inspect. The productivity gain of the agent is absorbed by the overhead of the human gatekeeping every output.

On the loop — the human builds and maintains the harness that the agent runs. When output is wrong, the human improves the harness rather than correcting the artefact. The domain-ctx.txt is harness. The BMAD artefact chain is harness. The NLSpec is harness. The CLAUDE.md and AGENTS.md files are harness. The entire body of work in this guide series is harness engineering — defining the how loop precisely enough that the agent can run it reliably without human gatekeeping at every step.

Harness Engineering — The Human's Job in the Post-Agentic Era

The harness is the collection of specifications, constraints, quality checks, and workflow guidance that controls the agent's how loop. Building and improving the harness is the emerging practice Morris calls Harness Engineering. Every domain context file you write, every BMAD artefact chain you refine, every NLSpec section you make more precise — this is harness engineering. The harness is the accumulated learning of every experiment. It compounds over time in a way that individual prompt improvements do not.

The Agentic Flywheel — The Horizon

Morris describes what becomes possible when the harness is mature enough: agents that improve the harness itself. Feed the agent richer signals — pipeline results, test outcomes, production error logs, operational data — and it can analyse the performance of its own how loop and recommend improvements. Initially the human reviews recommendations and approves specific changes. As confidence grows, recommendations above a certain quality threshold are applied automatically.
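The two stages described above differ only in who acts on a recommendation. A rough sketch of that routing decision, under the assumption that each recommendation carries a numeric quality score, is shown below. The HarnessRecommendation type, the 0.0 to 1.0 score, and the route function are hypothetical; nothing here is taken from Morris's article or from any specific tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HarnessRecommendation:
    summary: str
    quality_score: float  # hypothetical 0.0-1.0 score from an evaluation step


def route(rec: HarnessRecommendation, auto_apply_above: Optional[float] = None) -> str:
    """Decide who acts on a recommendation.

    auto_apply_above=None models the initial stage, where a human reviews
    every recommendation. Setting a threshold models the later stage.
    """
    if auto_apply_above is not None and rec.quality_score > auto_apply_above:
        return "apply_automatically"
    return "human_review"
```

Keeping the threshold as an explicit parameter reflects the text: the human decides when confidence is high enough to set it, and how high to set it.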

This is the Attractor trajectory extended. The factory does not just run a harness; it evolves one. The human's role shifts further: from building the harness to steering the improvement of the harness. From on the loop to on the meta-loop. The why loop, the human's irreducible domain, remains unchanged. The how loop becomes increasingly self-managing.

For most enterprise teams today this is the horizon, not the immediate target. The discipline described in this series — domain clarity before execution, specification precision before implementation, harness quality before autonomy — is the foundation that makes the flywheel possible when the organisation is ready for it. You cannot hand the harness to an agent to improve if the harness was never built with sufficient rigour to be evaluated. The sequence matters here too.

The Synthesis Discipline — Contextualise, Do Not Just Adopt

The Fowler article arrived in March 2026, a month after the domain context engineering and flowchart-first approaches in this series were first developed. The article and the series were written independently, without reference to each other, yet both arrived at the same underlying insight: the human's job is to define the how loop, not to run it. That independent convergence is validation of a kind that citing sources cannot provide. When practitioners working in different contexts arrive at the same structure, the structure is probably right.

But independent convergence is also a reminder of the discipline that enterprise practitioners need to maintain in the post-agentic era. The agentic development space is producing a significant volume of frameworks, methodologies, and vocabulary. Some of it is genuinely new. Much of it is rediscovery of what experienced practitioners already know: specification before execution, context quality before agent autonomy, domain clarity before implementation. The COBOL teacher who required a system flowchart before lab time was doing harness engineering. They just did not have a name for it.

The right posture is neither wholesale adoption nor dismissal. Read what industry veterans are writing. Test it against your own experience. Where external framing improves on your own vocabulary, adopt it. Where your own context requires adaptation, adapt it. Where the external framework makes assumptions your environment does not satisfy, name that gap explicitly and work around it. This synthesis discipline — contextualise, do not just adopt — is itself the most durable skill in a space where the tooling changes faster than the underlying principles do.

The Guide Index and Glossary follow.

Reference

Guide Index and Glossary

A reference to all five guides in the series, plus definitions of the cross-cutting concepts that appear across multiple layers.

The Five Guides

Domain-Driven Design — A Practical Reference — Ubiquitous Language, Bounded Contexts, Context Maps, Subdomains, Entities, Aggregates, Domain Events, integration patterns. The conceptual foundation of the sequence.

Event Storming — A Practical Workshop Guide — Big Picture, Process Level, and Design Level sessions. Facilitation, sticky note vocabulary, remote workshops, failure modes, and the complete DDD-to-Event Storming mapping.

The BMAD Method — A Practical Guide — Agent team roles, the artefact chain, planning workflow, development cycle, human-in-the-loop design, scaling, greenfield vs brownfield, and failure modes.

Attractor — A Practical Guide — The dark factory, StrongDM's factory, NLSpec, directed graph phases, scenarios as holdout-set, digital twins, harness engineering and spec-driven development, the post-agile organisation, and the brownfield reality.

From Domain to Factory — The Synthesis Guide (this document) — The argument for why the four layers form a sequence, what each layer contributes, what breaks when layers are skipped, and the discipline that ties them together.

Cross-Cutting Glossary

Agentic Development

Software development in which AI agents perform significant portions of the implementation work autonomously — not just assisting developers, but writing, testing, and in some cases shipping code without human involvement at the implementation level. The five-level maturity framework (Shapiro) describes the range from Level 1 autocomplete to Level 5 lights-out factory.

Bounded Context

An explicit boundary within which a specific domain model and a specific Ubiquitous Language applies. Different Bounded Contexts may use the same word to mean different things. Making context boundaries explicit is the prerequisite for writing specifications that agents can process without ambiguity. See the DDD Guide, Chapter 4.

Context Package

The concentrated, curated information that an agent needs before beginning a task — equivalent to the ambient context a human developer carries from years of working in the domain. In BMAD, the Story File is the context package. In Attractor, the NLSpec is the context package. In both cases, context quality is the leading indicator of output quality.

Core Domain

The part of the business domain where the organisation actually competes — its source of differentiation. DDD argues that the Core Domain deserves the deepest modelling investment. In agentic development, Core Domain clarity is what makes precise NLSpec possible. Generic and Supporting domains should be bought or built simply; the Core Domain should be understood deeply. See the DDD Guide, Chapter 6.

Domain Event

Something that happened in the business that the business cares about. Past tense, business language. The primary unit of Event Storming's vocabulary (orange sticky). The mechanism by which Bounded Contexts communicate without tight coupling. In an NLSpec, Domain Events are the behavioural anchors — the things the factory must produce and respond to. See both the DDD Guide (Chapter 9) and the Event Storming Guide (Chapter 2).

Knowledge Crunching

Evans's term for the ongoing collaborative process by which domain experts and developers together build and refine shared domain understanding. The process that Event Storming operationalises. The prerequisite for specification clarity. The discipline that the agentic era has made newly urgent by raising the cost of skipping it.

NLSpec (Natural Language Specification)

A structured natural language document that serves as the control instrument for an agent-driven software factory. Requires domain clarity from Layers 1 and 2 to be complete. Requires specification-writing discipline from Layer 3 to be precise. The culmination of the four-layer sequence, not a shortcut around it. See the Attractor Guide, Chapter 4.

Specification Clarity

The discipline of making intent explicit before asking anything — human or agent — to execute it. The single thread that runs through all four layers of the sequence. Ubiquitous Language is specification clarity at the vocabulary level. Event Storming is specification clarity at the domain model level. BMAD's artefact chain is specification clarity at the project level. NLSpec is specification clarity at the factory level.

Ubiquitous Language

A shared vocabulary, developed collaboratively between business experts and developers, used consistently in all conversations, documentation, and code within a Bounded Context. In agentic development, Ubiquitous Language must also be enforced in the specification — an NLSpec where the same concept is called three different names across three sections will produce a system with three different concepts where one was intended. See the DDD Guide, Chapter 3.
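Enforcing vocabulary in a specification can be partly mechanical. The sketch below shows one possible check, assuming the team maintains a list of known non-canonical names for each term. The function name, the glossary structure, and the example synonyms are hypothetical; a simple word-boundary match like this will miss plurals and paraphrases, so it supports a human review rather than replacing one.

```python
import re


def find_vocabulary_drift(
    spec_text: str, synonyms_by_term: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Map each canonical term to the non-canonical names found in the spec."""
    findings: dict[str, list[str]] = {}
    for canonical, synonyms in synonyms_by_term.items():
        # Whole-phrase, case-insensitive match for each known synonym.
        found = [
            name
            for name in synonyms
            if re.search(rf"\b{re.escape(name)}\b", spec_text, flags=re.IGNORECASE)
        ]
        if found:
            findings[canonical] = found
    return findings
```

Run against a spec that calls a ReturnRequest a "return ticket" in one section and a "refund claim" in another, the check reports both names under the canonical term, which is the three-names-for-one-concept problem caught before an agent sees it.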

The Four-Layer Sequence

The core argument of this guide: DDD (domain clarity), Event Storming (domain discovery), BMAD (structured execution), Attractor (factory) form a dependency sequence rather than a menu of options. Each layer's output is the next layer's prerequisite. Skipping any layer transfers its cost to a later stage where it is more expensive to address.

Why Harness Engineering Rather Than Spec-Driven Development

Spec-Driven Development (SDD) names one component of the harness, the specification, and treats it as the whole. Harness Engineering names the complete activity: building and maintaining the full collection of specifications, domain context files, quality checks, workflow guidance, and transition conditions that controls the agent's how loop. The distinction matters in practice: when agent output quality falls short, SDD asks "is the spec good enough?" Harness Engineering asks "which component of the harness failed?" The answer is frequently not the specification.

Harness Engineering

The emerging practice of building and maintaining the collection of specifications, constraints, quality checks, and workflow guidance that controls an agent's how loop. Named by Kief Morris (Thoughtworks, March 2026). The domain-ctx.txt, the BMAD artefact chain, the NLSpec, the CLAUDE.md — these are all harness artefacts. The human's job in an on-the-loop position is to improve the harness rather than correct individual agent outputs. The harness is the accumulated learning of every agentic experiment, compounding over time.

On the Loop

The human position in agentic development where the human builds and maintains the harness rather than gatekeeping every agent output (in the loop) or delegating everything to the agent (outside the loop). The human defines the how loop precisely enough for the agent to run it reliably, and improves the harness when output quality falls short rather than correcting individual artefacts. First described by Kief Morris (Thoughtworks, March 2026) as the productive middle ground between vibe coding and micromanagement.

Agentic Flywheel

The stage of agentic development maturity where agents analyse the performance of their own how loop and recommend — or automatically apply — improvements to the harness. Requires a mature harness with rich evaluation signals: test results, pipeline outcomes, operational data, production error logs. The human role shifts from building the harness to steering its improvement. Described by Kief Morris (Thoughtworks, March 2026) as the next evolution beyond on-the-loop harness engineering. Corresponds to the Attractor trajectory extended: not just running a harness, but evolving one.

Why Loop / How Loop

A framework for understanding human and agent roles in software development, introduced by Kief Morris (Thoughtworks, March 2026). The why loop is the human-owned cycle of turning ideas into outcomes — the business intent, the domain requirements, the definition of what success means. The how loop is the agent-runnable cycle of turning specifications into working software — the implementation, the testing, the iteration. The on-the-loop position places the human at the boundary between them: owning the why loop, defining and maintaining the how loop, without personally running the how loop step by step.