# The Two-Agent Method: Building Features Without Drowning in Context

There's a particular kind of frustration that comes from working with AI coding assistants on anything beyond trivial features. You start with a clear goal. The AI seems to understand. It explores your codebase, asks a few questions, maybe even proposes something sensible. Then, somewhere in the implementation, things drift. The code doesn't quite integrate properly. Edge cases get missed. The assistant confidently suggests changes that would break existing functionality. You find yourself debugging the assistant's confusion rather than shipping features.

The problem isn't that these tools are useless - they're remarkably capable when used correctly. The problem is that most of us are using them wrong. We're treating them like patient junior developers who can hold an entire conversation in their head, when they're actually more like brilliant consultants with severe short-term memory constraints. Once you understand this, a better approach becomes obvious.

## The Real Problem: Context Accumulation

When you have a long conversation with an AI assistant, every detour leaves a mark. Every wrong file it explored, every assumption it made and corrected, every dead-end investigation - all of it stays in the conversation history. The model doesn't forget these things the way a human might. Instead, it accumulates them.

Think of it like trying to focus on writing while someone reads aloud every draft you've ever written, including all your false starts and deleted paragraphs. The signal-to-noise ratio degrades steadily.

This is why implementations often deteriorate after seemingly productive planning sessions. The assistant reached a correct conclusion, but only after wandering through a forest of incorrect ones. When it tries to implement, it's carrying all that baggage.

The solution isn't to use these tools less. It's to structure your workflow around their actual capabilities.
## The Two-Agent Pattern

Here's the approach that consistently produces better results: split the work between two separate agents with distinct, focused roles.

The first agent - call it the Explorer - does exactly what the name suggests. It navigates your codebase, asks clarifying questions, maps out integration points, and produces a clean implementation plan. This agent's job is to understand deeply and document clearly.

The second agent - the Implementer - starts fresh. It receives the Explorer's distilled insights without any of the exploration baggage. Clean instructions, relevant context, no detours. This agent writes the actual code.

The pattern resembles how effective engineering teams actually work. Senior engineers who understand the system architecture don't necessarily write every line of code. They explore, they plan, they document their findings. Then someone else - or they themselves, with fresh focus - implements based on that clear blueprint.

## Phase One: Exploration With Constraints

Start by framing the problem clearly. This matters more than you might think. The quality of the final implementation depends heavily on how well you define the task upfront.

Give the Explorer autonomy to navigate your codebase, but constrain its exploration to relevant areas. If you're adding a feature to your authentication system, point it toward the auth modules explicitly. This prevents the assistant from wandering through your entire repository and burning context window on irrelevant code.

The instruction might look like: "Explore the authentication and user management modules. I need to add OAuth integration for Google and Microsoft. Read whatever files you think are relevant in these areas."

Then add something critical: "If you're not completely confident about the solution, ask questions. Don't make assumptions."

This single instruction changes the assistant's behavior dramatically.
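The framing above - scoped directories, a goal, and an explicit uncertainty clause - can be captured in a small prompt builder. A minimal sketch in Python; the function name and directory paths are hypothetical, not tied to any particular tool or repository:

```python
# Hypothetical helper: assemble a constrained Explorer prompt.
# The areas and feature text below are illustrative examples.
def build_explorer_prompt(feature: str, areas: list[str]) -> str:
    scoped = ", ".join(areas)
    return (
        f"Explore the following modules only: {scoped}.\n"
        f"Goal: {feature}\n"
        "Read whatever files you think are relevant in these areas.\n"
        "If you're not completely confident about the solution, "
        "ask questions. Don't make assumptions."
    )

prompt = build_explorer_prompt(
    "Add OAuth integration for Google and Microsoft",
    ["auth/", "users/"],
)
```

The last line of the prompt is the important one; everything before it just scopes the exploration.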
Most models will happily fill gaps with assumptions if you don't explicitly tell them otherwise. But when forced to acknowledge uncertainty, they ask surprisingly good questions.

Answer these questions thoroughly. The time you invest here determines how much backtracking you'll do later. If the assistant asks whether soft-deleted users should still authenticate, don't just say "no" - explain the entire deletion lifecycle and how it affects related systems.

The Explorer continues iterating - reading more code, asking follow-up questions, refining its understanding. Eventually, it reaches confidence: "I understand what needs to be built. Here's the plan."

## The Critical Transition

This is where most people make their mistake. They see a good plan and immediately say "great, implement it" - keeping the same conversation going. Don't do this.

Instead, instruct the Explorer to write documentation for a completely different engineer. Not instructions - documentation. The difference matters.

Instructions sound like: "Add a `deleted_at` field to the User model. Modify `UserService.get_active_users()` to filter out deleted users."

Documentation sounds like: "The User model in `models/user.py` represents both active and deleted users. There's currently no soft-delete mechanism - the existing `delete()` method removes records entirely. For this feature, you'll likely want to add a nullable `deleted_at` timestamp. The UserService in `services/user_service.py` has methods that return user collections - these may need updating to respect the deletion state. The admin interface renders user lists from `admin/users.py`, which pulls from UserService."

See the difference? Documentation guides thinking. It highlights relevant files and suggests approaches without dictating implementation details. It preserves the Implementer's ability to think independently.
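If the Implementer follows that documentation, the resulting change might look something like the sketch below - a plain-Python stand-in for the hypothetical models/user.py and services/user_service.py files, assuming the nullable `deleted_at` approach the documentation suggests:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Sketch of the soft-delete approach; model and service names mirror the
# hypothetical files mentioned above, not real code from any codebase.

@dataclass
class User:
    id: int
    email: str
    deleted_at: Optional[datetime] = None  # nullable timestamp, as suggested

    @property
    def is_deleted(self) -> bool:
        return self.deleted_at is not None

class UserService:
    def __init__(self, users: list[User]):
        self._users = users

    def get_active_users(self) -> list[User]:
        # Respect the deletion state instead of returning everyone.
        return [u for u in self._users if not u.is_deleted]

    def soft_delete(self, user_id: int) -> None:
        # Mark the record deleted rather than removing it entirely.
        for u in self._users:
            if u.id == user_id:
                u.deleted_at = datetime.now(timezone.utc)
```

The point isn't this exact code - it's that the documentation left the Implementer room to arrive at it independently.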
Have the Explorer include:

- Relevant file paths and what they contain
- Integration points and why they matter
- Potential approaches and their tradeoffs
- Your specific requirements and constraints
- Your coding style preferences

This becomes your handoff document.

## Phase Two: Fresh Implementation

Start a new conversation. Copy in the Explorer's handoff document. Add your original feature request for context. Then watch something interesting happen.

If the Explorer did its job well, the Implementer will read through the suggested files, understand the architecture quickly, and say: "I'm ready to implement this."

This moment is actually a quality signal. When an Implementer can immediately start working from a handoff document, it suggests the documentation was clear and complete. When it needs extensive clarification, something was missing in the exploration phase.

The Implementer now builds the feature with several advantages:

- Clean context focused only on relevant information
- No cognitive debris from exploration detours
- Fresh perspective on integration challenges
- Full autonomy to make implementation decisions

The resulting code tends to be cleaner, more consistent with existing patterns, and better integrated with the broader system.

## Why This Actually Works

Large language models don't "forget" in the human sense - they process every token in their context window with roughly equal weight. A confused exploration thirty messages ago carries nearly the same influence as the current task description.

By resetting context, you're not just saving tokens. You're removing misleading signals that would otherwise contaminate the implementation.
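The context reset is the whole trick, and it shows up clearly in code. Here's a sketch of the two-phase handoff against a generic `chat(messages) -> reply` client; the function name and message format are placeholders for whatever assistant API you actually use:

```python
# Sketch only: `chat` is a stand-in for any chat-completion client that
# takes a list of {"role", "content"} messages and returns a reply string.

def run_two_agent_workflow(chat, feature_request: str, explorer_prompt: str) -> str:
    # Phase one: the Explorer accumulates context while investigating.
    explorer_messages = [{"role": "user", "content": explorer_prompt}]
    plan = chat(explorer_messages)
    explorer_messages.append({"role": "assistant", "content": plan})
    explorer_messages.append({
        "role": "user",
        "content": "Write documentation for a completely different engineer "
                   "who will implement this. Documentation, not instructions.",
    })
    handoff = chat(explorer_messages)

    # Phase two: the Implementer starts from a *fresh* message list.
    # Only the handoff document and the original request carry over -
    # none of the exploration history does.
    implementer_messages = [{
        "role": "user",
        "content": f"{feature_request}\n\n{handoff}",
    }]
    return chat(implementer_messages)
```

Notice that the Implementer's message list starts at length one no matter how long the Explorer's conversation ran - that is the reset.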
Think of it like the difference between:

- A meeting where you discuss something for two hours, exploring every wrong answer before finding the right one, then trying to make decisions while mentally replaying the entire rambling conversation
- A meeting where someone walks in with a clear one-page summary of the decision and the relevant context

The second meeting produces better outcomes because everyone's working from filtered, relevant information.

## A Concrete Example

Suppose you're adding export functionality to a data platform. Your existing system has complex permission rules, multiple data formats, and several edge cases around sensitive information.

The Explorer would:

- Examine your permission system to understand access control patterns
- Review existing export code (if any) to maintain consistency
- Identify where sensitive data filtering happens
- Ask about specific requirements: file format preferences, size limits, async vs. sync processing
- Investigate how similar features handle errors and logging
- Produce a plan that covers the data pipeline, permission checks, format conversion, and error handling

The handoff document might note: "The existing report generation in `services/reports.py` follows an async pattern using Celery tasks. This might be worth considering for exports, since large datasets could time out on synchronous requests. Permission checks happen at the service layer, not the API layer - see how `ReportService.get_report()` handles this. The data sanitization utilities in `utils/sanitize.py` are used consistently across the codebase for removing sensitive fields."

The Implementer reads this, examines the relevant files, and builds an export feature that naturally fits your architecture. It uses Celery because that's your pattern. It checks permissions at the service layer because that's where you do it. It uses your existing sanitization utilities because those exist.
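A compressed sketch of what such an implementation might look like - every name here (`ExportService`, `sanitize_record`, `has_permission`, `SENSITIVE_FIELDS`) is an illustrative stand-in for the patterns described above, not real code from any codebase:

```python
# Illustrative stand-ins for the patterns the handoff document describes:
# service-layer permission checks and shared sanitization utilities.

SENSITIVE_FIELDS = {"ssn", "api_key"}

def sanitize_record(record: dict) -> dict:
    # Stand-in for the shared utilities in utils/sanitize.py:
    # strip sensitive fields consistently, everywhere.
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}

def has_permission(user: dict, action: str) -> bool:
    return action in user.get("permissions", set())

class ExportService:
    def export(self, user: dict, records: list[dict]) -> list[dict]:
        # Permission check at the service layer, not the API layer,
        # mirroring how ReportService.get_report() is said to work.
        if not has_permission(user, "export"):
            raise PermissionError("user may not export data")
        # In the real feature this would likely be dispatched as an async
        # task (e.g. Celery) for large datasets; synchronous here for brevity.
        return [sanitize_record(r) for r in records]
```

What matters is the placement of each concern - permissions in the service, sanitization through the shared utility - because that is what makes the feature feel native.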
The alternative - having a single agent do everything - often results in exports that don't quite match your patterns. Maybe it checks permissions at the wrong layer. Maybe it uses synchronous processing when you've standardized on async. These aren't catastrophic failures, but they create the kind of inconsistency that makes codebases harder to maintain.

## When to Use This Pattern

Not every task needs this ceremony. Adding a simple utility function? Fixing an obvious bug? A single agent works fine.

This pattern pays dividends when:

- The feature touches multiple parts of your system
- Integration matters more than the individual pieces
- You need the new code to feel native to the existing codebase
- The problem requires real understanding of your architecture
- You've found yourself repeatedly explaining context to the assistant

As a rough heuristic: if, on a human team, you'd want a senior engineer to review the design before implementation, use two agents.

## Common Variations

Some people use this pattern with even more specialization:

- An Architect agent that only reviews designs
- A Reviewer agent that critiques implementations
- A Documenter agent that writes end-user documentation

The underlying principle remains the same: keep each agent's context clean and focused on its specific role.

## The Hidden Benefit

Beyond the obvious quality improvements, this pattern has a subtler advantage: it forces you to think clearly about what you're building.

Writing a good handoff document requires understanding the feature yourself. When the Explorer asks clarifying questions, you can't hand-wave - you have to provide real answers. The transition point becomes a natural checkpoint for asking "do we actually know what we're building?"

Many bugs get prevented not because the AI wrote better code, but because this process caught ambiguities before they reached implementation.
## Practical Notes

A few details that matter in practice:

- Start the Explorer's instructions with your problem framing and relevant directory suggestions. Don't make it search your entire repository.
- When the Explorer asks questions, take them seriously even if they seem obvious. Sometimes obvious questions expose subtle inconsistencies in requirements.
- The handoff document doesn't need to be long. Three or four paragraphs often suffice. The goal is clarity, not completeness.
- If the Implementer asks extensive clarification questions about the handoff, that's feedback - the Explorer missed something. Consider refining the exploration process.
- Your coding style preferences (formatting, documentation standards, naming conventions) should be in the handoff. The Implementer won't inherit them through osmosis.

## Limitations

This approach doesn't solve everything. The Implementer can still write bugs. Integration might surface unexpected issues. Code reviews remain necessary.

What it does solve is the context contamination problem - and that alone makes complex features dramatically more reliable to build with AI assistance.

## The Broader Pattern

Once you see this pattern, you'll notice it applies beyond coding. Any complex task with an AI assistant benefits from similar structure: a research phase that produces clean documentation, followed by an execution phase that works from that documentation.

The specific implementation varies by domain, but the principle holds: cognitive clarity beats raw capability. A moderately capable model working from clean context often outperforms a more powerful model drowning in conversational debris.

---

*(Written by Human, improved using AI where applicable.)*