KognitaKognita.

Blog

Claude Code Subagents Start From Zero Context — That Is the Whole Problem

10 min read

Subagents are sold as parallelism: fan work out, get more done at once. The detail that gets lost in the pitch is how little each subagent actually knows. As one walkthrough put it, "the subagent starts fresh — it has no memory of what you've been doing, no context from your main session, no inherited instructions unless you explicitly pass them." It gets a fresh, isolated context window and a one-line brief, and that is it. The parallelism is real. So is the cost: every subagent re-derives the context you already built.

What a subagent actually inherits

By design, a Claude Code subagent does not see your conversation. It starts clean — which is great for keeping the main context tidy and terrible for continuity:

A fresh window plus a sentence
What a Claude Code subagent inherits

  Gets:
    -> its own fresh, isolated context window
    -> a one-line delegation message Claude writes for it
    -> the CLAUDE.md hierarchy + a git status snapshot

  Does NOT get:
    -> your conversation history
    -> the files the main agent already read
    -> the bug you're actually chasing
    -> the corrections you made three messages ago

  It starts from zero, plus a sentence.

The isolation is deliberate — it is how the architecture stops a noisy task from flooding your main window. But it means the subagent does not know why it was called. It has the what ("review this file") and none of the context that makes the what meaningful.

The round trip loses information twice

Context is lost on the way in and again on the way out:

The why is dropped going in; the detail is compressed coming out
The round trip is lossy both ways

  IN:   you're refactoring auth because tokens expire early
        -> Claude delegates: "review auth.ts for issues"
        -> the WHY (the expiry bug) never reaches the subagent

  OUT:  subagent reads 6,100 tokens of files, does the work
        -> returns a 420-token summary
        -> the edge case it noticed, the file it flagged —
           compressed away before it reaches your session

A subagent might read thousands of tokens of files and return a few hundred — and the thing you most needed, the edge case it noticed in passing, can be exactly what the summary drops. This is context loss by architecture: the compression that protects your main window is also where the details that matter leak out.

The orchestrator becomes a context courier

The implied fix is "pass more in the brief." But now the main agent is responsible for hand-carrying every relevant fact into every subagent — the file paths, the conventions, the gotcha from earlier — and it does this from its own imperfect, compacting memory of the session. Subagents cannot nest, so you cannot push the problem down a level. Each delegation is a fresh chance to forget the one detail that mattered. You have not eliminated the context problem; you have multiplied it by the number of agents.

Shared grounding beats hand-passed briefs

The reason every subagent has to re-derive context is that the context lives in the conversation — ephemeral, per-session, un-shareable. Move the codebase knowledge out of the conversation and into a shared layer, and the briefs stop having to carry it:

Re-derive per hop vs. retrieve from one source
Hand-passed briefs vs. shared grounding

  Per-subagent briefs (today):
    -> context re-derived from scratch on every hop
    -> each agent re-reads, re-discovers, re-guesses
    -> the orchestrator must remember to pass everything

  Shared semantic index (Kognita):
    -> every agent retrieves the same current codebase truth
    -> no re-derivation; the index is already there
    -> the brief can be short because grounding isn't in it

When every agent can retrieve the same current representation of the codebase on demand, a subagent starting "fresh" is no longer starting blind — it pulls the grounding it needs the moment it needs it. The orchestrator's brief can be a genuine task description instead of a desperate attempt to pre-load a context window. This is the same argument as agents needing shared memory, not just context windows.

Where Kognita fits

Kognita maintains a semantic index of your repositories that any agent — main or sub — can query through MCP. The codebase truth lives in one shared, always-current place instead of being trapped in a single conversation that subagents cannot see. A subagent does not need to inherit your history to be useful; it needs to be able to ask the same questions about the same code and get the same current answers. That turns fan-out from "N agents each guessing in isolation" into "N agents grounded in one source."

Final take

Subagents start from zero context because their isolation is the feature. That is fine for keeping windows clean and costly for getting work done, because the thing they lack — knowledge of the codebase and the task — is exactly what they need, and the only channel to deliver it is a brief the orchestrator has to remember to write.

If every agent has to re-derive context, parallelism just parallelizes the guessing. Put the codebase truth in a shared layer and a fresh subagent can be grounded from its first token.