Kognita

Blog

Software Architects Can't Enforce Standards They Can't See

You wrote the architecture decision record. You ran the design review. You got agreement on the service boundaries, the event-driven communication patterns, the rules around shared vs. duplicated logic. Three months later, you are reviewing a post-mortem and notice that four services are calling each other directly in ways that explicitly violate the pattern you established. Nobody decided to violate it. Each PR looked reasonable in isolation. The pattern simply drifted.

This is the core problem in architectural governance, and AI-accelerated development has made it dramatically harder to manage. Before AI coding tools, a principal engineer who reviewed ten PRs per day on critical paths could maintain a reasonable approximation of the system state in their working memory. The velocity of code production was bounded by human capacity, and the same humans who reviewed PRs could track patterns across them. At AI-accelerated velocity, that model collapses. Twenty PRs merge while you are reviewing ten. The mental model you are maintaining has a forty-eight-hour lag at minimum, and in that lag, architectural drift compounds.

What architects are actually trying to prevent

Architectural governance is not about enforcing style preferences. It is about preventing the classes of problems that are expensive to fix once they are structural: circular dependencies between services that make independent deployment impossible, duplicated logic that diverges subtly and causes inconsistent behavior, boundary violations that create hidden coupling that surfaces as cascading failures during incidents, and pattern proliferation that turns every change into an archaeology project.
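Of the problem classes above, circular dependencies are the easiest to check mechanically. The following is a minimal sketch, assuming the service dependency graph is already available as a plain dictionary; the service names and the graph itself are hypothetical, and a real system would have to derive the graph from code or traffic data.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of services, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path loops back to it
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical graph: each service maps to the services it depends on.
services = {
    "OrderService": ["PaymentService"],
    "PaymentService": ["UserService"],
    "UserService": ["OrderService"],
    "NotificationService": [],
}
print(find_cycle(services))
```

Any cycle this returns is a set of services that can no longer be deployed independently of one another.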

These problems share a common characteristic: they are cheap to fix when they are introduced and expensive to fix after they have been built on. A direct service call that violates the defined event-driven pattern costs two hours to remediate the week it is introduced. After six other services have adopted the same pattern because AI sessions surfaced it as an existing codebase convention, the remediation is a cross-team project measured in weeks.

Before AI coding tools, architects relied on the natural velocity constraints of human development to keep up with drift. A team of fifteen engineers that merges six PRs per day in total produces roughly thirty PRs per week. A principal engineer reviewing on critical paths can achieve meaningful coverage at that volume: not exhaustive coverage, but enough to catch the patterns that are compounding in the wrong direction. At sixty PRs per week, coverage falls. At a hundred and twenty, it collapses entirely.

Why AI-accelerated development makes architectural governance a velocity problem

AI coding tools increase the rate of code production without increasing the rate of architectural review. This is not a criticism of the tools — increasing developer throughput is their purpose. But throughput without proportional oversight is how architectural debt compounds.

There is a second, subtler problem. AI coding sessions pick up patterns from the codebase they have access to. When a developer asks Cursor or Claude Code "how do I call another service from here," the AI surfaces existing patterns from the indexed codebase. If the most recent and most visible pattern in that index is a direct service call, because PaymentService introduced one last month, the AI recommends that pattern. The developer follows the recommendation. The pattern spreads.

This is architectural drift with an amplification mechanism built in. Before AI, drift spread at the pace of human engineers reading each other's code and choosing to copy patterns. With AI, drift spreads at the pace of AI sessions recommending patterns based on recency and frequency in the local index. An incorrect pattern introduced today can become an AI-recommended convention within a month. By the time the principal engineer reviews the third service that adopted it, it has been in the codebase long enough that "this is how we do it here" is technically accurate, even if it was never intended to be.

The three types of drift that compound fastest

Pattern drift

Pattern drift is the proliferation of distinct implementations of the same conceptual concern. One authentication middleware becomes four. One error handling pattern becomes seven. One approach to database transactions becomes three, each with subtly different behavior under concurrent load. The problem is not that any individual implementation is wrong — each was introduced by a competent engineer solving a real problem. The problem is that each subsequent change that touches this concern must either pick one of the four implementations, introduce a fifth, or spend time understanding why four exist before making a decision. The complexity tax compounds with each new implementation.

Pattern drift is especially fast with AI tools because AI sessions have no cross-session memory. A developer using Cursor today asks how auth token validation works. Their session surfaces the implementation in the files they have open. A different developer using the same tool next week asks the same question and gets a different answer because their local context includes different files. Both are confident they are following existing practice. Both may be creating or extending a divergent pattern.
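Pattern drift becomes measurable once each implementation is tagged with the concern it addresses. The sketch below assumes such tagging already exists as a list of `(concern, implementation, service)` records; the record format and every name in it are hypothetical, and producing these records is the hard part that this example does not attempt.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, str]  # (concern, implementation, service)


def variants_by_concern(records: Iterable[Record]) -> Dict[str, Set[str]]:
    """Group the distinct implementations found for each concern."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for concern, implementation, _service in records:
        grouped[concern].add(implementation)
    return dict(grouped)


def drifting_concerns(records: Iterable[Record], max_variants: int = 1) -> Dict[str, List[str]]:
    """Return the concerns that have more implementations than allowed."""
    return {
        concern: sorted(variants)
        for concern, variants in variants_by_concern(records).items()
        if len(variants) > max_variants
    }


# Hypothetical inventory of where each concern is implemented.
inventory = [
    ("auth_token_validation", "middleware.verify_jwt", "PaymentService"),
    ("auth_token_validation", "auth.check_token", "OrderService"),
    ("auth_token_validation", "middleware.verify_jwt", "UserService"),
    ("error_handling", "errors.wrap", "OrderService"),
]
print(drifting_concerns(inventory))
```

Here `auth_token_validation` is reported because two distinct implementations exist, while `error_handling` has one and is not.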

Boundary violations

Service boundaries encode dependency management decisions. The rule "UserService is the only service that reads the users table directly" is not arbitrary: it ensures that changes to the user data model have a single point of modification, a single owner, and a single test surface. When AnalyticsService adds a direct database call to the users table because it is faster than going through the UserService API, the boundary is violated. The violation looks innocuous in the PR. Its consequences surface when the users table schema needs to change and the engineer responsible discovers there are now three other services making assumptions about its structure.

AI coding tools introduce boundary violations at the suggestion level. When a developer asks for the fastest way to get user data, the AI session does not know whether a direct database call violates a defined boundary — that architectural decision lives in an ADR document and in the principal engineer's head, not in the code itself. The AI gives the technically correct answer to the question asked. The boundary violation is invisible to both the developer and their AI session.

Shared vs. duplicated logic

Deciding what to share and what to duplicate is one of the more context-sensitive architectural judgments. Shared utilities create coupling; duplication creates divergence. The right answer depends on how likely the logic is to change, how independently the consuming services need to evolve, and whether the abstraction is stable enough to be worth the coordination overhead of sharing.

At AI-accelerated velocity, duplication happens because AI sessions generate utilities locally rather than discovering existing ones across the codebase. A developer asks Cursor to write a currency formatting function. Cursor generates one. The same request made by a different developer in a different service generates a functionally identical but distinct implementation. Neither developer knew the other implementation existed. The code is not wrong. The duplication is not immediately harmful. But when a currency regulation change requires updating the formatting logic, the change touches every service that holds a copy instead of one shared utility, and those copies may have diverged subtly in the intervening months.
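One rough way to surface this kind of duplication is to fingerprint functions by structure rather than by name. The sketch below uses Python's standard `ast` module and assumes the utilities are Python source; the service names and function bodies are hypothetical. It only catches copies that differ in function name, comments, or formatting. Copies with renamed variables or different docstrings would need a more tolerant comparison.

```python
import ast
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def function_fingerprints(source: str) -> Dict[str, str]:
    """Map each top-level function name to a hash of its syntax tree."""
    fingerprints: Dict[str, str] = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            original_name = node.name
            node.name = "_"  # ignore the function's own name when comparing
            # ast.dump omits line numbers by default; comments never reach the tree
            digest = hashlib.sha256(ast.dump(node).encode("utf-8")).hexdigest()
            fingerprints[original_name] = digest
    return fingerprints


def find_duplicates(sources: Dict[str, str]) -> List[List[Tuple[str, str]]]:
    """Return groups of (service, function) pairs that share a fingerprint."""
    by_fingerprint: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for service, source in sources.items():
        for name, digest in function_fingerprints(source).items():
            by_fingerprint[digest].append((service, name))
    return [group for group in by_fingerprint.values() if len(group) > 1]


# Hypothetical utilities generated independently in two services.
sources = {
    "BillingService": '''
def format_currency(amount, symbol):
    # billing copy
    return f"{symbol}{amount:,.2f}"
''',
    "OrderService": '''
def fmt_money(amount, symbol):
    return f"{symbol}{amount:,.2f}"

def order_total(items):
    return sum(items)
''',
}
print(find_duplicates(sources))
```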

Why PR review does not scale as the primary enforcement mechanism

PR review is the mechanism architects rely on to catch drift before it merges. The problem is that PR review is designed to evaluate the correctness of one change in isolation. Architectural drift is a cross-change, cross-time phenomenon. No single PR introduces drift — drift is what happens across fifty PRs over three months, none of which was individually alarming.

The cognitive load problem is also real. A reviewer approving a PR has the diff in their working memory, the immediate context of the PR description, and perhaps the last few PRs they reviewed the same day. They do not have a real-time map of where every pattern exists across the codebase, when each pattern was introduced, and which patterns are diverging. Maintaining that map manually requires a volume of reading that is incompatible with a full engineering workload.

There is also a timing problem. Architectural drift is most cheaply fixed at introduction. By the time it surfaces in a PR review, assuming the reviewer catches it at all, the PR has already been written, reviewed for correctness, and may have dependent work in flight. The friction of reversing it at that stage is higher than preventing it would have been. And a reviewer who catches a problematic pattern in one PR is not catching the instances that merged in other PRs while they were looking elsewhere.

How architectural drift compounds over 6 months of AI-accelerated development:

  MONTH 1
  -> PaymentService begins calling UserService directly for billing address
     (the defined pattern was an async event; no one noticed in PR review)
  -> 2 new auth middleware implementations introduced — team now has 3

  MONTH 2
  -> 4 more services adopt the direct-call pattern after seeing PaymentService
     (AI sessions surfaced it as an existing pattern in the codebase)
  -> NotificationService duplicates the email templating logic from
     MessageService — undetected; both authors used AI, neither knew the other existed

  MONTH 3
  -> Direct-call pattern is now in 7 services; it has become de facto standard
  -> 3 auth middleware implementations — 2 new engineers joined and each
     asked their AI session how auth is done; got different answers
  -> DataPipeline begins bypassing the defined schema validation layer
     "just for internal events" — AI suggested it as a performance optimization

  MONTH 4-5
  -> Architecture review catches the direct-call pattern — remediation
     scoped at 3 weeks of work across 4 teams
  -> Second email templating divergence found — 3 implementations now exist
  -> Auth middleware inconsistency causes a security audit finding

  MONTH 6
  -> 4 auth implementations; the original is now the minority
  -> DataPipeline bypass copied into 3 other internal event flows
  -> Estimated remediation cost: 11 engineer-weeks
  -> If caught at Month 1: 2 engineer-hours

The timeline above represents a real compounding dynamic, not a worst-case scenario. The pattern starts with one boundary violation that goes unnoticed. It spreads because AI sessions surface it as an existing pattern. By the time it is caught, it is in seven services and requires cross-team remediation. The cost difference between catching it at introduction and catching it at month four is measured in engineer-weeks.
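To put the two Month 6 figures on the same scale, the short calculation below converts engineer-weeks to hours. The 40-hour engineer-week is an assumption made for this illustration; the timeline gives only the 11 engineer-weeks and 2 engineer-hours figures.

```python
HOURS_PER_ENGINEER_WEEK = 40  # assumption for illustration, not from the timeline


def remediation_ratio(late_weeks: float, early_hours: float,
                      hours_per_week: float = HOURS_PER_ENGINEER_WEEK) -> float:
    """How many times more expensive late remediation is than early remediation."""
    return (late_weeks * hours_per_week) / early_hours


ratio = remediation_ratio(late_weeks=11, early_hours=2)
print(f"Late remediation costs about {ratio:.0f}x the early fix")
```

Under that assumption, 11 engineer-weeks is 440 engineer-hours, or 220 times the cost of the two-hour fix at Month 1.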

What semantic system visibility gives architects that code review does not

Semantic system visibility is a continuously maintained map of how the system actually looks — the call graph between services, the pattern frequency per concern, the duplication across boundaries, and the divergence between architectural intent and current implementation. It is not a static architectural diagram. It is a living index updated as the codebase evolves.

The critical difference from PR review is that semantic visibility is cross-change and cross-time. An architect who can query "show me all services that have introduced direct database calls outside their defined ownership boundary in the last ninety days" is doing architectural governance at the system level, not the PR level. They are catching the class of problem, not hoping to catch each instance as it arrives in their review queue.
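The ninety-day query above can be pictured as a filter over access records. This is a minimal sketch of that idea, not a description of how any particular product implements it. It assumes each observed table access is already recorded with the service that performs it and the date it was introduced; the `TableAccess` type, the ownership map, and all dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TableAccess:
    service: str      # service whose code reads the table directly
    table: str
    introduced: date  # when this access first appeared


def recent_boundary_violations(accesses: Iterable[TableAccess],
                               owners: Dict[str, str],
                               today: date,
                               window_days: int = 90) -> List[TableAccess]:
    """Direct accesses by non-owners, introduced within the window, oldest first."""
    cutoff = today - timedelta(days=window_days)
    hits = [
        access for access in accesses
        if access.table in owners
        and access.service != owners[access.table]
        and access.introduced >= cutoff
    ]
    return sorted(hits, key=lambda access: access.introduced)


# Hypothetical ownership rule and observed accesses.
owners = {"users": "UserService"}
accesses = [
    TableAccess("UserService", "users", date(2023, 1, 10)),        # owner, allowed
    TableAccess("ReportingService", "users", date(2023, 11, 2)),   # violation, outside window
    TableAccess("AnalyticsService", "users", date(2024, 5, 20)),   # violation, inside window
]
for hit in recent_boundary_violations(accesses, owners, today=date(2024, 6, 30)):
    print(hit.service, hit.table, hit.introduced)
```

Widening `window_days` would also return the older ReportingService access, which is the difference between asking what drifted recently and asking what has drifted at all.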

Connecting this visibility to the work management system adds another layer. When an architect can see that a Jira epic introduced three boundary violations and five new pattern variants, they can have a conversation about the epic's delivery — not just flag individual PRs. The relationship between intent and implementation becomes queryable. "What was the API redesign epic supposed to do, and what did it actually change in the service graph?" is an answerable question when work tickets and codebase state are connected.

What is invisible in PR review but visible in a semantic system map:

  PATTERN DRIFT (invisible in PR review)
  -> A PR introduces one new auth implementation. Looks reasonable in isolation.
     What is invisible: it is the fourth distinct implementation. The reviewer
     does not have the full system in their working memory while reviewing
     one diff.

  BOUNDARY VIOLATIONS (invisible in PR review)
  -> A PR adds a direct database call from ServiceA into ServiceB's schema.
     The PR adds a comment: "temporary, will refactor." Reviews pass.
     What is invisible: the same comment appeared in 6 other PRs this year.
     None of them were refactored.

  SHARED VS. DUPLICATED (invisible in PR review)
  -> A PR introduces a new currency formatting utility. Tests pass, code is clean.
     What is invisible: a functionally identical utility exists in 2 other
     services. Future behavior divergence is now structural.

  ARCHITECTURAL INTENT GAPS (invisible in PR review)
  -> A PR restructures how OrderService computes shipping cost.
     The reviewer checks correctness. They do not check whether this
     computation was intentionally kept inside OrderService to preserve
     a specific transactional boundary defined 18 months ago.

  VISIBLE IN A SEMANTIC SYSTEM MAP
  -> All four patterns above are detectable in a system-level semantic index:
     cross-service call graph, pattern frequency per concern, utility duplication
     across boundaries, and transactional boundary provenance.

The four blind spots listed above (pattern drift, boundary violations, duplicated logic, and architectural intent gaps) share a common property: they are only visible in the context of the whole system. Each individual PR that contributes to them passes a reasonable review. It is only when you can see the pattern across all fifty related PRs that the problem becomes apparent. Semantic system visibility provides that cross-PR, cross-service context as a queryable interface rather than as something an architect must reconstruct manually from memory and periodic code reads.

Final take

Architectural governance enforced exclusively through PR review was always imperfect. At AI-accelerated development velocity, it is no longer viable as a primary mechanism. The velocity that makes AI tools valuable — the twenty PRs per day that would have been eight — is also the velocity that makes manual drift detection impractical. The two problems are the same problem.

The architects who are managing this well are not doing more code review. They are building infrastructure that makes drift detectable at the system level rather than at the PR level. The specific patterns they define — event bus over direct calls, service ownership of data, no utility duplication across service boundaries — become queryable constraints that surface violations automatically rather than requiring a human to catch them while approving a diff.
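As an illustration of what a queryable constraint might look like, the sketch below encodes the "event bus over direct calls" rule as a check over recorded service-to-service calls. It assumes each call is already labeled with its transport; the `ServiceCall` type, the transport labels, and the example calls are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ServiceCall:
    caller: str
    callee: str
    transport: str  # hypothetical labels: "event_bus" or "direct"


def direct_call_violations(calls: Iterable[ServiceCall]) -> Dict[str, List[str]]:
    """Map each offending caller to the services it calls without the event bus."""
    violations: Dict[str, List[str]] = defaultdict(list)
    for call in calls:
        # Calls within a single service are not cross-boundary, so they are exempt.
        if call.caller != call.callee and call.transport != "event_bus":
            violations[call.caller].append(call.callee)
    return dict(violations)


# Hypothetical call records.
calls = [
    ServiceCall("PaymentService", "UserService", "direct"),
    ServiceCall("OrderService", "PaymentService", "event_bus"),
    ServiceCall("AnalyticsService", "UserService", "direct"),
]
print(direct_call_violations(calls))
```

A check of this shape can run on every merge, so the rule is evaluated against the whole call graph rather than against one diff at a time.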

Architectural review: semantic query vs. traditional process

  TRADITIONAL PROCESS (principal engineer's weekly review)
  -> Read PRs on critical paths (10/day max with other responsibilities)
  -> Maintain a mental model of where patterns are used across the system
  -> Spot-check services during incident post-mortems
  -> Flag architectural concerns in async Slack threads
  -> Result: catches ~30% of drift; the rest accumulates until it is structural

  KOGNITA SEMANTIC QUERY (continuous, no manual work)

  "Show all services that call other services directly rather than
   through the defined event bus, introduced in the last 90 days"

  -> Returns: 7 services, 14 direct calls, first introduced 68 days ago
     by PaymentService (PR #4821). Callers by recency.

  "Show all implementations of auth token validation across the codebase"

  -> Returns: 4 distinct implementations with call sites, last modified dates,
     and the Jira tickets that introduced each one. Earliest is the canonical
     implementation per ADR-2023-11.

  "Which services access the users table outside of UserService?"

  -> Returns: 3 services with direct access, 2 introduced in the last
     60 days. Linked to the PRs and Jira epics that introduced them.

  -> Result: catches drift at introduction, not at incident

The shift from "review PRs and hope to catch drift" to "query the system state and catch classes of drift" changes the economics of architectural governance at AI-accelerated velocity. Kognita provides the semantic index that makes that shift possible — maintained automatically, connected to work management context, queryable in plain language without requiring a developer to read thousands of lines of code to answer basic questions about system state. Architects define the standards. The system surfaces when reality has diverged from them.