What Cursor and Claude Code Actually Change — And What They Don't

A well-scoped ticket that would have taken a senior engineer three days now takes four hours with Cursor and Claude Code. That number is not hypothetical — engineering managers who have measured it report similar results across team sizes and tech stacks. The implementation speed increase is real, significant, and continuing to compound as the tools improve. What doesn't change is what the Engineering Manager, Product Owner, or CTO does with that fact.

There is a precise and important distinction between what these tools accelerate and what they leave completely untouched. Engineering managers who internalize it make better adoption decisions. Those who don't internalize it end up shipping the wrong things faster: built more thoroughly than ever before, carefully tested against incorrect acceptance criteria, and well documented by a tool that had no idea the spec was wrong.

What Cursor actually does

Cursor is an AI-first editor built on VS Code. It has inline autocomplete that's substantially more context-aware than GitHub Copilot, a chat interface that can reference specific files, and an agent mode that can plan and execute multi-step tasks while staying inside the editor. For an engineer, it means staying in flow longer: less switching to documentation, fewer dead-end searches, faster scaffolding of repetitive structures. The time-to-first-implementation on a clearly scoped ticket shrinks dramatically. So does the cost of refactoring, because the editor can draw on context from across the codebase rather than from a single open file.

What Cursor doesn't do is read the Jira ticket the way the product owner meant it, verify that the implementation matches the intent rather than the literal words, or flag when an engineer has interpreted an acceptance criterion in a way that will surprise the stakeholder who wrote it. Those are human judgment tasks. Cursor accelerates the work that follows the judgment, not the judgment itself.

What Claude Code actually does

Claude Code is different in kind, not just degree. It operates in the terminal, not the editor, and it takes an agentic approach: given a goal, it will plan a sequence of steps, read the relevant files, write code across multiple services, run tests, handle errors, and iterate. It's closer to a junior engineer who can execute a spec without supervision than to an autocomplete tool. Engineering teams are using it for substantial, multi-file work (refactors, feature implementations, test suite generation) that previously required sustained human attention across multiple sessions.

The output is high-quality in the technical sense: it follows existing patterns, handles edge cases, and writes tests. What it cannot do is verify that the goal it was given was the right goal. If the ticket was underspecified, Claude Code will implement something coherent and well-executed that satisfies the letter of the spec and misses the spirit. It will do so thoroughly.

The implementation layer vs. the verification layer

What Cursor and Claude Code actually change vs. what stays the same:

  WHAT CHANGES
  -> Time to first implementation (hours instead of days for well-scoped tickets)
  -> Boilerplate and scaffolding speed (near-instant)
  -> Multi-file refactors (Claude Code can plan and execute across services)
  -> Test generation speed (significantly faster for happy-path coverage)
  -> Developer context-switching cost (Cursor keeps context in the editor)

  WHAT STAYS THE SAME
  -> Whether the acceptance criteria were correct before work started
  -> Whether the implementation matches what the product owner specified
  -> Whether the product owner can recognize what was built at the sprint demo
  -> Whether the right thing was built (not just whether the thing was built right)
  -> Stakeholder visibility into what is in progress vs. shipped
  -> The verification step before a ticket moves to "done"
  -> Customer-facing quality bar and QA requirements

The first list in that breakdown is genuinely impressive, and engineering managers are right to be excited about it. But the second list is where most product and delivery risk actually lives. Whether the acceptance criteria were correct before work started is a product problem. Whether the implementation matches what the product owner intended is a verification problem. Whether stakeholders have visibility into what's in flight is a communication and tooling problem. None of those problems get faster to solve when the code gets faster to write.

In fact, they get harder. When a ticket closes in four hours instead of three days, the verification window is proportionally compressed. The sprint demo still happens every two weeks. The backlog still gets reviewed on whatever cadence the team has established. The planning and verification layer has to move with the execution layer — and most organizations haven't made that adjustment yet.
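A rough sketch of that compression, in Python. The eight-hour day, the ten-working-day sprint, and the idea that one engineer spends the whole sprint implementing are all simplifying assumptions made for illustration, so the results are upper bounds rather than measurements.

```python
HOURS_PER_DAY = 8          # assumption: eight-hour working day
SPRINT_WORKING_DAYS = 10   # assumption: two-week sprint, five working days per week


def tickets_closed_per_sprint(hours_per_ticket):
    """Upper bound on tickets one engineer can close in a sprint."""
    sprint_hours = SPRINT_WORKING_DAYS * HOURS_PER_DAY
    return sprint_hours // hours_per_ticket


before = tickets_closed_per_sprint(3 * HOURS_PER_DAY)  # three days per ticket
after = tickets_closed_per_sprint(4)                   # four hours per ticket

print(f"Tickets per engineer awaiting acceptance at the demo: {before} before, {after} after")
```

Under these assumptions the same two-week demo has to absorb several times as many finished tickets per engineer, while the time available to review them stays the same.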

The "faster wrong" problem

The "faster wrong" compounding math — what happens when a missed requirement ships at AI speed:

  Human-speed baseline (before AI tools)
  -> Engineer implements misread acceptance criteria: 3 days of work
  -> Discovery at sprint demo: 1 day delay before rework starts
  -> Rework and re-review: 2 days
  -> Total cost of one missed requirement: ~6 days

  AI-speed scenario (Cursor + Claude Code)
  -> Engineer + agent implements misread acceptance criteria: 4 hours
  -> Implementation is thorough — tests written, edge cases handled
  -> Two more related tickets reference and build on this work in same sprint
  -> Discovery at sprint demo: 1 day delay (same as before — humans still review)
  -> Rework now touches three interconnected implementations: 4-5 days
  -> Total cost of same missed requirement: ~5.5-6.5 days, including 4-5 days of rework on 4 hours of original work

  The pattern
  -> Speed multiplier applies to implementation, not discovery
  -> Discovery latency stays constant (sprint demo cadence hasn't changed)
  -> Wrong things get built more thoroughly before anyone notices
  -> Compounding is the real risk — agents build on each other's output

This is the dynamic that most adoption conversations miss. The speed multiplier is real, but it applies asymmetrically. Implementation accelerates severalfold: three days down to four hours is roughly 6x, assuming eight-hour working days. Discovery of misalignment stays fixed to the sprint demo cadence. Rework now operates on a more deeply interconnected codebase, because related work was built on top of the misread requirement in the same sprint. The cost of building the wrong thing doesn't decrease with AI adoption. Under some conditions it increases, because the wrong thing is built further and more completely before anyone checks.
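The arithmetic behind the two scenarios can be reproduced in a few lines of Python. This is a minimal sketch that uses only the figures from the breakdown above, plus one assumption: a working day is eight hours, so four hours counts as 0.5 days.

```python
HOURS_PER_DAY = 8  # assumption: one working day is eight hours


def total_cost_days(implementation_days, discovery_delay_days, rework_days):
    """Sum the three phases of one missed requirement, in working days."""
    return implementation_days + discovery_delay_days + rework_days


# Human-speed baseline: 3 days to implement, 1 day to discover, 2 days to rework.
baseline = total_cost_days(3, 1, 2)

# AI-speed scenario: 4 hours to implement, the same 1-day discovery delay,
# and 4 to 5 days of rework across three interconnected implementations.
ai_implementation = 4 / HOURS_PER_DAY
ai_low = total_cost_days(ai_implementation, 1, 4)
ai_high = total_cost_days(ai_implementation, 1, 5)

# Rework measured against the original implementation effort.
baseline_rework_ratio = 2 / 3
ai_rework_ratio_low = 4 / ai_implementation
ai_rework_ratio_high = 5 / ai_implementation

print(f"Baseline total: {baseline} days")
print(f"AI-speed total: {ai_low} to {ai_high} days")
print(
    f"Rework per day of implementation: {baseline_rework_ratio:.2f}x "
    f"vs {ai_rework_ratio_low:.0f}x to {ai_rework_ratio_high:.0f}x"
)
```

The totals come out close to each other (6 days against about 5.5 to 6.5 days), which is the point: the implementation got much cheaper, but the cost of the mistake did not. What changes sharply is the ratio of rework to original effort.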

For a VP Engineering or CTO who is evaluating or has already adopted these tools, this should change how they think about the verification side of the delivery system — not just the generation side. The velocity gains are real. So is the need to close the verification loop faster.

What Engineering Managers get right and wrong about the tradeoff

Engineering Managers who see Cursor and Claude Code clearly understand that the tools change the implementation layer, not the requirements layer. They adjust sprint planning to account for higher throughput. They watch for scope expansion from agents building beyond what was specified. They keep an eye on whether tests are testing the right behavior, not just producing passing coverage.

What they often don't control is the product side of the verification loop. That's a product owner problem — and it's getting harder, not easier. When adoption doesn't translate into business outcomes, the missing piece is usually not more generation speed. It's a verification layer that kept up with generation speed. Product owners accepting tickets at sprint demos they can't actually evaluate, against acceptance criteria they wrote weeks ago for work that shipped in hours, against a codebase they can't read — that gap compounds every sprint.

Closing the verification gap

The verification layer that needs to keep up with AI-speed implementation is a product and non-technical stakeholder problem. Engineering managers can't solve it unilaterally. Scrum Masters can't solve it with more ceremonies. The solution is giving product owners the ability to see what was built — in plain language, before the sprint demo, without needing a GitHub login or a developer to explain what the diff says.

This is where Kognita enters. A product owner who can ask "does this week's implementation of PROJ-441 match the acceptance criteria I wrote?" on a Wednesday — before the sprint demo, before a customer sees it — is operating with a fundamentally different verification capability. They're not waiting to find out. They're checking on their own cadence, in their own language, against their own specification.

What a product owner can verify about what Claude Code built this sprint — using plain-language codebase queries:

  "What did the checkout service implementation add this sprint?"
  -> Returns plain-language summary of changes — without reading a PR

  "Does the implementation match the acceptance criteria in PROJ-441?"
  -> Cross-references Jira ticket spec against what actually shipped

  "What tests were added for the new payment retry logic?"
  -> Surfaces test coverage before the sprint demo, not after QA finds a gap

  "Were any other services touched when implementing PROJ-441?"
  -> Flags scope expansion from agent-driven implementation — visible before acceptance

  "What changed in the user authentication flow this sprint?"
  -> Ownership verification for product owners who need to accept the work

Kognita's plain-language query layer over the codebase and Jira lets product owners ask verification questions that previously required a developer to answer. "Were any other services touched when implementing PROJ-441?" is the question that catches scope expansion before acceptance. "Does the implementation match the acceptance criteria in the ticket?" is the question that catches misalignment before the sprint demo becomes a negotiation. These are not technical questions. They're product ownership questions — and they should be answerable by the product owner, at AI speed, not at sprint-demo speed.
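To make the scope-expansion question concrete, here is a minimal Python sketch of the underlying check. It is not Kognita's implementation. The `services/<service-name>/...` repository layout, the file paths, and the function names are all hypothetical, chosen only to illustrate the idea of comparing what a ticket was expected to touch with what actually changed.

```python
from pathlib import PurePosixPath


def services_touched(changed_paths, services_root="services"):
    """Return the names of services that contain at least one changed file.

    Assumes a hypothetical monorepo layout of services/<service-name>/...
    """
    touched = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Require at least services/<name>/<file> so stray files are ignored.
        if len(parts) >= 3 and parts[0] == services_root:
            touched.add(parts[1])
    return touched


def unexpected_services(changed_paths, expected_services):
    """Services changed by the work that the ticket did not name."""
    return services_touched(changed_paths) - set(expected_services)


# Hypothetical list of files changed while implementing PROJ-441.
changed = [
    "services/checkout/retry.py",
    "services/checkout/tests/test_retry.py",
    "services/auth/session.py",
]

flagged = unexpected_services(changed, expected_services={"checkout"})
print(sorted(flagged))
```

In this made-up example the ticket was scoped to the checkout service, so a change under the auth service is what gets flagged for the product owner to ask about before accepting the work.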

Final take

Cursor and Claude Code are genuinely powerful tools that change implementation speed in ways that are real, measurable, and continuing to improve. Engineering Managers who adopt them thoughtfully are right to be enthusiastic. The tools do what they claim to do — they make the implementation layer dramatically faster.

They do not change what acceptance criteria say, whether the product owner will recognize what was built, whether the right thing was prioritized, or whether stakeholders have visibility into what's in flight. Those are the problems that determine whether faster shipping produces better outcomes or faster accumulation of the wrong things. Every role has to adapt to the new execution speed — including the roles that own verification.

Cursor and Claude Code multiply implementation speed. They do not multiply the accuracy of requirements, the clarity of acceptance criteria, or the visibility non-technical stakeholders have into what shipped. The engineering manager who closes both loops wins. The one who only closes the generation loop ships faster into the same blind spots.