Why Sprint Demos Keep Falling Apart

The demo works. The feature does what the ticket said it would do. The acceptance criteria are met. Then the first stakeholder question comes in, the engineers look sideways at one another, someone opens a laptop, and the prepared narrative starts to unravel.

This is not a rare failure. It is the default sprint demo experience. The surface — what was built this sprint — was ready. The system context behind it was not. Scrum masters running sprint reviews know this pattern well: you can show what was built, but you cannot reliably answer how it connects to everything that was already live.

The sprint demo failure pattern: the feature works, the questions don't

Sprint reviews fail in a specific sequence. The team demos the feature. It works. Stakeholders see it working. Then the questions start — and the questions are not about whether the feature works. They are about the system around the feature.

"Does this handle the case where the customer has both the legacy billing flag and the new flag enabled?" "Is this live in the EU region yet?" "Will this affect the scheduled export that finance runs every Monday?" "What happens with accounts that have more than 100,000 records?"

Engineering built the feature. They know it meets the acceptance criteria. They do not hold the full system state in their heads — and they should not be expected to. But in the sprint review room, every unanswered question reads as unpreparedness. The follow-ups pile up. Someone says "we'll get back to you on that." Stakeholder confidence in the delivery erodes, even though the feature itself is complete and correct.

Sprint 14 bulk export demo — feature ready, questions not

Sprint 14 demo: bulk export feature for enterprise accounts

  What was shown:
  - User selects date range and account scope
  - Clicks "Export"
  - CSV downloads with order history and line items

  Follow-up questions from stakeholders:

  Q1: "Does this handle accounts with both the legacy_billing flag
       and the new_export_v2 flag enabled simultaneously?"
  Engineering: [looks at each other] "We'd have to check the flag logic."

  Q2: "Is this deployed to the EU region yet? Our Frankfurt customers
       need this before Q3."
  Engineering: [opens laptop] "Let me pull up the deployment pipeline..."

  Q3: "Will this change how the existing scheduled exports work?
       Finance runs those every Monday morning."
  Engineering: "The scheduled export is a separate service... probably
       not affected, but we'd want to confirm."

  Q4: "What happens if an account has more than 100,000 line items?
       We have three customers in that range."
  Engineering: "We tested up to 50k. We'd need to run a load test."

  Q5: "Does this respect the data residency settings for EU accounts?"
  Engineering: "Data residency is handled at the infrastructure level...
       we'd need to check with the platform team."

  Feature: shipped.
  Follow-up questions: 3 of 5 required follow-up investigation.
  Stakeholder confidence: low.

Why sprint reviews require more context than what was built this sprint

The Scrum Guide frames the sprint review as an opportunity to inspect the increment and adapt the product backlog. In practice, stakeholders treat it as the moment they get to ask all the questions that accumulated while the sprint was running. Those questions span far more than the current sprint's scope.

A product owner attending the bulk export demo is not thinking about this sprint in isolation. They are thinking about the enterprise customers who asked for this feature, the EU compliance requirements that are ongoing, the finance team that depends on the existing export pipeline, and the three customers who have data volumes that nobody tested against. Their questions are reasonable. They are the questions a person responsible for the product should be asking. The failure is not the questions — it is that the answers require context that was never assembled.

This is a context problem, not an execution problem. The team executed. The sprint worked. The feature shipped. The gap is between what was built and what stakeholders need to know about how it fits into the live system.

The three kinds of question that break every sprint demo

Sprint demo questions cluster into three categories, and all three require system-level knowledge that is not captured in the ticket or the acceptance criteria.

Integration questions ask how the new feature interacts with existing functionality. Which feature flags affect the new code path? Which other services write to or read from the data the new feature touches? What background jobs run against the same tables? These questions require tracing the new code through the existing system, across service boundaries, through event consumers and scheduled jobs that are not mentioned anywhere in the sprint ticket.

Edge case questions ask what happens at the boundaries the team did not explicitly test. What is the behavior with the largest accounts? What happens when two configuration flags are active simultaneously? What is the fallback behavior when a dependency is unavailable? Engineering tested the happy path and the explicitly-specified acceptance criteria. Stakeholders ask about the edge cases that did not make it into the ticket.
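
The "two flags active at once" question lends itself to a mechanical check. As a minimal sketch, assuming the two flag names from the bulk export example and a hypothetical list of the combinations the team actually exercised, the untested combinations can be enumerated before the review:

```python
from itertools import product

# Flag names come from the bulk export example; the list of tested
# combinations is hypothetical and would come from the team's test plan.
FLAGS = ("legacy_billing", "new_export_v2")


def untested_combinations(flags, tested):
    """Return every on/off combination of `flags` not present in `tested`."""
    all_combos = [
        dict(zip(flags, values))
        for values in product((False, True), repeat=len(flags))
    ]
    tested_keys = {tuple(sorted(combo.items())) for combo in tested}
    return [
        combo for combo in all_combos
        if tuple(sorted(combo.items())) not in tested_keys
    ]


# Hypothetical: only the "new flag on, legacy flag off" path was tested.
tested = [{"legacy_billing": False, "new_export_v2": True}]
gaps = untested_combinations(FLAGS, tested)
for combo in gaps:
    print(combo)
```

With two flags there are only four combinations, so the list is short. The number doubles with each additional flag, which is one reason these cases rarely all make it into a ticket.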

Prior sprint interaction questions ask whether this sprint's work changed anything that was shipped in a previous sprint. Does this affect the scheduled export that was built three sprints ago? Does this change the behavior of the reporting pipeline that went live last quarter? Does this touch any data that the finance dashboard reads? These questions require understanding the cumulative state of the system, not just the delta from this sprint.
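
One rough way to anticipate these questions is to compare what this sprint's change touches against what earlier sprints shipped. The sketch below assumes a hypothetical inventory of the tables and artifacts each shipped feature depends on. The feature names echo the examples above, and the resource names are invented for illustration.

```python
def overlapping_features(current_touches, shipped):
    """Map each previously shipped feature to the resources it shares
    with the current change. Features with no overlap are omitted."""
    overlaps = {}
    for feature, resources in shipped.items():
        shared = sorted(set(resources) & set(current_touches))
        if shared:
            overlaps[feature] = shared
    return overlaps


# Hypothetical inventory: resource names are illustrative only.
shipped = {
    "scheduled_export": {"orders", "line_items"},
    "reporting_pipeline": {"export_artifacts"},
    "finance_dashboard": {"invoices"},
}
current_touches = {"orders", "line_items", "export_artifacts"}

result = overlapping_features(current_touches, shipped)
print(result)
```

An overlap is not proof of a behavior change. It only marks where someone should look before the review rather than during it.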

Why engineering can't always answer in the room

The sprint review puts engineers in an uncomfortable position. They are being asked to give authoritative answers about system state that they do not hold in working memory. An engineer who built the bulk export feature knows the export logic. They do not necessarily know the EU deployment status of every dependent service, the current flag combinations active for enterprise accounts, or the data volume profile of the largest accounts.

This is not a knowledge failure. It is a system complexity reality. Modern software systems are too large for any individual to hold the full state in their head. An engineer who owns the export service does not own the data residency enforcement layer, the feature flag evaluation logic, or the deployment pipeline configuration for EU regions. Asking them to answer questions about those systems in the sprint review room is asking them to go beyond what they built this sprint.

Scrum masters and product owners face a version of the same problem from the other direction, a point explored in more depth in the discussion of how scrum masters can understand what actually got shipped. The scrum master running the review is responsible for the ceremony. They can facilitate the discussion. They cannot independently answer whether the bulk export respects EU data residency, because that requires system knowledge they do not have direct access to.

What teams prepare vs. what sprint reviews actually require

Sprint review preparation — what teams cover vs. what reviews require:

  What teams typically prepare:
  ✓ A working demo of the feature
  ✓ Acceptance criteria checklist (AC1, AC2, AC3 — all green)
  ✓ Test coverage summary
  ✓ Screenshots or recording as backup

  What sprint reviews actually require:

  Integration state:
  - Which feature flags interact with the new functionality?
  - Which prior-sprint features share the same code path?
  - What downstream services consume data that this feature modifies?

  Edge case coverage:
  - What are the data volume limits of the new feature?
  - What happens at account-tier boundaries (free vs. paid vs. enterprise)?
  - What is the behavior when two conflicting settings are active?

  Deployment and rollout state:
  - Which environments is this live in right now?
  - Is there a regional rollout in progress?
  - What is the rollback plan and how long would it take?

  Prior sprint interaction:
  - Does this change the behavior of anything shipped in the last 3 sprints?
  - Are there open bugs that overlap with this feature's scope?
  - Does this touch any data that the reporting pipeline consumes?

  Teams prepare the first list. Stakeholders ask from the second list.

What sprint review preparation with system context looks like

The sprint review does not have to be a live investigation session. The questions are predictable. Integration questions, edge case questions, and prior sprint interaction questions follow a consistent pattern across every sprint. What changes is the specific feature and the specific system context around it. If that context is assembled before the review starts, the answers are ready before the questions arrive.

Effective sprint review preparation means knowing, for each ticket being demoed: which feature flags interact with the new code path; which downstream services consume the data it produces; which prior-sprint features share the same execution path; what the deployment status is across all relevant regions; and what the known limits of the new functionality are. This information exists in the codebase, in Jira, and in the deployment pipeline. The problem is that it is scattered across dozens of files, services, tickets, and deployment configs that nobody assembles until a stakeholder asks.
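
The five items in that list can be treated as a per-ticket record with a simple completeness check. This is a sketch only: the field names are illustrative and not a schema from any particular tool, and the ticket key and values are borrowed from the bulk export example in this post.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TicketContext:
    """System context to assemble for one demoed ticket.
    Field names are illustrative, not a schema from any tool."""
    ticket: str
    interacting_flags: list = field(default_factory=list)
    downstream_consumers: list = field(default_factory=list)
    shared_path_features: list = field(default_factory=list)
    deployed_regions: list = field(default_factory=list)
    known_limits: list = field(default_factory=list)

    def missing(self):
        """Names of context fields that are still empty."""
        return [
            f.name for f in fields(self)
            if f.name != "ticket" and not getattr(self, f.name)
        ]


ctx = TicketContext(
    ticket="PROJ-1847",
    interacting_flags=["legacy_billing", "new_export_v2"],
    deployed_regions=["us-east-1", "us-west-2"],
)
print(ctx.missing())
```

One limitation of this shape: an empty list cannot distinguish "checked, nothing found" from "nobody looked". A real record would probably need an explicit reviewed marker per field.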

The integration between Jira tickets and codebase state is the piece that makes this preparation tractable. A Jira ticket links to the code changes. The code changes trace to the services, flags, and event consumers that interact with the new functionality. That chain, from ticket to code to system context, can be surfaced before the review rather than discovered during it. This is what the case for a shared context layer between Jira and AI coding tools comes down to in practice: the ticket and the codebase need to be queryable together, not separately.
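
The ticket-to-code-to-context chain is, in effect, a graph walk. The following sketch assumes a hypothetical link graph in which each node lists what it references. How such a graph gets built (ticket-to-commit links, static analysis, or something else) is outside the sketch, and the node labels are illustrative.

```python
from collections import deque


def reachable_context(start, links):
    """Breadth-first walk from a ticket through everything it links to.
    `links` maps a node to the nodes it references."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical link graph; node labels are illustrative.
links = {
    "PROJ-1847": ["ExportServiceV2"],
    "ExportServiceV2": ["flag:new_export_v2", "event:export.completed"],
    "event:export.completed": ["ReportingPipeline"],
}
context = reachable_context("PROJ-1847", links)
print(context)
```

The walk surfaces the reporting pipeline even though the ticket never mentions it, which is the kind of indirect dependency stakeholders tend to ask about.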

Kognita + Jira sprint review preparation — what gets surfaced

  Sprint 14 bulk export ticket: PROJ-1847

  Codebase connections Kognita surfaces:
  ----------------------------------------
  Feature flags that interact with bulk export:
    - legacy_billing (affects invoice line item format in export output)
    - new_export_v2 (routes through ExportServiceV2 instead of LegacyExporter)
    - data_residency_eu (active for 23 accounts — export must check this flag
      before writing to S3 bucket; currently not handled in ExportServiceV2)
  → Answers Q1 and Q5 before the demo starts.

  Services that share the export code path:
    - ScheduledExportJob (runs weekly, calls ExportService directly)
      last modified: Sprint 11 — added retry logic
    - ReportingPipeline subscriber on export.completed event
      (consuming export artifacts for finance dashboards)
  → Answers Q3 before the demo starts.

  Deployment state (from Jira + CI integration):
    - Deployed: us-east-1, us-west-2
    - Not deployed: eu-west-1, ap-southeast-1
    - EU deployment blocked by: PROJ-1901 (data residency review — open)
  → Answers Q2 before the demo starts.

  Load and scale context:
    - ExportServiceV2 has no pagination on line item query
    - Largest account in prod: 87,000 line items (account_id: 4821)
    - No load test run above 50,000 line items
  → Answers Q4 before the demo starts.
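
The "no pagination on line item query" finding is the kind of limit worth understanding before someone asks about 100,000 line items. As a hedged illustration of what a paginated export could look like, the sketch below uses keyset pagination over an in-memory stand-in for the line item table. The `fetch_page` callable, the column names, and the batch size are all assumptions for illustration, not details of ExportServiceV2.

```python
import csv
import io


def export_in_batches(fetch_page, out, batch_size=1000):
    """Write line items to `out` as CSV one page at a time, so memory
    use is bounded by `batch_size` rather than by account size.
    `fetch_page(after_id, limit)` is a hypothetical data-access callable
    that returns rows with id > after_id, ordered by id."""
    writer = csv.writer(out)
    writer.writerow(["id", "sku", "amount"])  # assumed column names
    after_id, total = 0, 0
    while True:
        rows = fetch_page(after_id, batch_size)
        if not rows:
            return total
        writer.writerows(rows)
        total += len(rows)
        after_id = rows[-1][0]  # keyset cursor: last id written


# In-memory stand-in for the line item table (illustrative data).
ITEMS = [(i, f"sku-{i}", i * 10) for i in range(1, 2501)]


def fetch_page(after_id, limit):
    return [row for row in ITEMS if row[0] > after_id][:limit]


buffer = io.StringIO()
count = export_in_batches(fetch_page, buffer)
print(count)
```

Whether pagination alone is enough for the largest accounts is still a question for a load test. The sketch only shows why the absence of pagination is a reasonable thing to flag.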

Kognita connects Jira tickets to the live codebase index so sprint review preparation produces the system context that stakeholders will ask about. For the bulk export ticket, that means surfacing the feature flag interactions, the downstream service dependencies, the deployment status across regions, and the scale limitations — before the demo, not during it. The scrum master running the review walks in knowing the answers to the predictable questions. Engineering does not have to be put on the spot. Stakeholders get answers in the room instead of a follow-up email three days later.

Final take

Sprint demos do not fail because features are broken. They fail because the system context around features is never assembled until a stakeholder asks a question that requires it. The feature works. The questions about how it fits into the live system — which flags interact with it, where it is deployed, what it affects that was shipped in prior sprints — do not have ready answers.

This is a context preparation problem, and it is solvable before the room fills up. The information exists: in the codebase, in Jira, in the deployment pipeline. The gap is that it is never connected to the sprint ticket until someone needs it. Sprint review preparation that surfaces integration state, flag interactions, deployment status, and prior-sprint dependencies changes the dynamic. Engineering does not have to improvise answers to system questions. Product owners and scrum masters walk in prepared. The demo becomes a demonstration, not a live investigation.