Your AI Deflection Rate Is Up. Your Resolution SLA Is Still Red. Here's Why.

The AI deflection rate goes to 40%, then 50%, then 60%. The quarterly business review slide looks good. Someone asks about resolution SLA and the room goes quiet. It's still red. It might be worse than before the AI deflection rollout. Both things are true simultaneously: AI is deflecting more tickets than ever, and customers with real technical issues are taking longer to get a resolution.

Deflection and resolution are not the same metric. Optimizing one does not improve the other. In fact, optimizing deflection at the expense of resolution quality can make resolution SLA worse — by handling tickets with AI responses that delay correct escalation, by reducing the signal that helps engineers identify recurring issues, and by creating a loop where unresolved issues generate multiple "deflected" tickets.

What deflection actually measures

Deflection counts tickets that received an automated response and were not manually escalated. The implicit assumption is that a deflected ticket is a resolved ticket — the customer got an answer and didn't need human intervention. This assumption holds for FAQ-type questions: password reset, account navigation, billing inquiry, feature explanation.

It breaks for technical issues, where the AI response is based on training patterns rather than your specific system. Customers experiencing a bug get an AI response suggesting standard troubleshooting steps. Some of them follow the steps, find that the steps don't work, and don't immediately reopen the ticket. Instead they try workarounds, escalate internally, call a different channel, or just wait. The ticket shows as deflected. The issue is unresolved.

What deflection measures and what it misses

What "deflection" measures:
  -> Tickets that got an AI response and were not escalated
  -> Percentage of total tickets that closed without human handling
  -> Often presented as "resolved by AI"

What deflection doesn't measure:
  -> Whether the customer's issue was actually fixed
  -> Whether they reopened the ticket or created a new one
  -> Whether they called support instead
  -> Whether they churned silently
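
The gap between the two lists can be made concrete with a small sketch. The `Ticket` fields below are hypothetical, and `issue_confirmed_fixed` assumes you have some way to verify the outcome (a follow-up survey, for example). The sample tickets are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: str
    ai_replied: bool
    escalated: bool
    reopened: bool               # reopened, or a follow-up ticket was filed for the same issue
    issue_confirmed_fixed: bool  # assumption: outcome verified by some follow-up signal


def deflection_rate(tickets):
    """Share of tickets that got an AI reply and were never escalated."""
    deflected = [t for t in tickets if t.ai_replied and not t.escalated]
    return len(deflected) / len(tickets)


def verified_deflection_rate(tickets):
    """Share of tickets that were deflected AND stayed closed AND were confirmed fixed."""
    verified = [
        t for t in tickets
        if t.ai_replied and not t.escalated
        and not t.reopened and t.issue_confirmed_fixed
    ]
    return len(verified) / len(tickets)


# Invented sample: two FAQ tickets, two technical tickets the AI could not fix,
# and one ticket that was escalated to a human.
sample = [
    Ticket("T1", ai_replied=True, escalated=False, reopened=False, issue_confirmed_fixed=True),
    Ticket("T2", ai_replied=True, escalated=False, reopened=False, issue_confirmed_fixed=True),
    Ticket("T3", ai_replied=True, escalated=False, reopened=True, issue_confirmed_fixed=False),
    Ticket("T4", ai_replied=True, escalated=False, reopened=False, issue_confirmed_fixed=False),
    Ticket("T5", ai_replied=True, escalated=True, reopened=False, issue_confirmed_fixed=True),
]

reported = deflection_rate(sample)           # 0.8
verified = verified_deflection_rate(sample)  # 0.4
```

In this invented sample the dashboard would report 80% deflection, while only 40% of tickets were resolved without human effort. T4 is the silent case: never reopened, never fixed.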

The deflection optimization trap

AI support vendors optimize for deflection because deflection is measurable and improves quickly. Add more categories to the auto-response rules. Lower the confidence threshold for automated replies. Expand the KB articles that trigger responses. Each of these moves the deflection rate up. None of them improve the quality of technical issue responses.

The trap is that deflection is a proxy metric for "resolved without human effort." As AI deflection expands to technical issues it can't actually resolve, the proxy breaks. You're measuring "tickets that got an AI reply and weren't immediately reopened" — which is a very different thing from "tickets where the customer's issue was fixed."

Resolution SLA captures what actually happened: was the issue fixed within the contractual window? This metric is harder to move because it requires the issue to actually be fixed, which means the right team investigating the right service with the right context. Faster deflection doesn't contribute to that chain.

How deflection and resolution SLA diverge

Why deflection and resolution SLA diverge:
  High deflection + poor resolution:
    -> AI responds quickly, customer follows guidance
    -> Guidance based on training data, misses system cause
    -> Issue persists, customer reopens or submits new ticket
    -> New ticket counts as new deflection opportunity
    -> Deflection rate stays high, issue never resolves
    -> Resolution SLA: continues failing

  Low deflection + good resolution:
    -> AI escalates correctly to right team with context
    -> Engineer has codebase starting point
    -> Root cause found faster
    -> Resolution SLA: improves

Why technical issue deflection hurts resolution SLA

When an AI deflects a technical issue with an incorrect response, the resolution clock doesn't stop. If the customer follows the wrong guidance, tries workarounds, and reopens the ticket three days later, the SLA clock has been running the whole time. The deflection looked like a resolved ticket for three days. The reopened ticket appears as a new SLA event. The original issue's resolution time — measured correctly — is three days plus however long it takes to resolve after reopening.
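
A worked version of that timeline, as a minimal sketch. The three-day reopen delay comes from the example above. The one-day fix after reopening and the two-day SLA window are assumed values chosen only to show the effect.

```python
from datetime import datetime, timedelta

# Timeline of one issue (dates are arbitrary illustrations).
opened = datetime(2024, 1, 1, 9, 0)         # customer reports the bug, AI deflects it
reopened = opened + timedelta(days=3)       # wrong guidance failed, customer comes back
resolved = reopened + timedelta(days=1)     # assumed: engineering fixes it a day later

sla_window = timedelta(days=2)              # assumed contractual resolution window

# What the tooling sees if the reopen is treated as a new SLA event.
per_event_duration = resolved - reopened

# What the customer actually experienced, measured from the first report.
issue_level_duration = resolved - opened

per_event_within_sla = per_event_duration <= sla_window      # True
issue_level_within_sla = issue_level_duration <= sla_window  # False
```

Measured per event, the ticket looks like a one-day resolution inside the window. Measured per issue, it took four days and breached.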

First-response SLA and resolution SLA are different problems. Deflection is a first-response problem. Resolution is an investigation problem. These don't share a solution.

What actually moves resolution SLA

Resolution SLA improvement comes from reducing two time costs: the time to reach the correct team on the first assignment, and the time the correct team needs to identify the root cause. Both of these are information problems. Both are addressable with codebase context.

A ticket that routes to the correct team immediately has a shorter resolution time than a ticket that routes correctly after two re-routes, even if the actual investigation takes the same time. A ticket that arrives at the correct team with recent change history attached enables faster root cause identification than one that arrives with only the customer description.
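
The routing half of that argument is simple arithmetic. The sketch below assumes a deliberately crude model in which every assignment costs the same queue wait before anyone looks at the ticket. The four-hour wait and six-hour investigation are made-up numbers.

```python
def time_to_resolution(assignments, queue_wait_hours, investigation_hours):
    """Total hours to resolve under a simplified model.

    assignments: number of teams the ticket is assigned to, counting the
                 correct one (1 means correct on first touch).
    queue_wait_hours: assumed wait in each team's queue before pickup.
    investigation_hours: time the correct team needs once it has the ticket.
    """
    return assignments * queue_wait_hours + investigation_hours


# Same investigation effort in both cases; only the routing differs.
first_touch = time_to_resolution(assignments=1, queue_wait_hours=4, investigation_hours=6)
two_reroutes = time_to_resolution(assignments=3, queue_wait_hours=4, investigation_hours=6)

routing_overhead = two_reroutes - first_touch  # hours lost purely to re-routing
```

With these numbers the correctly routed ticket resolves in 10 hours and the twice re-routed one in 18, with identical engineering work. Real queues are not uniform, so treat this as a way to reason about the cost, not a forecast.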

Neither of these improvements comes from deflection optimization. They come from routing accuracy and context enrichment at the escalation point.

What actually improves resolution SLA

What resolution SLA improvement requires:
  -> Correct team receives ticket on first assignment
  -> Receiving engineer has system context, not just ticket text
  -> Recent change history surfaced for relevant service
  -> No re-routes, no context reconstruction from scratch

  How Kognita contributes:
  -> Resolves service ownership from codebase at triage time
  -> Surfaces recent changes in impacted service
  -> Correct first-touch routing via Jira webhook enrichment
  -> Engineer starts investigation at the right place

Kognita's role in resolution, not deflection

Kognita doesn't optimize for deflection. It optimizes for correct routing and useful context at the point of escalation. When a Jira ticket fires a webhook, Kognita resolves service ownership from the live codebase, surfaces recent change history, and ensures the ticket reaches the team responsible for the impacted service on the first assignment.
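
To make the idea of ownership-based enrichment concrete, here is a generic sketch. It is not Kognita's implementation or API. The ownership table, change list, payload fields (`issue_key`, `suspect_path`), and team names are all hypothetical, and the lookup is a plain longest-prefix match similar to a CODEOWNERS file.

```python
from typing import Optional

# Hypothetical ownership map: source path prefix -> owning team.
OWNERS_BY_PATH_PREFIX = {
    "services/billing/": "team-billing",
    "services/billing/invoices/": "team-invoicing",
    "services/auth/": "team-identity",
}

# Hypothetical recent changes per path prefix (invented entries).
RECENT_CHANGES_BY_PATH_PREFIX = {
    "services/billing/invoices/": ["adjust rounding in invoice totals"],
    "services/auth/": ["rotate session token format"],
}

FALLBACK_TEAM = "team-triage"  # used when no owner can be resolved


def longest_matching_prefix(path: str) -> Optional[str]:
    """Return the most specific ownership prefix that covers the path, if any."""
    matches = [prefix for prefix in OWNERS_BY_PATH_PREFIX if path.startswith(prefix)]
    return max(matches, key=len) if matches else None


def enrich_ticket(payload: dict) -> dict:
    """Attach an owning team and recent changes to a ticket payload."""
    prefix = longest_matching_prefix(payload.get("suspect_path", ""))
    if prefix is None:
        return {
            "issue_key": payload["issue_key"],
            "assigned_team": FALLBACK_TEAM,
            "recent_changes": [],
        }
    return {
        "issue_key": payload["issue_key"],
        "assigned_team": OWNERS_BY_PATH_PREFIX[prefix],
        "recent_changes": RECENT_CHANGES_BY_PATH_PREFIX.get(prefix, []),
    }


enriched = enrich_ticket(
    {"issue_key": "SUP-123", "suspect_path": "services/billing/invoices/totals.py"}
)
unknown = enrich_ticket({"issue_key": "SUP-124", "suspect_path": "scripts/cleanup.py"})
```

The most specific prefix wins, so the invoice path goes to the invoicing team and not the broader billing team. A ticket with no resolvable owner falls back to a triage queue instead of being guessed.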

The impact is directly on resolution SLA: first-touch routing accuracy improves, re-route cycles decrease, and engineers start the investigation with better context. None of these show up in deflection metrics, because deflection counts tickets that don't reach engineering. Kognita's contribution is to the tickets that do: making sure those tickets go to the right place with the right context.

The combination that works: AI deflection for FAQ-category tickets (where training data is sufficient) plus Kognita-grounded routing for technical issues (where system context is required). Deflection handles the volume. Kognita handles the accuracy. Resolution SLA improves because the tickets that need engineering investigation reach the right engineer with the right context.

How to read your metrics honestly

If deflection rate is up and resolution SLA is unchanged or worse, the deflection metric is measuring the wrong thing or the AI is deflecting tickets it shouldn't. Check the reopen rate on deflected tickets — if tickets that were marked deflected are reopening at higher rates than before, the AI is sending customers down the wrong path.

Deflection rate alone does not tell you whether your AI support is working. Read three metrics together: deflection rate on non-technical tickets (should be high, those are real deflections), resolution SLA on escalated tickets (should be improving, that's where Kognita's routing helps), and reopen rate on deflected tickets (should be low, that's your proxy for deflection quality).
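
A minimal sketch of computing those three numbers side by side. The `TicketRecord` fields and category labels are assumptions about what your ticketing export contains, and the sample records are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TicketRecord:
    category: str                        # assumed labels: "technical" or "non_technical"
    deflected: bool                      # AI reply, never escalated
    reopened: bool                       # reopened or re-filed for the same issue
    resolved_within_sla: Optional[bool]  # None for deflected tickets


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the segment is empty."""
    return numerator / denominator if denominator else None


def support_health(records):
    non_technical = [r for r in records if r.category == "non_technical"]
    deflected = [r for r in records if r.deflected]
    escalated = [r for r in records if not r.deflected]
    return {
        # Should be high: these are the deflections that are real.
        "non_technical_deflection_rate": ratio(
            sum(r.deflected for r in non_technical), len(non_technical)
        ),
        # Should be improving: share of escalated tickets resolved in the window.
        "escalated_sla_attainment": ratio(
            sum(bool(r.resolved_within_sla) for r in escalated), len(escalated)
        ),
        # Should be low: proxy for deflection quality.
        "deflected_reopen_rate": ratio(
            sum(r.reopened for r in deflected), len(deflected)
        ),
    }


records = [
    TicketRecord("non_technical", deflected=True, reopened=False, resolved_within_sla=None),
    TicketRecord("non_technical", deflected=True, reopened=False, resolved_within_sla=None),
    TicketRecord("non_technical", deflected=True, reopened=False, resolved_within_sla=None),
    TicketRecord("non_technical", deflected=False, reopened=False, resolved_within_sla=True),
    TicketRecord("technical", deflected=True, reopened=True, resolved_within_sla=None),
    TicketRecord("technical", deflected=False, reopened=False, resolved_within_sla=True),
    TicketRecord("technical", deflected=False, reopened=False, resolved_within_sla=False),
    TicketRecord("technical", deflected=False, reopened=False, resolved_within_sla=True),
]

health = support_health(records)
```

Reporting the three values together keeps a rising headline deflection rate from hiding a rising reopen rate or a flat SLA attainment on escalated tickets.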

Final take

A high deflection rate and a red resolution SLA can coexist indefinitely. They're measuring different parts of the same system. Deflection optimizes for tickets that don't reach engineering. Resolution SLA measures whether the ones that do reach engineering get fixed in time. These are not inherently in tension; the conflict appears only when deflection is pushed onto technical tickets it can't resolve. They just require different tools to improve.

Deflection tools are good at keeping FAQ tickets out of engineering queues. Codebase-grounded routing is what gets the remaining tickets to the right engineer with the right context. Both matter. Conflating them, or assuming deflection improvement transfers to resolution improvement, is what produces the reporting gap: green deflection dashboard, red resolution SLA.

Deflection is an efficiency metric. Resolution SLA is a commitment metric. If your commitments are still red after your efficiency improved, you optimized the wrong thing — and the fix is routing accuracy, not more deflection.