The conversation around AI virtual agents in UK contact centres has shifted from "should we?" to "where do we start?" That is progress. But for many mid-market operations teams, the answer they land on is still the wrong one.
The default ambition is broad: automate as much of the contact centre as possible, reduce headcount dependency, and modernise the customer experience in one programme. It is an understandable position, particularly when vendor demonstrations make deployment look straightforward. In practice, that framing is exactly what causes AI virtual agent projects to stall, overspend, or deliver results that cannot be attributed to anything measurable.
The firms seeing genuine returns are not starting with replacement. They are starting with one narrow, repetitive, high-volume service journey, proving economic value within a defined timeframe, and then earning the right to scale. The sequencing is not a minor implementation detail. It is the difference between a project that builds internal confidence and one that becomes a cautionary tale at the next budget review.
Key takeaways
- Most AI virtual agent programmes underperform because scope is set before the business case is stress-tested
- The strongest first use cases are repetitive, high-volume journeys with clear intent patterns and low regulatory sensitivity
- Cost-to-serve and labour compression are more reliable success metrics than headline automation rates
- A phased rollout reduces delivery risk and creates faster, more credible proof of value than a replacement-first strategy
- The hidden costs of integration, data clean-up, and governance typically exceed licence fees in year one
"Organisations that capture significant value from AI usually start with narrow, high-impact use cases and scale in stages with clear metrics." — McKinsey, State of AI 2025
The ambition to replace large portions of customer service with AI is not irrational. The economics are compelling: cost per interaction for human agents runs at roughly £5 to £6, while AI-handled interactions can cost a fraction of that at scale. But the jump from that observation to a broad replacement programme skips several layers of operational complexity that determine whether those economics ever materialise.
Broad programmes fail for predictable reasons:
Mid-market firms are particularly exposed here. Unlike large enterprises, they cannot absorb a twelve-month experimental cycle with uncertain outcomes. Only 20% of mid-market companies successfully scale AI deployments beyond pilot stage, and 60% of AI projects that attempt rapid scaling fail outright. The firms that do reach scale almost always started narrower than they originally planned.
The strongest first use cases share a common profile: they are repetitive, high-volume, low-ambiguity, and operationally well-understood. The AI virtual agent does not need to be impressive. It needs to be reliable in a bounded context, and it needs to remove enough workload that the impact is measurable within weeks, not quarters.
The journeys that consistently deliver fastest time-to-value are those where intent is predictable, data is accessible, and the consequence of a poor interaction is low. These include:
| Journey type | Why it works for AI first |
|---|---|
| Order status and tracking | High volume, clear intent, data is usually CRM or OMS-connected |
| Appointment booking and changes | Structured workflow, low ambiguity, containable without escalation |
| Billing and payment queries | Repetitive intent patterns, self-service resolution is often sufficient |
| Password resets and account access | Fully automatable, no regulatory sensitivity, immediate resolution |
| Simple service requests and FAQs | High containment potential, knowledge base-driven, easy to measure |
Journeys involving complaints, vulnerable customers, complex financial decisions, or regulatory obligations are not good starting points. Not because AI cannot eventually assist with them, but because the cost of a poor interaction in those contexts is disproportionately high and the governance requirements are significantly more demanding.
The question to ask of any candidate journey is not "can AI handle this?" but "does automating this journey eliminate a meaningful step in the operational process?" BCG research on AI agent deployment is clear on this point: the biggest productivity gains come when AI eliminates entire workflow steps, not when it simply reduces wait time or deflects a small proportion of contacts.
A virtual agent that handles 40% of order status queries but still requires agents to manage the remaining 60% through the same underlying process has not changed the operational model. One that handles 80% with clean containment, accurate data retrieval, and a well-designed handover for exceptions has materially compressed the labour requirement for that journey. That distinction determines whether the business case holds.
Automation rate is the metric vendors lead with. It is also the least useful one for building an internal business case. A 70% automation rate sounds strong until you discover that the remaining 30% of interactions reach agents through poorly designed handovers, and that customers are repeating themselves because context is not being passed cleanly. The headline number looks good. The operational reality does not.
Cost per interaction is a useful starting point, and the gap between AI and human handling is significant. According to Freshworks data, cost per interaction drops from approximately £3.60 before AI implementation to around £1.15 after, when deployment is working well. But cost-to-serve is the more complete measure because it captures what happens across the full resolution journey, including escalations, agent rework, repeat contacts, and the human time spent cleaning up interactions the AI did not fully resolve.
A cheap AI interaction that generates a repeat call the following day is not cheap. It has simply moved the cost.
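The gap between cost per interaction and cost-to-serve is easier to see with numbers. The sketch below is illustrative only: the `JourneyCosts` fields, the £0.50 AI cost, the escalation and repeat rates, and the rule that every repeat contact lands with an agent are all assumptions made for the example. The £5.50 agent cost is simply the midpoint of the £5 to £6 range quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class JourneyCosts:
    """Hypothetical inputs for one journey over one period (all values assumed)."""
    contacts: int            # customer needs that start with the virtual agent
    ai_cost: float           # direct cost of one AI-handled interaction, in GBP
    human_cost: float        # direct cost of one agent-handled interaction, in GBP
    escalation_rate: float   # share of AI interactions handed over to an agent
    repeat_rate: float       # share of contained interactions that trigger a repeat contact


def cost_to_serve(c: JourneyCosts) -> float:
    """Average full cost of resolving one customer need, in GBP."""
    if c.contacts <= 0:
        raise ValueError("contacts must be positive")
    escalated = c.contacts * c.escalation_rate
    contained = c.contacts - escalated
    repeats = contained * c.repeat_rate
    total = (
        c.contacts * c.ai_cost        # every contact touches the AI first
        + escalated * c.human_cost    # escalations add agent handling on top
        + repeats * c.human_cost      # assumption: repeat contacts go to an agent
    )
    return total / c.contacts


example = JourneyCosts(
    contacts=1000,
    ai_cost=0.50,
    human_cost=5.50,
    escalation_rate=0.30,
    repeat_rate=0.10,
)

print(f"Headline AI cost per interaction: £{example.ai_cost:.2f}")
print(f"Cost-to-serve per resolved need:  £{cost_to_serve(example):.3f}")
```

With these assumed inputs, a headline AI cost of £0.50 per interaction becomes a cost-to-serve of about £2.54 per resolved need once escalations and repeat contacts are counted. Real figures should come from the documented baseline for the chosen journey, not from this example.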
The more useful productivity metric for operations leaders is labour compression: the hours of human work displaced per 1,000 interactions handled by AI. This framing, supported by BCG's work on AI agent economics, shifts the question from "how much did we automate?" to "how much did we change the labour requirement?"
| Metric | What it measures | Why it matters |
|---|---|---|
| Automation rate | % of contacts handled without human involvement | Useful but incomplete without containment data |
| Cost per interaction | Direct handling cost for AI vs human | Misses escalation, rework, and repeat contact costs |
| Cost-to-serve | Full cost of resolving a customer need end-to-end | Most accurate indicator of operational ROI |
| Labour compression | Human hours displaced per 1,000 AI interactions | Best measure of whether AI is changing the operational model |
| Containment rate | % of AI interactions fully resolved without escalation | Leading indicator of cost-to-serve improvement |
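Containment rate and labour compression can both be derived from the same interaction counts. The following is a minimal sketch under stated assumptions, not a standard formula: the function names, the use of average handle time as a proxy for displaced work, and the deduction for agent rework on escalated contacts are all hypothetical choices made for illustration.

```python
def containment_rate(ai_interactions: int, escalated: int) -> float:
    """Share of AI interactions fully resolved without escalation."""
    if ai_interactions <= 0:
        raise ValueError("ai_interactions must be positive")
    return (ai_interactions - escalated) / ai_interactions


def labour_compression(
    ai_interactions: int,
    escalated: int,
    avg_handle_minutes: float,
    rework_minutes_per_escalation: float = 0.0,
) -> float:
    """Human hours displaced per 1,000 AI interactions.

    Assumption: each contained interaction saves one average agent handle
    time, and each escalation costs some extra agent minutes in rework.
    """
    if ai_interactions <= 0:
        raise ValueError("ai_interactions must be positive")
    contained = ai_interactions - escalated
    minutes_saved = (
        contained * avg_handle_minutes
        - escalated * rework_minutes_per_escalation
    )
    return (minutes_saved / 60) / ai_interactions * 1000


# Illustrative figures only: 5,000 AI interactions, 1,000 escalated,
# a 6-minute agent handle time and 2 minutes of rework per escalation.
rate = containment_rate(5000, 1000)
hours = labour_compression(5000, 1000, avg_handle_minutes=6.0,
                           rework_minutes_per_escalation=2.0)
print(f"Containment rate: {rate:.0%}")
print(f"Labour compression: {hours:.1f} hours per 1,000 AI interactions")
```

In this example an 80% containment rate translates to roughly 73 agent hours displaced per 1,000 AI interactions. Changing the rework assumption shows how quickly poor handovers erode that figure.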
The right target for a well-scoped first deployment: containment rates above 70% on the chosen journey, with cost-to-serve reduction visible within three months. Industry benchmarks suggest 63% of customer service AI deployments reach positive ROI in year one, with a median payback period of around four months, but only when the initial use case is tightly scoped and the baseline metrics are documented before go-live.
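Payback depends on the documented baseline as much as on the deployment itself. As a rough sketch, the calculation below estimates payback in whole months from a before-and-after cost-to-serve. The function name `payback_months` and every input figure are hypothetical, and the upfront cost is meant to include integration, data clean-up, and governance work as well as any licence set-up.

```python
import math
from typing import Optional


def payback_months(
    upfront_cost: float,
    monthly_running_cost: float,
    baseline_cost_to_serve: float,
    new_cost_to_serve: float,
    monthly_contacts: int,
) -> Optional[int]:
    """Whole months needed to recover the upfront cost, or None if never.

    Net monthly saving = (baseline - new cost-to-serve) * contacts,
    minus the ongoing running cost (for example licence fees).
    """
    gross_saving = (baseline_cost_to_serve - new_cost_to_serve) * monthly_contacts
    net_saving = gross_saving - monthly_running_cost
    if net_saving <= 0:
        return None  # the deployment does not pay back on these inputs
    return math.ceil(upfront_cost / net_saving)


# Illustrative inputs only (GBP): not benchmarks.
months = payback_months(
    upfront_cost=30_000,
    monthly_running_cost=2_000,
    baseline_cost_to_serve=5.50,
    new_cost_to_serve=2.535,
    monthly_contacts=4_000,
)
print("Payback in months:", months)
```

On these assumed inputs the upfront cost is recovered in the fourth month. If the baseline cost-to-serve was never measured before go-live, neither the saving nor the payback period can be stated with any confidence.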
The licence fee is rarely the biggest cost in an AI virtual agent deployment. For most mid-market organisations, the work that determines whether the project succeeds happens before a single customer interaction is handled, and it is consistently underestimated in vendor proposals.
The costs that do not appear in the demo:
"Teams often undercount invisible costs, which can materially increase total cost-to-serve, especially in regulated workflows." — Teneo.ai, AI vs Live Agent Cost Analysis 2025
Human-in-the-loop design is not a sign that the AI has failed. It is a deliberate and necessary part of any responsible deployment. For complaints, vulnerable customers, and complex service exceptions, the question is not whether a human should be involved but how quickly and cleanly the handover happens.
A phased approach is not a compromise on ambition. It is the most reliable route to sustainable scale, and it is what separates deployments that build internal momentum from those that quietly get deprioritised after the first review cycle.
The critical discipline at each phase: resist the pressure to expand before the current phase is stable. The most common failure mode is not starting too small. It is expanding too quickly before containment, knowledge quality, and integration reliability have been confirmed.
Vendor proposals for AI virtual agents are almost always optimistic on timeline, light on integration effort, and vague on how success will be measured after go-live. Before committing budget, operations leaders should be able to answer the following questions clearly, and so should their prospective vendor.
On business case and sequencing:
On integration and data readiness:
On governance and risk:
The right question to anchor the entire evaluation is not "how much can we automate?" It is: where can we reduce cost-to-serve without damaging the customer experience, and how will we know within 90 days whether it is working?
Any vendor who cannot answer that question with specifics is not ready to deploy in a mid-market contact centre environment.
AI virtual agents will reshape how UK contact centres operate. The economic case is not in doubt: Gartner projects a reduction of $80 billion in global agent labour costs by 2026, and the organisations already seeing those returns are not the ones that launched the biggest programmes. They are the ones that started with the clearest use case, measured the right things, and used early proof of value to justify the next phase of investment.
For UK mid-market operations leaders, the competitive risk is not moving too slowly on AI. It is moving too broadly too soon, absorbing the cost and complexity of a large-scale deployment before the operational foundations are in place to make it work.
Start with one journey. Measure cost-to-serve before and after. Prove containment. Then scale.
If you are building the business case for AI virtual agents and want to work through use-case selection, baseline metrics, and rollout sequencing with an independent perspective, book a discovery workshop with the Fortay Connect team.
FAQs
What is the best first use case for AI virtual agents? The best first use case is a repetitive, high-volume journey with clear intent, low ambiguity, and limited regulatory risk. Order status, appointment changes, billing queries, and password resets usually fit best because they offer quick proof of value and measurable containment.
Why do so many AI virtual agent projects fail? They usually fail because the scope is too broad at the start. Teams often try to modernise the whole contact centre at once, which hides the real work in knowledge quality, integrations, escalation design, and governance.
How should operations leaders measure AI virtual agent success? Cost-to-serve and labour compression are stronger measures than headline automation rate. They show whether AI is reducing the full effort required to resolve a customer need, including escalations, rework, and human clean-up.
What should a phased AI virtual agent rollout look like? Start with one journey, prove containment and cost-to-serve reduction, then expand into adjacent intents only when the first phase is stable. After that, connect the virtual agent to broader workflows across CRM, billing, or order management.
How do you choose the right first AI virtual agent journey? Choose a journey with predictable intent, high volume, accessible data, and low emotional or regulatory sensitivity. The right first use case is the one where automation removes a meaningful operational step, not just a few customer contacts.