Article 2 min read

Zendesk AI + GPT-5: Setting the pace for the next generation of support

Shashi Upadhyay

President, Products, Engineering and AI at Zendesk

Last updated August 7, 2025

At Zendesk, our platform helps businesses deliver fast, accurate resolutions with less effort. At its core is a system of AI agents that work to understand what a customer needs in the specific moment when trust is on the line, act on it correctly, and know when to escalate or step aside.

This is exactly why foundation models are essential and why it’s critical for a company like Zendesk to lead in testing and using the latest models to continuously enhance and accelerate our most advanced AI capabilities.

With the release of GPT-5, we saw a meaningful opportunity to improve key parts of that system. Today, GPT-5 is live in production inside Zendesk’s Resolution Platform, powering real customer conversations across our agent assist and automation workflows.

“Iterative deployment helps ensure we approach and launch new model capabilities with the highest levels of rigor. Working with Zendesk on GPT-5 is the latest example of how early testing and feedback help us identify where the API can drive the most meaningful impact where it matters most for their users.”
– Olivier Godement, Head of Business Products at OpenAI

Why we test every model the same way

When we evaluate a new model, we are not looking for benchmark wins. We are asking whether it improves resolution outcomes in the field. Zendesk runs a rigorous benchmarking program to evaluate and tune new models like GPT-5 across key tasks, balancing latency, cost, and performance. This enables rapid testing and rollout of models in under 24 hours.

Our evaluation framework covers:

  • Precision: Can the model return accurate, complete answers grounded in trusted knowledge sources such as help center articles?
  • Automated resolution: Does it increase the percentage of issues auto-resolved without human touch?
  • Execution: Can it follow structured workflows with high fidelity?
  • Latency: Is the response fast enough for live support environments?
  • Safety: Does it avoid hallucination and only take actions when confident?
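
As an illustration only, the five criteria above could be turned into a simple pass/fail gate that compares a candidate model against the current baseline. This is a minimal sketch, not Zendesk's actual benchmarking code: the `EvalResult` fields, the `regressions` helper, and the default thresholds are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Offline metrics for one model on one task (all field names are hypothetical)."""
    precision: float             # share of answers judged accurate and grounded
    automated_resolution: float  # share of issues resolved without a human
    execution: float             # share of structured workflows completed correctly
    p95_latency_s: float         # 95th-percentile response time, in seconds
    unsafe_rate: float           # share of responses flagged as hallucinated or out of policy


def regressions(candidate, baseline, max_latency_s=5.0, max_unsafe_rate=0.01):
    """Return the criteria on which the candidate fails (thresholds are assumed values)."""
    failed = []
    # Quality criteria: the candidate must not score below the baseline.
    for name in ("precision", "automated_resolution", "execution"):
        if getattr(candidate, name) < getattr(baseline, name):
            failed.append(name)
    # Latency and safety: checked against absolute budgets.
    if candidate.p95_latency_s > max_latency_s:
        failed.append("latency")
    if candidate.unsafe_rate > max_unsafe_rate:
        failed.append("safety")
    return failed
```

An empty list would mean the candidate clears every criterion. In practice each criterion would likely be scored per task and weighed against cost rather than treated as a hard gate.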

We continuously monitor performance with offline and live metrics, ensuring transparent and reliable AI improvements. GPT-5 delivered improvements across almost every one of these criteria. Here are the results.

What GPT-5 improved in our AI Agents, Copilot, and App Builder

  1. Fewer fallback escalations – reduced by over 20%

    GPT-5 delivered more complete responses with fewer missed details, reducing agent handoffs and helping customers get answers more quickly.

  2. Sharper handling of ambiguity – improvement in intent clarification

    GPT-5 clarified vague customer input more effectively, enabling better routing and increasing coverage of automated flows in over 65% of conversations.

  3. High execution reliability – 95%+ on standard procedures, 30% reduction in failure on large flows

    GPT-5 maintained structure across long workflows and adapted to real-world service complexity without losing context.

  4. Higher quality assist – 5 point lift in agent suggestion accuracy across four languages

    Agent productivity increased as GPT-5 suggestions became more concise, contextually relevant, and aligned with tone guidelines.

  5. Faster app generation – 3 to 4 times more prompt iterations per minute, better alignment in code generation for app builder

    GPT-5 was 25–30% faster overall and enabled more prompt iterations per minute, speeding up app builder development workflows.

Technical integration: How we use GPT-5 at Zendesk

GPT-5 is not simply swapped in as a replacement for earlier models. It is one component of a larger, modular AI architecture built to deliver resolutions reliably.

Model selection and use

We use GPT-5 selectively across use cases where it demonstrates measurable value. These include:

  • Intent clarification and disambiguation
  • Long-context answer generation
  • Procedure compilation and execution (PCA / PEA)
  • Agent reply generation in auto-assist scenarios

GPT-5 operates in conjunction with our intent classification and reasoning pipeline. We map vague input to clear actions, then use the model to synthesize responses or execute multi-step workflows where appropriate.
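
The flow described above, classify first and generate second, can be sketched roughly as follows. Everything here is an assumption made for illustration: the intent names, the `INTENT_ACTIONS` table, the toy keyword classifier, and the `generate` callable standing in for a model call are not taken from Zendesk's pipeline.

```python
from typing import Callable, Optional

# Hypothetical mapping from recognized intents to concrete actions.
INTENT_ACTIONS = {
    "refund_request": "start_refund_procedure",
    "password_reset": "send_reset_link",
}


def classify_intent(message: str) -> Optional[str]:
    """Toy keyword classifier standing in for a real intent model."""
    text = message.lower()
    if "refund" in text or "money back" in text:
        return "refund_request"
    if "password" in text or "log in" in text:
        return "password_reset"
    return None  # the input is too vague to route


def handle(message: str, generate: Callable[[str], str]) -> dict:
    """Route on intent first, then ask the model only for the wording."""
    intent = classify_intent(message)
    if intent is None:
        # Vague input: have the model ask a clarifying question instead of acting.
        prompt = "Ask one clarifying question about: " + message
        return {"action": "clarify", "reply": generate(prompt)}
    prompt = f"Confirm the next step for intent '{intent}'."
    return {"action": INTENT_ACTIONS[intent], "reply": generate(prompt)}
```

The point of this shape is that the action is chosen by the deterministic lookup, while the model is limited to clarifying or phrasing the response.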

Reasoning modes and flow handling

GPT-5 allows for medium reasoning with significantly longer context windows. This is especially helpful for:

  • Multi-turn conversations
  • Step-by-step execution of internal procedures
  • Dynamic generation of structured outputs from loosely worded inputs

In these scenarios, we prioritize maintaining conversational structure, accuracy, and context window efficiency. GPT-5 performs reliably even with higher token loads, which enables smoother automation of service interactions that span multiple turns or inputs.
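
One common way to keep multi-turn conversations within a context budget is to retain only the most recent turns that fit. The sketch below shows that idea under stated assumptions: `trim_history` is a hypothetical helper, and the default whitespace-based token count is a crude stand-in for a real tokenizer. It is not a description of how Zendesk manages context.

```python
def trim_history(turns, max_tokens, count_tokens=lambda s: len(s.split())):
    """Keep the most recent turns whose combined size fits within max_tokens.

    `count_tokens` defaults to a whitespace word count, which only
    approximates what a real tokenizer would report.
    """
    kept = []
    used = 0
    # Walk backwards from the newest turn and stop at the first one that does not fit.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Stopping at the first turn that does not fit keeps the retained history contiguous, which preserves conversational structure at the cost of dropping older context.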

Scaffolding and control

To safely deploy GPT-5 in production, we surround it with strong operational guardrails:

  • Intent-layer pre-routing to reduce risk and improve clarity
  • Real-time observability with structured logging of model behavior
  • Trigger-level governance to prevent out-of-policy responses
  • Fallback protocols that default to safe escalation or agent involvement

We treat the model as a nondeterministic tool within a controlled system, not a standalone decision-maker. That is what enables us to deploy it in enterprise-grade environments.
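
A minimal sketch of what such a wrapper might look like: the model call sits inside a function that logs a structured record and defaults to escalation whenever a check fails. The `guarded_reply` function, the `BLOCKED_PHRASES` list, the confidence threshold, and the assumption that the model returns a `(draft, confidence)` pair are all hypothetical and chosen only to illustrate the pattern.

```python
import json
import logging
from typing import Callable, Tuple

logger = logging.getLogger("guardrails")

# Hypothetical policy list; a real deployment would use richer governance rules.
BLOCKED_PHRASES = ("guaranteed refund", "legal advice")


def guarded_reply(message: str,
                  model: Callable[[str], Tuple[str, float]],
                  min_confidence: float = 0.7) -> dict:
    """Call the model, then escalate unless the draft passes every check."""
    try:
        draft, confidence = model(message)
    except Exception as exc:
        # Any model failure falls back to a human agent.
        outcome = {"decision": "escalate", "reason": f"model_error:{type(exc).__name__}"}
    else:
        if confidence < min_confidence:
            outcome = {"decision": "escalate", "reason": "low_confidence"}
        elif any(phrase in draft.lower() for phrase in BLOCKED_PHRASES):
            outcome = {"decision": "escalate", "reason": "policy_violation"}
        else:
            outcome = {"decision": "reply", "reason": "ok", "reply": draft}
    # Structured log line so each decision can be observed and audited.
    logger.info(json.dumps({"event": "model_decision", **outcome}))
    return outcome
```

Because escalation is the default on every failure path, the model's nondeterminism can affect what is said but not whether an unchecked response reaches the customer.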

What this means for our customers

We integrated GPT-5 because it allowed us to resolve more issues faster and with higher reliability.

It helped reduce fallback escalations. It improved performance across multilingual support. It executed workflows with precision and handled ambiguity more gracefully. And it helped our internal teams move faster in the build-test-deploy cycle for AI-powered agents.

Every step forward with AI should result in fewer dropped threads, shorter resolution times, and a better experience for the people on both sides of the conversation. GPT-5 helps us do that.

We will keep testing, tuning, and integrating new models as they evolve. But our focus will stay the same: using AI to deliver resolutions that customers can trust.

Shashi Upadhyay is Zendesk’s President of Product, Engineering, and AI, responsible for developing innovative products that leverage advanced AI. With a proven track record of creating transformative solutions, he combines a deep understanding of AI's potential for business transformation with a strong commitment to customer-centric design. Before Zendesk, Shashi held a key role at Google, where he led the advertiser product portfolio and spearheaded innovation as the head of Google Ads, Google Analytics, DV3, SA3, and Performance Max, one of Google’s fastest-growing products. Prior to Google, he founded Lattice Engines, which was acquired by Dun & Bradstreet (D&B) in 2019. He played an instrumental role in D&B's public offering in 2020 and has since become an active investor in startups across diverse sectors, including energy storage, neuroscience, and enterprise infrastructure. Shashi earned his undergraduate degree in Physics from the Indian Institute of Technology (IIT) Kanpur and went on to obtain a Ph.D. in Physics from Cornell University.