Voice Agent Index
Direct answer: LiveKit released Turn Detector v1.0 on June 17, 2026. The company says the model listens directly to speech, combines semantic and acoustic cues, runs as the default in LiveKit Cloud agents, and achieved the best results among models it evaluated across English and 13 other languages. Buyers should treat the release as a reason to test interruption handling and response timing with real call scripts, not just polished demos.

What happened

  • LiveKit announced Turn Detector v1.0, a model for deciding when a voice-agent user has finished speaking.
  • The company says v1 listens to speech directly instead of waiting for transcript-only signals, combining semantic and acoustic understanding into one end-of-turn prediction.
  • LiveKit says the model is now the default for agents running on LiveKit Cloud, while v1-mini is an open-weight model optimized for CPU inference.
  • LiveKit also published eot-bench, an open benchmark suite for end-of-turn detection. The company says established public benchmarks for this voice-agent problem have been limited.

Why this is trending

  • Voice agents that interrupt callers feel broken even when the underlying model is strong.
  • Latency tradeoffs are now a buyer issue: waiting too long creates dead air, but answering too early talks over the customer.
  • End-of-turn detection affects restaurant ordering, appointment booking, dispatch intake, legal intake, healthcare scheduling, and every phone workflow where callers pause mid-thought.
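
The latency tradeoff above can be made concrete with a small sketch. It models a naive endpointer that declares end-of-turn after a fixed stretch of silence. This is an illustrative baseline only, not LiveKit's model or API; the `Turn` class, the `evaluate_timeout` function, and the pause lengths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One caller turn: lengths of silent gaps inside the turn, in ms."""
    mid_turn_pauses_ms: list


def evaluate_timeout(turns, timeout_ms):
    """Score a naive fixed-silence endpointer against scripted turns.

    The endpointer declares end-of-turn after `timeout_ms` of silence, so:
    - any mid-turn pause of at least timeout_ms causes a false interruption;
    - every true end of turn costs timeout_ms of dead air before the reply.
    """
    interrupted = sum(
        1 for t in turns if any(p >= timeout_ms for p in t.mid_turn_pauses_ms)
    )
    return {
        "timeout_ms": timeout_ms,
        "false_interruption_rate": interrupted / len(turns),
        "dead_air_ms": timeout_ms,
    }


# Hypothetical scripted turns; pause lengths are illustrative, not measured.
turns = [
    Turn([300]),        # brief hesitation
    Turn([900]),        # checking an address
    Turn([]),           # no pause
    Turn([400, 1500]),  # deciding between options
]

for timeout in (500, 1000, 2000):
    print(evaluate_timeout(turns, timeout))
```

With these made-up pauses, a short timeout interrupts half the turns, while a long timeout interrupts none but adds two seconds of silence before every reply. A fixed timer cannot fix one side without worsening the other, which is the gap that end-of-turn models aim to close.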

The Voice Agent Index take

The practical takeaway: do not assume that every LiveKit-based agent is automatically production-ready. Buyers should demand turn-taking evidence from every vendor, covering pause handling, interruption behavior, latency, noisy calls, multilingual callers, transcript timing, and what happens when a human needs to take over.

Voice Agent Interruption Test Packet

A buyer test packet for checking end-of-turn detection, pauses, interruptions, backchannels, latency, multilingual callers, and human handoff timing.

Proof item | Why it matters | Buyer ask
Pause script | A caller may pause while thinking, checking an address, or deciding between options. | Run calls where the user pauses mid-sentence and confirm the agent does not jump in too early.
Backchannel script | Short sounds like yes, okay, or mm-hmm can be acknowledgments rather than new requests. | Test whether the agent keeps listening instead of resetting the call or interrupting.
Latency budget | A response can be accurate but still feel slow if the endpointing delay is too high. | Measure end-of-user-speech to first-agent-audio timing across at least 20 realistic calls.
Noisy caller scenario | Background audio, speakerphone, kitchens, vehicles, and job sites can distort turn detection. | Include noise and overlapping speech in the acceptance test, not only clean microphone demos.
Multilingual or accented caller sample | Turn-taking errors often appear when speech patterns differ from the demo speaker. | Ask for the languages, accents, and caller types that were actually tested.
Human handoff timing | The agent must yield cleanly when the caller rejects automation or needs a person. | Verify when the agent stops talking, what context transfers, and how quickly the human can respond.
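
For the latency budget item, a minimal sketch of how the measurement could be summarized follows. It assumes you already have two timestamps per call on the same clock, in milliseconds: when the user stopped speaking and when the first agent audio played. The function name `latency_report`, the nearest-rank p95 method, and the sample data are assumptions made for illustration.

```python
import statistics


def latency_report(calls, min_calls=20):
    """Summarize end-of-user-speech to first-agent-audio latency.

    `calls` is a list of (user_speech_end_ms, first_agent_audio_ms) pairs.
    A negative latency means the agent started before the user finished.
    """
    if len(calls) < min_calls:
        raise ValueError(f"need at least {min_calls} calls, got {len(calls)}")
    latencies = sorted(agent - user for user, agent in calls)
    n = len(latencies)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * n + 99) // 100
    return {
        "calls": n,
        "median_ms": statistics.median(latencies),
        "p95_ms": latencies[rank - 1],
        "max_ms": latencies[-1],
        "talk_overs": sum(1 for value in latencies if value < 0),
    }


# Hypothetical data: 20 calls with latencies from 400 ms to 1350 ms.
calls = [(10_000, 10_400 + 50 * i) for i in range(20)]
print(latency_report(calls))
```

Reporting the 95th percentile and the maximum alongside the median matters because a few slow or talked-over turns can spoil a call even when the typical response is fast.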

What buyers should do next

  1. Write a turn-taking test script before scheduling a voice AI demo.
  2. Include pauses, interruptions, backchannels, noise, repeated details, and a human transfer.
  3. Record response timing, false interruptions, dead-air delays, transcript quality, and transfer context.
  4. Compare vendors by completed workflow quality, not only by how natural the first demo sounds.
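
One way to keep the comparison in steps 3 and 4 consistent is to log every test call in the same shape and aggregate per vendor. The sketch below is a hypothetical scorecard, not a standard format; the field names, the vendor labels, and the sample results are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallResult:
    vendor: str
    scenario: str                 # e.g. "pause", "backchannel", "noise", "transfer"
    false_interruptions: int      # times the agent talked over the caller
    dead_air_events: int          # silences longer than your own threshold
    workflow_completed: bool      # did the call reach the intended outcome
    transfer_context_ok: Optional[bool] = None  # None if no handoff was tested


def scorecard(results):
    """Aggregate per-call results into one summary row per vendor."""
    by_vendor = defaultdict(list)
    for result in results:
        by_vendor[result.vendor].append(result)

    summary = {}
    for vendor, calls in by_vendor.items():
        n = len(calls)
        transfers = [c for c in calls if c.transfer_context_ok is not None]
        summary[vendor] = {
            "calls": n,
            "completion_rate": sum(c.workflow_completed for c in calls) / n,
            "false_interruptions_per_call": sum(c.false_interruptions for c in calls) / n,
            "dead_air_per_call": sum(c.dead_air_events for c in calls) / n,
            # None when no transfer scenario was run for this vendor.
            "transfer_context_rate": (
                sum(c.transfer_context_ok for c in transfers) / len(transfers)
                if transfers else None
            ),
        }
    return summary


# Hypothetical results from the same script run against two vendors.
results = [
    CallResult("Vendor A", "pause", 0, 1, True),
    CallResult("Vendor A", "transfer", 1, 0, True, True),
    CallResult("Vendor B", "pause", 2, 0, False),
    CallResult("Vendor B", "transfer", 0, 0, True, False),
]

for vendor, row in scorecard(results).items():
    print(vendor, row)
```

Running the same scripted scenarios against each vendor keeps the rows comparable; the completion rate reflects workflow quality rather than how natural a single demo sounded.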

Buyer FAQs

What did LiveKit release?

LiveKit released Turn Detector v1.0, a voice-agent end-of-turn model that listens directly to speech and combines semantic and acoustic cues to decide when the user has finished speaking.

Why does end-of-turn detection matter?

It controls whether the agent talks over the caller, waits through natural pauses, or leaves awkward silence. Bad turn-taking can make an otherwise strong voice AI unusable.

Should buyers accept the claim that turn detection is solved?

Buyers should verify it in their own workflow. Run pause, interruption, noise, multilingual, and transfer tests before treating any benchmark or vendor claim as production proof.

What should be in a voice agent interruption test packet?

Include scripts for pauses, backchannels, caller interruptions, noisy backgrounds, accented or multilingual callers, repeated details, tool calls, and human handoff timing.

Sources

  • LiveKit announcement: Primary source for Turn Detector v1.0, release date, model design, default availability, v1-mini, and benchmark framing.
  • LiveKit turns documentation: Technical documentation for turn detection, interruption handling, endpointing delay, VAD, realtime models, and configuration options.
  • LiveKit turn detector model page: Open model context for LiveKit's turn detector family, supported languages, benchmarks, usage, limitations, and deployment requirements.