Measuring and Optimizing Latency in AI Voice Calls

MYLINEHUB Team • 2026-02-09 • 12 min

A practical latency guide: where delay comes from in duplex AI voice, what to measure, and optimizations that preserve natural conversation.

Measuring and Optimizing Latency in AI Voice Calls (Asterisk + VoiceBridge)

In AI voice systems, latency is everything. A delay of 200–300ms feels natural. A delay of 800ms feels robotic. Beyond 1.2 seconds, conversations collapse into awkward turn-taking.

When connecting Asterisk / FreePBX to AI through MYLINEHUB VoiceBridge, latency is not a single number — it is the sum of multiple micro-delays across:

  • RTP packetization and jitter buffering
  • ARI event handling
  • Audio encoding/decoding
  • Network travel time
  • AI STT processing
  • LLM inference
  • TTS synthesis
  • RTP re-injection timing

This article explains how to measure latency correctly, where it originates, and how VoiceBridge architecture minimizes it.

Architecture reference: MYLINEHUB VoiceBridge Architecture

Open-source project: mylinehub-voicebridge (GitHub)

Understanding End-to-End Voice Latency

End-to-end AI voice latency can be visualized as:

Caller Speech → Asterisk RTP → Jitter Buffer → VoiceBridge (RTP → PCM) → STT → LLM → TTS → (PCM → RTP) → Bot Response

Total latency = network + buffering + AI inference + audio regeneration.
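To make the sum concrete, here is a minimal sketch that adds up a per-stage budget. The stage names follow the formula above; the millisecond values are illustrative placeholders, not measurements from VoiceBridge, and the class name LatencyBudget is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sums per-stage delays into one end-to-end figure.
final class LatencyBudget {

    /** Returns the total of all stage delays, in milliseconds. */
    static long totalMs(Map<String, Long> stagesMs) {
        long total = 0;
        for (long ms : stagesMs.values()) {
            total += ms;
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only; replace with your own measurements.
        Map<String, Long> stages = new LinkedHashMap<>();
        stages.put("network", 30L);
        stages.put("buffering", 60L);
        stages.put("AI inference", 350L);
        stages.put("audio regeneration", 80L);

        stages.forEach((name, ms) -> System.out.printf("%-20s %4d ms%n", name, ms));
        System.out.printf("%-20s %4d ms%n", "total", totalMs(stages));
    }
}
```

Keeping the budget as named stages rather than a single number makes it easier to see which stage to attack first.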

Latency Sources in Detail

1. RTP Frame Duration

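Telephony RTP typically uses 20ms frames (G.711), which at 8 kHz is 160 samples per packet. A frame cannot be sent until it has been fully captured, so larger frames add at least their own duration to latency.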

Packet handling in: rtp/RtpPacketizer.java
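The relationship between frame duration, payload size, and packet rate is simple arithmetic. The sketch below works it out for G.711 at 8 kHz, where each sample is one byte. RtpFrameMath is a hypothetical name and is not taken from rtp/RtpPacketizer.java.

```java
// Hypothetical helper showing G.711 frame arithmetic at an 8 kHz sample rate.
final class RtpFrameMath {
    static final int SAMPLE_RATE_HZ = 8000;

    /** Number of audio samples carried by one frame of the given duration. */
    static int samplesPerFrame(int frameMs) {
        return SAMPLE_RATE_HZ * frameMs / 1000;
    }

    /** G.711 uses one byte per sample, so payload bytes equal samples. */
    static int g711PayloadBytes(int frameMs) {
        return samplesPerFrame(frameMs);
    }

    /** Packets sent per second at the given frame duration. */
    static int packetsPerSecond(int frameMs) {
        return 1000 / frameMs;
    }

    public static void main(String[] args) {
        for (int frameMs : new int[] {10, 20, 40}) {
            System.out.println(frameMs + " ms frame: "
                    + samplesPerFrame(frameMs) + " samples, "
                    + g711PayloadBytes(frameMs) + " payload bytes, "
                    + packetsPerSecond(frameMs) + " packets/s");
        }
    }
}
```

Shorter frames lower packetization delay but raise the packet rate, which is why 20ms is the usual compromise.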

2. Jitter Buffer Delay

The Asterisk jitter buffer holds packets briefly to smooth out uneven arrival times, which can add 40–120ms depending on configuration.
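One way to judge how much buffering a link actually needs is to measure interarrival jitter. The sketch below uses the running estimator described in RFC 3550 (each new deviation is folded in with a gain of 1/16). It is an illustration only: JitterEstimator is a hypothetical class, not part of Asterisk or VoiceBridge, and it assumes arrival times have already been converted to RTP clock units.

```java
// Hypothetical sketch of the RFC 3550 interarrival jitter estimator.
final class JitterEstimator {
    private double jitter;       // running estimate, in RTP timestamp units
    private long previousTransit;
    private boolean hasPrevious;

    /**
     * @param arrivalTs arrival time of the packet, expressed in RTP clock units
     * @param rtpTs     timestamp field taken from the packet's RTP header
     */
    void onPacket(long arrivalTs, long rtpTs) {
        long transit = arrivalTs - rtpTs;
        if (hasPrevious) {
            long deviation = Math.abs(transit - previousTransit);
            jitter += (deviation - jitter) / 16.0;
        }
        previousTransit = transit;
        hasPrevious = true;
    }

    /** Current estimate converted to milliseconds for the given RTP clock rate. */
    double jitterMs(int clockRateHz) {
        return jitter * 1000.0 / clockRateHz;
    }

    public static void main(String[] args) {
        JitterEstimator estimator = new JitterEstimator();
        estimator.onPacket(0, 0);
        estimator.onPacket(160, 160);
        estimator.onPacket(400, 320);   // this packet arrived 80 units (10 ms) late
        System.out.println("jitter ms: " + estimator.jitterMs(8000));
    }
}
```

If the measured jitter stays well below the configured buffer size, the buffer is likely adding more delay than the link requires.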

3. STT Processing Delay

Streaming STT reduces delay versus batch transcription. VoiceBridge is designed for streaming audio input.

4. LLM Inference Time

Response time depends on time to first token and on token generation speed. Smaller models usually improve both, so prefer them for ultra-low-latency use cases.

5. TTS Generation

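Streaming TTS lets playback begin as soon as the first audio chunk is synthesized instead of after the whole utterance is rendered, which significantly reduces playback wait time.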

How VoiceBridge Minimizes Latency

Symmetric RTP Endpoint

Implemented in: rtp/RtpSymmetricEndpoint.java

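Sends RTP from the same IP address and port it receives on, so return audio follows the path NAT has already opened. This avoids the stretch of one-way or missing audio that occurs when media is sent to an address the far end cannot reach.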
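The general idea of symmetric RTP (RFC 4961) can be sketched with a single UDP socket that learns its peer from inbound traffic and replies from the same port. This is an illustration of the technique, not the contents of rtp/RtpSymmetricEndpoint.java; the class SymmetricEndpointSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketAddress;
import java.util.Arrays;

// Hypothetical sketch: one socket for both directions, peer learned from inbound packets.
final class SymmetricEndpointSketch implements AutoCloseable {
    private final DatagramSocket socket;
    private volatile SocketAddress peer;   // last observed source address and port

    SymmetricEndpointSketch(int localPort, int receiveTimeoutMs) throws IOException {
        socket = new DatagramSocket(localPort);   // port 0 picks a free port
        socket.setSoTimeout(receiveTimeoutMs);
    }

    int localPort() {
        return socket.getLocalPort();
    }

    SocketAddress peer() {
        return peer;
    }

    /** Receives one datagram and remembers where it came from. */
    byte[] receive() throws IOException {
        byte[] buffer = new byte[1500];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        peer = packet.getSocketAddress();
        return Arrays.copyOf(buffer, packet.getLength());
    }

    /** Sends to the learned peer from the receiving port; false if no peer is known yet. */
    boolean send(byte[] data) throws IOException {
        SocketAddress target = peer;
        if (target == null) {
            return false;
        }
        socket.send(new DatagramPacket(data, data.length, target));
        return true;
    }

    @Override
    public void close() {
        socket.close();
    }

    public static void main(String[] args) throws IOException {
        try (SymmetricEndpointSketch endpoint = new SymmetricEndpointSketch(0, 2000);
             DatagramSocket caller = new DatagramSocket(0)) {
            byte[] hello = {1, 2, 3};
            caller.send(new DatagramPacket(hello, hello.length,
                    InetAddress.getLoopbackAddress(), endpoint.localPort()));
            endpoint.receive();                 // learns the caller's address
            endpoint.send(new byte[] {9});      // reply leaves from the same port
            System.out.println("replying to " + endpoint.peer());
        }
    }
}
```

Because the reply targets the observed source address rather than an address advertised in signaling, it reaches a caller behind NAT without extra configuration.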

Efficient RTP Packetizer

rtp/RtpPacketizer.java ensures:

  • Monotonic timestamps
  • Consistent 20ms pacing
  • Minimal buffering
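The three properties above can be sketched as follows. This is a minimal illustration that assumes G.711 at 8 kHz with a standard 12-byte RTP header; PacketizerSketch and its methods are hypothetical names and do not reproduce rtp/RtpPacketizer.java. Pacing uses absolute deadlines so that a late wake-up on one frame does not push every later frame back.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: RTP header construction plus drift-free 20 ms pacing.
final class PacketizerSketch {
    static final int SAMPLES_PER_FRAME = 160;      // 20 ms at 8 kHz
    static final long FRAME_NANOS = 20_000_000L;   // 20 ms

    private final int ssrc;
    private final int payloadType;
    private int sequence;     // 16-bit, wraps at 65535
    private long timestamp;   // 32-bit, advances by samples per frame

    PacketizerSketch(int ssrc, int payloadType, int firstSequence, long firstTimestamp) {
        this.ssrc = ssrc;
        this.payloadType = payloadType;
        this.sequence = firstSequence & 0xFFFF;
        this.timestamp = firstTimestamp & 0xFFFFFFFFL;
    }

    /** Wraps one frame in a 12-byte RTP header, then advances sequence and timestamp. */
    byte[] nextPacket(byte[] payload) {
        ByteBuffer buffer = ByteBuffer.allocate(12 + payload.length); // big-endian by default
        buffer.put((byte) 0x80);                   // version 2, no padding, no extension, no CSRC
        buffer.put((byte) (payloadType & 0x7F));   // marker bit clear
        buffer.putShort((short) sequence);
        buffer.putInt((int) timestamp);
        buffer.putInt(ssrc);
        buffer.put(payload);
        sequence = (sequence + 1) & 0xFFFF;
        timestamp = (timestamp + SAMPLES_PER_FRAME) & 0xFFFFFFFFL;
        return buffer.array();
    }

    /** Absolute send time for frame n, measured from the stream start. */
    static long deadlineNanos(long startNanos, long frameIndex) {
        return startNanos + frameIndex * FRAME_NANOS;
    }

    public static void main(String[] args) {
        PacketizerSketch packetizer = new PacketizerSketch(0x1234ABCD, 0, 0, 0L);
        long start = System.nanoTime();
        for (long n = 0; n < 5; n++) {
            long wait = deadlineNanos(start, n) - System.nanoTime();
            if (wait > 0) {
                LockSupport.parkNanos(wait);       // may wake early; acceptable for a sketch
            }
            byte[] packet = packetizer.nextPacket(new byte[SAMPLES_PER_FRAME]); // placeholder payload
            System.out.println("frame " + n + ": " + packet.length + " bytes");
        }
    }
}
```

Sleeping a fixed 20ms after each send would accumulate scheduling error over a long call; computing each deadline from the start time keeps the average rate exact.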

Direct ARI Event Handling

ARI control layer: ari/impl/AriBridgeImpl.java

Reduces bridge creation delay and media negotiation overhead.

Measuring Latency Correctly

Method 1 — Waveform Echo Test

Play a click tone into the call, record the audio, and measure the time from the click to the start of the AI response in the waveform.
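If the call is recorded as 16-bit PCM, the two onsets can be located with a simple amplitude threshold. The sketch below assumes a single mixed recording containing both directions and a quiet gap between click and response; OnsetDetector, the threshold, and the guard window are illustrative assumptions, and the recording in main is synthetic.

```java
// Hypothetical sketch: threshold-based onset detection on 16-bit PCM samples.
final class OnsetDetector {

    /** Index of the first sample at or after 'from' whose magnitude reaches 'threshold', or -1. */
    static int firstOnset(short[] pcm, int from, int threshold) {
        for (int i = Math.max(from, 0); i < pcm.length; i++) {
            if (Math.abs(pcm[i]) >= threshold) {
                return i;
            }
        }
        return -1;
    }

    /** Converts a sample count to milliseconds at the given sample rate. */
    static double samplesToMs(int samples, int sampleRateHz) {
        return samples * 1000.0 / sampleRateHz;
    }

    public static void main(String[] args) {
        int sampleRateHz = 8000;
        short[] recording = new short[8000];          // one second of synthetic audio
        for (int i = 800; i < 808; i++) {
            recording[i] = 20000;                     // click at 100 ms
        }
        for (int i = 4800; i < 6400; i++) {
            recording[i] = 8000;                      // response begins at 600 ms
        }

        int guardSamples = 400;                       // skip 50 ms past the click itself
        int click = firstOnset(recording, 0, 5000);
        int response = firstOnset(recording, click + guardSamples, 5000);
        System.out.println("delay ms: " + samplesToMs(response - click, sampleRateHz));
    }
}
```

On real recordings, background noise may require a higher threshold or a short energy window in place of a single-sample test.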

Method 2 — RTP Timestamp Comparison

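Use Wireshark to:

  • Capture inbound RTP (caller to VoiceBridge)
  • Capture outbound RTP (VoiceBridge to caller)
  • Compare the capture times of the last inbound speech packet and the first outbound response packet

Compare capture times rather than RTP header timestamps: each stream counts its timestamps from its own starting offset, so values from the two streams are not directly comparable.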

Method 3 — Application Logging

Add timing logs around:

  • STT request start and final transcript received
  • LLM request start, first token, and completion
  • TTS request start, first audio chunk, and generation complete
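A small timer that records each stage as an offset from the start of the turn keeps these logs comparable across calls. The sketch below is one possible shape; StageTimer and the stage names are hypothetical, and the clock is injectable so the class can be tested without real delays.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch: records named stages as offsets from the start of a turn.
final class StageTimer {
    private final LongSupplier nanoClock;
    private final long startNanos;
    private final Map<String, Long> offsetsNanos = new LinkedHashMap<>();

    StageTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    StageTimer() {
        this(System::nanoTime);   // monotonic clock, unaffected by wall-clock changes
    }

    /** Records the time elapsed since construction under the given stage name. */
    void mark(String stage) {
        offsetsNanos.put(stage, nanoClock.getAsLong() - startNanos);
    }

    /** Elapsed milliseconds at which the stage was marked, or -1 if it never was. */
    long elapsedMs(String stage) {
        Long nanos = offsetsNanos.get(stage);
        return nanos == null ? -1 : nanos / 1_000_000;
    }

    /** One log line listing every stage in the order it was marked. */
    String summary() {
        StringBuilder line = new StringBuilder();
        offsetsNanos.forEach((stage, nanos) ->
                line.append(stage).append('=').append(nanos / 1_000_000).append("ms "));
        return line.toString().trim();
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        timer.mark("stt_request_start");
        timer.mark("llm_complete");
        timer.mark("tts_generation_complete");
        System.out.println(timer.summary());
    }
}
```

Emitting one summary line per turn, rather than a log line per stage, keeps logging overhead low on the media path.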

Latency Targets for Natural Conversation

Total Delay    | Conversation Quality
< 300ms        | Feels real-time
300–600ms      | Acceptable
600–1000ms     | Noticeable delay
> 1000ms       | Breaks natural flow
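These bands can be turned into a small classifier for dashboards or alerts. The sketch below is illustrative: LatencyBands is a hypothetical name, and because the table's ranges share their boundary values, it assumes exactly 600ms counts as "Acceptable" and exactly 1000ms as "Noticeable delay".

```java
// Hypothetical sketch: maps a measured total delay onto the quality bands above.
final class LatencyBands {

    /** Returns the band label for a total delay in milliseconds. */
    static String classify(long totalMs) {
        if (totalMs < 300) {
            return "Feels real-time";
        }
        if (totalMs <= 600) {      // assumption: the shared 600 ms boundary falls in this band
            return "Acceptable";
        }
        if (totalMs <= 1000) {     // assumption: the shared 1000 ms boundary falls in this band
            return "Noticeable delay";
        }
        return "Breaks natural flow";
    }

    public static void main(String[] args) {
        for (long ms : new long[] {250, 450, 800, 1300}) {
            System.out.println(ms + " ms: " + classify(ms));
        }
    }
}
```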

Optimization Checklist

  • Use 20ms RTP frames
  • Keep VoiceBridge close to Asterisk (same LAN if possible)
  • Use streaming STT + streaming TTS
  • Minimize model size for latency-critical use cases
  • Disable unnecessary logging in production
  • Avoid unnecessary transcoding

Cloud vs On-Prem Latency

Hosting VoiceBridge on-prem:

  • Reduces RTP travel time
  • Improves jitter stability

Cloud deployment:

  • Adds network round-trip
  • Requires careful region selection

Conclusion

AI voice quality is determined less by “model intelligence” and more by media engineering discipline.

VoiceBridge was built with RTP correctness and duplex timing control as first-class design principles — not afterthoughts.
