How VoiceBridge Achieves True Full-Duplex Audio in Production

MYLINEHUB Team • 2026-02-19 • 13 min

What it takes to deliver true full-duplex AI voice: dual RTP legs, timing discipline, jitter handling, barge-in control, and safe Asterisk integration.

True full-duplex AI voice means both sides of a conversation can speak at the same time — with real-time interruption detection, stable RTP timing, and no blocking media primitives.

Most Asterisk-based implementations fail at this because they use AGI or file-based playback. VoiceBridge achieves production-grade duplex using ARI + ExternalMedia + disciplined RTP engineering.

Source Repository:
https://github.com/mylinehub/omnichannel-crm/tree/main/mylinehub-voicebridge

Production Definition of Full-Duplex

In production telecom systems, full-duplex requires:

  • Simultaneous inbound and outbound RTP streams
  • No blocking playback operations
  • Immediate barge-in detection (<100ms)
  • Stable RTP cadence (20ms frame pacing)
  • Correct SSRC and sequence discipline
  • Bridge-level orchestration

VoiceBridge implements all of the above at the RTP layer, not just the application layer.

Core Architecture: ARI + Mixing Bridge + ExternalMedia

VoiceBridge does not rely on AGI. Instead it uses ARI to create and manage bridges dynamically.


Caller Channel
      │
      ▼
Mixing Bridge (ARI-controlled)
      │
      ├── ExternalMedia Channel (RTP out → AI)
      └── Caller Audio (RTP in)
  

ARI control is implemented in:

  • AriBridgeImpl.java
    src/main/java/com/mylinehub/voicebridge/ari/impl/AriBridgeImpl.java
  • ExternalMediaManagerImpl.java
    src/main/java/com/mylinehub/voicebridge/ari/impl/ExternalMediaManagerImpl.java

These components dynamically:

  • Create mixing bridges
  • Attach caller channel
  • Attach ExternalMedia RTP endpoint
  • Manage lifecycle events
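
The setup steps above map onto three ARI REST calls. The following is a minimal sketch, not the code in AriBridgeImpl.java or ExternalMediaManagerImpl.java: the class name AriRequestSketch, the base URL, and the app name are assumptions made for illustration. The endpoint paths and query parameters (type, bridgeId, app, external_host, format, channel) follow the Asterisk ARI REST interface. Each URI is meant to be sent as an authenticated HTTP POST.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the ARI URIs for the bridge setup steps. */
final class AriRequestSketch {
    private final String base; // assumed form: "http://host:8088/ari"

    AriRequestSketch(String base) {
        this.base = base;
    }

    // Form-style encoding; adequate for simple ids without spaces.
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** POST target: create a mixing bridge with a caller-chosen id. */
    URI createMixingBridge(String bridgeId) {
        return URI.create(base + "/bridges?type=mixing&bridgeId=" + enc(bridgeId));
    }

    /** POST target: create an ExternalMedia channel that exchanges RTP with hostPort. */
    URI createExternalMedia(String app, String hostPort, String format) {
        return URI.create(base + "/channels/externalMedia?app=" + enc(app)
                + "&external_host=" + enc(hostPort)
                + "&format=" + enc(format));
    }

    /** POST target: attach a channel (caller or ExternalMedia) to the bridge. */
    URI addChannel(String bridgeId, String channelId) {
        return URI.create(base + "/bridges/" + enc(bridgeId)
                + "/addChannel?channel=" + enc(channelId));
    }
}
```

In this sketch the caller channel and the ExternalMedia channel are both attached with addChannel, so neither leg depends on a playback command.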

RTP Discipline: The Foundation of Duplex Stability

Full-duplex is impossible without strict RTP engineering. VoiceBridge handles RTP generation internally instead of using file playback.

Core RTP components:

  • RtpPacketizer.java
    src/main/java/com/mylinehub/voicebridge/rtp/RtpPacketizer.java
  • RtpSymmetricEndpoint.java
    src/main/java/com/mylinehub/voicebridge/rtp/RtpSymmetricEndpoint.java
  • RtpPortAllocator.java
    src/main/java/com/mylinehub/voicebridge/rtp/RtpPortAllocator.java

1. RtpPacketizer.java

This class constructs outbound RTP packets manually.

Responsibilities include:

  • Maintaining sequence numbers
  • Monotonic timestamp increment (160 per 20ms for 8kHz PCM)
  • Payload encoding
  • Consistent SSRC per session

This ensures Asterisk treats AI-generated audio as a valid, continuous stream.
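
To make these responsibilities concrete, here is a minimal sketch of manual RTP packetization. It is not the contents of RtpPacketizer.java: the class name RtpPacketizerSketch and its constructor parameters are hypothetical. It assumes 20ms frames at an 8kHz clock (160 timestamp units per frame) and the fixed 12-byte RTP header from RFC 3550 with no CSRC entries or extensions.

```java
import java.nio.ByteBuffer;

/** Hypothetical packetizer: one RTP packet per 20 ms frame at 8 kHz. */
final class RtpPacketizerSketch {
    private static final int HEADER_BYTES = 12;
    private static final int SAMPLES_PER_FRAME = 160; // 8000 Hz * 0.020 s

    private final int payloadType; // e.g. 0 for PCMU (assumption)
    private final int ssrc;        // constant for the whole session
    private int sequence;          // 16-bit, wraps at 65535
    private long timestamp;        // 32-bit, wraps at 2^32 - 1

    RtpPacketizerSketch(int payloadType, int ssrc, int initialSeq, long initialTs) {
        this.payloadType = payloadType & 0x7F;
        this.ssrc = ssrc;
        this.sequence = initialSeq & 0xFFFF;
        this.timestamp = initialTs & 0xFFFFFFFFL;
    }

    /** Wraps one encoded frame in an RTP header and advances the counters. */
    byte[] next(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + payload.length); // big-endian
        buf.put((byte) 0x80);            // version 2, no padding, no extension, CC=0
        buf.put((byte) payloadType);     // marker bit 0
        buf.putShort((short) sequence);
        buf.putInt((int) timestamp);
        buf.putInt(ssrc);
        buf.put(payload);

        sequence = (sequence + 1) & 0xFFFF;
        timestamp = (timestamp + SAMPLES_PER_FRAME) & 0xFFFFFFFFL;
        return buf.array();
    }
}
```

The masks keep sequence numbers and timestamps inside their 16-bit and 32-bit fields, so a long call rolls over cleanly instead of producing a discontinuity.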

2. RtpSymmetricEndpoint.java

Handles symmetric RTP behavior.

  • Learns remote IP/port dynamically
  • Maintains send/receive state
  • Prevents one-way audio caused by NAT

This is critical in real-world deployments behind firewalls.
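
The core idea of symmetric RTP is to send to the address packets actually arrive from, rather than the address that was signaled. The sketch below illustrates that idea only. The class name SymmetricRtpPeerSketch and its methods are hypothetical and do not describe RtpSymmetricEndpoint.java.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical symmetric-RTP peer tracker. */
final class SymmetricRtpPeerSketch {
    private final InetSocketAddress signaled; // address from signaling, used until media arrives
    private final AtomicReference<InetSocketAddress> learned = new AtomicReference<>();

    SymmetricRtpPeerSketch(InetSocketAddress signaled) {
        this.signaled = signaled;
    }

    /** Call for every inbound packet with its observed source address. */
    void onPacketReceived(InetSocketAddress source) {
        learned.set(source);
    }

    /** Where outbound RTP should go right now. */
    InetSocketAddress sendTarget() {
        InetSocketAddress observed = learned.get();
        return observed != null ? observed : signaled;
    }
}
```

This version follows the most recent source address. A stricter variant could latch onto the first source or check the SSRC before accepting a change, which trades NAT rebinding tolerance for resistance to spoofed packets.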

3. RtpPortAllocator.java

Production systems must avoid RTP port collisions.

This component:

  • Allocates even RTP ports
  • Ensures thread-safe reservation
  • Prevents reuse conflicts under load
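
One way to meet these three requirements is a synchronized bitmap over the even ports in a configured range. This is a sketch under that assumption, not the implementation in RtpPortAllocator.java. The class name EvenPortAllocatorSketch and the range parameters are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical allocator that hands out even ports from [minPort, maxPort]. */
final class EvenPortAllocatorSketch {
    private final int base;   // first even port in the range
    private final int slots;  // number of even ports available
    private final BitSet inUse;
    private int cursor;       // rotates so a freed port is not reused immediately

    EvenPortAllocatorSketch(int minPort, int maxPort) {
        this.base = (minPort % 2 == 0) ? minPort : minPort + 1;
        if (maxPort < base) {
            throw new IllegalArgumentException("range contains no even port");
        }
        this.slots = (maxPort - base) / 2 + 1;
        this.inUse = new BitSet(slots);
    }

    /** Reserves the next free even port; throws if the range is exhausted. */
    synchronized int allocate() {
        int slot = inUse.nextClearBit(cursor);
        if (slot >= slots) {
            slot = inUse.nextClearBit(0); // wrap around once
        }
        if (slot >= slots) {
            throw new IllegalStateException("RTP port range exhausted");
        }
        inUse.set(slot);
        cursor = (slot + 1) % slots;
        return base + 2 * slot;
    }

    /** Returns a port to the pool when its session ends. */
    synchronized void release(int port) {
        int offset = port - base;
        if (offset < 0 || offset % 2 != 0 || offset / 2 >= slots) {
            throw new IllegalArgumentException("port not managed: " + port);
        }
        inUse.clear(offset / 2);
    }
}
```

The reservation only guards against collisions inside one process. Binding the socket remains the final check, since another process may already hold the port.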

Why Mixing Bridges Enable True Duplex

A mixing bridge allows multiple media streams to exist simultaneously.

  • Caller audio flows into bridge
  • ExternalMedia audio flows into bridge
  • Asterisk mixes both streams

Because no playback command is blocking, inbound audio continues even while outbound AI speech is transmitted.

How Barge-In Works in VoiceBridge

Since caller RTP is continuously streamed to the AI pipeline:

  • Speech detection runs in parallel
  • Interruption is detected within the <100ms barge-in budget
  • Outbound RTP stream can be halted at the next 20ms frame boundary

There is no need to wait for file playback completion.
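
A simple way to picture the barge-in path is an energy detector on caller frames that clears a flag the outbound sender checks before each packet. The sketch below assumes 16-bit linear PCM frames and an RMS threshold. The class name BargeInSketch, the threshold, and the frame count are illustrative assumptions, and the source does not state which speech detection method VoiceBridge uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical energy-based barge-in gate. */
final class BargeInSketch {
    private final double rmsThreshold; // level treated as speech (assumed tuning value)
    private final int framesRequired;  // consecutive loud frames before interrupting
    private int consecutive;           // touched only by the inbound RTP thread
    private final AtomicBoolean playbackAllowed = new AtomicBoolean(true);

    BargeInSketch(double rmsThreshold, int framesRequired) {
        this.rmsThreshold = rmsThreshold;
        this.framesRequired = framesRequired;
    }

    /** Feed one decoded caller frame (20 ms of 16-bit PCM). */
    void onCallerFrame(short[] pcm) {
        if (pcm.length == 0) {
            return;
        }
        double sumSquares = 0;
        for (short sample : pcm) {
            sumSquares += (double) sample * sample;
        }
        double rms = Math.sqrt(sumSquares / pcm.length);
        consecutive = (rms >= rmsThreshold) ? consecutive + 1 : 0;
        if (consecutive >= framesRequired) {
            playbackAllowed.set(false); // sender stops before its next frame
        }
    }

    /** Checked by the outbound sender before each packet. */
    boolean mayContinuePlayback() {
        return playbackAllowed.get();
    }

    /** Re-arm once the AI has a new response to speak. */
    void resumePlayback() {
        consecutive = 0;
        playbackAllowed.set(true);
    }
}
```

With framesRequired set to 3, the detector needs 60ms of sustained speech, which leaves room inside a 100ms budget for one more outbound frame to drain.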

Production Timing Guarantees

VoiceBridge enforces:

  • 20ms frame pacing
  • Clock drift compensation
  • Continuous timestamp increments
  • Payload consistency

If these rules are violated, the caller hears jitter artifacts, silence, or dropouts. The internal RTP engine prevents these conditions.
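
Drift usually creeps in when a sender sleeps a fixed 20ms after each packet, because send time and scheduler delay accumulate. A common remedy is to pace against absolute deadlines derived from a single start time. The following is a sketch of that technique, with the hypothetical class name FramePacerSketch. It is not taken from the VoiceBridge source.

```java
import java.util.concurrent.locks.LockSupport;

/** Hypothetical pacer: frame N is due at start + N * 20 ms, regardless of earlier lateness. */
final class FramePacerSketch {
    static final long FRAME_NANOS = 20_000_000L;

    private final long startNanos; // a System.nanoTime() reading taken at stream start
    private long frameIndex;

    FramePacerSketch(long startNanos) {
        this.startNanos = startNanos;
    }

    /** Absolute deadline for a given frame, on the System.nanoTime() clock. */
    static long deadlineFor(long startNanos, long frameIndex) {
        return startNanos + frameIndex * FRAME_NANOS;
    }

    /** Blocks until the next frame is due, then advances to the following frame. */
    void awaitNextFrame() {
        long deadline = deadlineFor(startNanos, frameIndex);
        frameIndex++;
        long remaining;
        // parkNanos may return early, so re-check against the deadline.
        while ((remaining = deadline - System.nanoTime()) > 0) {
            LockSupport.parkNanos(remaining);
        }
    }
}
```

A sender loop would call awaitNextFrame() and then transmit one packet. If one iteration runs late, the next deadline is unchanged, so the long-run rate stays at 50 packets per second.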

Why This Works in Production (Not Just Lab)

Many demo systems appear duplex in a controlled LAN. They fail under:

  • NAT environments
  • High call concurrency
  • Clock drift conditions
  • Packet jitter

VoiceBridge handles:

  • Symmetric RTP learning
  • Port allocation scaling
  • Bridge lifecycle cleanup
  • Session-level SSRC isolation

Comparison: AGI vs VoiceBridge Duplex

Capability            AGI       VoiceBridge
Simultaneous RTP      No        Yes
Barge-In              Delayed   Instant
Frame-Level Control   No        Yes
Bridge Orchestration  No        Full ARI Control

Final Summary

VoiceBridge achieves real full-duplex audio not by clever scripting, but by respecting how RTP, Asterisk bridges, and media streams actually work.

It combines:

  • ARI event-driven control
  • Mixing bridges
  • ExternalMedia RTP streaming
  • Custom RTP packetization
  • Symmetric endpoint handling
  • Production-safe port allocation

That is why it works in production — not just in demos.
