VoiceBridge vs AGI-Based Voice Bots — Turn-Based vs Real-Time

MYLINEHUB Team • 2026-02-11 • 11 min

Why AGI bots feel like IVR and why VoiceBridge enables real conversational duplex voice, including barge-in and streaming responses.

Many Asterisk voice bots start with AGI (Asterisk Gateway Interface) because it’s simple: execute script → play prompt → record audio → send to STT → generate response → play audio → repeat.

That model works for IVR-style automation. It does not produce real-time, interruption-capable, conversational duplex AI.

This article explains the architectural difference between:

  • AGI-based, turn-based voice bots
  • VoiceBridge (ARI + ExternalMedia + RTP duplex engine)

VoiceBridge repository:
https://github.com/mylinehub/omnichannel-crm/tree/main/mylinehub-voicebridge

The Core Architectural Difference

AGI Model (Turn-Based)

AGI works as a blocking script execution model inside the dialplan.

  1. Dialplan calls AGI script
  2. Script plays audio file
  3. Script records caller input
  4. Script sends audio to STT
  5. Script generates reply
  6. Script plays next audio file

The call alternates between two states: bot speaking and caller speaking.

This is fundamentally half-duplex.
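The six steps above can be sketched as a loop over the AGI text protocol, where each command is written to Asterisk and the script then blocks on the reply line. This is a minimal sketch, not code from any real bot: the prompt names, the /tmp paths, the 10 second timeout, and the canned replies in main are all assumptions, and the STT, reply generation, and TTS stages are left as a comment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class AgiTurnLoop {

    /** Writes one AGI command, then blocks until Asterisk answers with a status line. */
    static String send(BufferedReader in, PrintWriter out, String command) {
        try {
            out.print(command + "\n");
            out.flush();
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Asterisk opens the session with "agi_*: value" lines ended by an empty line. */
    static void skipEnvironment(BufferedReader in) {
        try {
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // environment variables are ignored in this sketch
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs strictly alternating turns: play a prompt, then record the caller. */
    static List<String> runTurns(BufferedReader in, PrintWriter out, int turns) {
        skipEnvironment(in);
        List<String> replies = new ArrayList<>();
        String prompt = "welcome"; // hypothetical sound file name
        for (int turn = 0; turn < turns; turn++) {
            // Bot speaks: nothing else happens until playback returns.
            replies.add(send(in, out, "STREAM FILE " + prompt + " \"\""));
            // Caller speaks: returns on '#', 10 s timeout, or 2 s of silence.
            replies.add(send(in, out, "RECORD FILE /tmp/turn" + turn + " wav \"#\" 10000 s=2"));
            // STT, reply generation, and TTS would run here (omitted).
            prompt = "reply" + turn; // hypothetical name of the synthesized reply
        }
        return replies;
    }

    public static void main(String[] args) {
        // Canned Asterisk replies so the sketch runs without a PBX.
        String canned = "agi_request: bot.agi\n\n200 result=0 endpos=8000\n"
                + "200 result=0 (timeout) endpos=16000\n";
        StringWriter sent = new StringWriter();
        runTurns(new BufferedReader(new StringReader(canned)), new PrintWriter(sent), 1);
        System.out.print(sent);
    }
}
```

A real script would read from standard input and write to standard output instead of the canned strings. The point of the sketch is the shape of the loop: there is no moment at which the script is both playing and listening.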

VoiceBridge Model (Real-Time Duplex)

VoiceBridge does not use AGI for media. It uses:

  • ARI for event-driven call control
  • ExternalMedia for RTP capture and injection
  • A dedicated RTP engine for timing correctness

The caller and bot can speak simultaneously. Audio flows continuously in both directions.

Why AGI Is Structurally Turn-Based

AGI executes synchronously inside the dialplan.

  • Playback() blocks until audio finishes
  • Record() blocks until silence or timeout
  • No continuous RTP streaming

During playback:

  • Caller interruption is delayed
  • Barge-in requires hacky DTMF or polling tricks
  • Speech overlap is not naturally supported

Even FastAGI, which runs the same script over a network socket instead of as a local process, does not change the blocking media model.

VoiceBridge Duplex Media Architecture

VoiceBridge constructs a real media graph using ARI:

  • ari/impl/AriBridgeImpl.java — builds mixing bridges
  • ari/impl/ExternalMediaManagerImpl.java — creates RTP channels
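As a rough illustration of what building such a media graph involves, the sketch below lists the ARI REST calls for a mixing bridge that joins the caller channel and an ExternalMedia channel. It only builds the request lines and sends nothing. The base URL, application name, channel and bridge ids, RTP host and port, and the ulaw format are assumptions for illustration, and the class is not taken from the VoiceBridge repository.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class AriMediaGraph {

    /** Joins parameters into a URL-encoded query string, keeping insertion order. */
    static String query(Map<String, String> params) {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joined.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    /** Returns the ordered ARI calls: create bridge, create media channel, join both. */
    static List<String> plan(String ariBase, String app, String bridgeId,
                             String callerChannelId, String mediaChannelId, String rtpHostPort) {
        Map<String, String> bridge = new LinkedHashMap<>();
        bridge.put("type", "mixing");
        bridge.put("bridgeId", bridgeId);

        Map<String, String> media = new LinkedHashMap<>();
        media.put("app", app);
        media.put("channelId", mediaChannelId);
        media.put("external_host", rtpHostPort); // where Asterisk sends caller RTP
        media.put("format", "ulaw");             // assumed codec

        List<String> steps = new ArrayList<>();
        steps.add("POST " + ariBase + "/bridges?" + query(bridge));
        steps.add("POST " + ariBase + "/channels/externalMedia?" + query(media));
        for (String channel : List.of(callerChannelId, mediaChannelId)) {
            steps.add("POST " + ariBase + "/bridges/" + bridgeId + "/addChannel?"
                    + query(Map.of("channel", channel)));
        }
        return steps;
    }

    public static void main(String[] args) {
        plan("http://localhost:8088/ari", "voicebridge", "bridge-1",
                "caller-1", "media-1", "127.0.0.1:40000").forEach(System.out::println);
    }
}
```

Once both channels sit in the bridge, caller audio arrives at the external host as RTP, and any RTP sent back to Asterisk on that channel is played to the caller.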

RTP capture and injection are handled in:

  • rtp/RtpPacketizer.java
  • rtp/RtpSymmetricEndpoint.java
  • rtp/RtpPortAllocator.java

Audio streams continuously in both directions.

Latency Comparison

AGI

  • Playback must finish before recording
  • STT only starts after recording stops
  • Full round-trip delay per turn

This creates noticeable conversational lag.
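To make the per-turn delay concrete, the sketch below adds up the stages that must finish one after another before the caller hears a reply. The stage names and every number in main are hypothetical values chosen for illustration, not measurements from any system.

```java
final class TurnLatency {

    /**
     * In a turn-based flow each stage starts only after the previous one ends,
     * so the silence heard by the caller is the plain sum of all stages.
     */
    static int turnBasedDelayMs(int silenceTimeoutMs, int sttMs, int replyMs, int ttsMs) {
        return silenceTimeoutMs + sttMs + replyMs + ttsMs;
    }

    public static void main(String[] args) {
        // Hypothetical stage durations in milliseconds.
        int delay = turnBasedDelayMs(2000, 800, 1200, 600);
        System.out.println("Caller waits " + delay + " ms after finishing the sentence");
    }
}
```

Note that the silence timeout used to detect the end of recording is itself part of the delay, which is why shortening prompts or speeding up STT alone does not remove the lag.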

VoiceBridge

  • Caller audio streamed continuously
  • AI processes in parallel
  • TTS streamed back while caller still active
  • Immediate truncation on interruption

Barge-in control implemented in: ai/impl/OpenAiRealtimeTruncateManager.java
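One way truncation can be kept consistent with what the caller actually heard is to count the audio frames already sent and report that position to the AI service. The sketch below assumes 20 ms frames and builds an OpenAI Realtime `conversation.item.truncate` event by hand. Whether the repository's truncate manager works this way is an assumption, and the item id is a placeholder that is assumed to need no JSON escaping.

```java
final class TruncatePosition {

    static final int FRAME_MS = 20; // assumed RTP frame duration

    /** Milliseconds of bot audio that actually left the RTP sender. */
    static int playedMs(int framesSent) {
        return framesSent * FRAME_MS;
    }

    /** Builds the truncate event that marks where playback was cut off. */
    static String truncateEvent(String itemId, int framesSent) {
        return "{\"type\":\"conversation.item.truncate\""
                + ",\"item_id\":\"" + itemId + "\""
                + ",\"content_index\":0"
                + ",\"audio_end_ms\":" + playedMs(framesSent) + "}";
    }

    public static void main(String[] args) {
        // 75 frames of 20 ms were sent before the caller interrupted.
        System.out.println(truncateEvent("item_123", 75));
    }
}
```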

Barge-In Capability

AGI

  • No natural barge-in
  • Requires DTMF detection tricks
  • Playback usually must finish

VoiceBridge

  • Speech detection runs continuously
  • Outbound RTP stream can be stopped mid-frame
  • Bot audio truncation is deterministic

Controlled by: ai/TruncateManager.java
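The following sketch shows one simple way continuous speech detection and mid-stream truncation can fit together: inbound caller frames are checked against an energy threshold, and once enough consecutive frames are loud, all queued bot audio is discarded. The RMS threshold, the frame count, and the class itself are assumptions for illustration. The source does not say how VoiceBridge detects speech, and a production system may rely on a proper voice activity detector instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BargeInGate {

    private final Deque<byte[]> outbound = new ArrayDeque<>();
    private final double rmsThreshold;
    private final int framesToTrigger;
    private int loudFrames;

    BargeInGate(double rmsThreshold, int framesToTrigger) {
        this.rmsThreshold = rmsThreshold;
        this.framesToTrigger = framesToTrigger;
    }

    /** Root-mean-square energy of one frame of 16-bit PCM samples. */
    static double rms(short[] pcm) {
        if (pcm.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (short s : pcm) {
            sum += (double) s * s;
        }
        return Math.sqrt(sum / pcm.length);
    }

    synchronized void enqueueBotFrame(byte[] frame) {
        outbound.addLast(frame);
    }

    /** Called by the RTP sender on every tick; null means nothing to play. */
    synchronized byte[] nextFrameToSend() {
        return outbound.pollFirst();
    }

    synchronized int pending() {
        return outbound.size();
    }

    /** Called for every inbound caller frame; returns true if bot audio was cut. */
    synchronized boolean onCallerFrame(short[] pcm) {
        loudFrames = rms(pcm) >= rmsThreshold ? loudFrames + 1 : 0;
        if (loudFrames >= framesToTrigger && !outbound.isEmpty()) {
            outbound.clear(); // drop everything the caller has not heard yet
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BargeInGate gate = new BargeInGate(1000.0, 3);
        for (int i = 0; i < 5; i++) {
            gate.enqueueBotFrame(new byte[160]);
        }
        short[] loud = new short[160];
        java.util.Arrays.fill(loud, (short) 4000);
        for (int i = 0; i < 3; i++) {
            System.out.println("truncated=" + gate.onCallerFrame(loud) + " pending=" + gate.pending());
        }
    }
}
```

Requiring several consecutive loud frames is a basic guard against clicks and line noise triggering a false interruption.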

Media Quality and Timing

AGI

  • Relies on file playback
  • No control over RTP pacing
  • Limited real-time adjustments

VoiceBridge

  • Strict RTP timestamp control
  • Sequence integrity maintained
  • SSRC stability enforced

Implemented in: rtp/RtpPacketizer.java
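The three properties listed above map directly onto fields of the 12-byte RTP header. The sketch below is an independent, minimal packetizer and not the repository's RtpPacketizer: it assumes G.711 u-law (payload type 0) at 8 kHz with 20 ms frames, so the timestamp advances by 160 samples per packet, the sequence number advances by one, and the SSRC never changes for the life of the stream.

```java
import java.nio.ByteBuffer;

final class SimpleRtpPacketizer {

    private static final int HEADER_BYTES = 12;
    private static final int PAYLOAD_TYPE_PCMU = 0;  // G.711 u-law
    private static final int SAMPLES_PER_FRAME = 160; // 20 ms at 8 kHz

    private final int ssrc;
    private int sequence;   // 16-bit, wraps at 65535
    private long timestamp; // 32-bit, counted in samples

    SimpleRtpPacketizer(int ssrc, int initialSequence, long initialTimestamp) {
        this.ssrc = ssrc;
        this.sequence = initialSequence & 0xFFFF;
        this.timestamp = initialTimestamp & 0xFFFFFFFFL;
    }

    /** Wraps one audio frame in an RTP header and advances the counters. */
    byte[] packetize(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + payload.length); // big-endian
        buf.put((byte) 0x80);               // version 2, no padding, no extension, no CSRC
        buf.put((byte) PAYLOAD_TYPE_PCMU);  // marker bit clear
        buf.putShort((short) sequence);
        buf.putInt((int) timestamp);
        buf.putInt(ssrc);
        buf.put(payload);
        sequence = (sequence + 1) & 0xFFFF;
        timestamp = (timestamp + SAMPLES_PER_FRAME) & 0xFFFFFFFFL;
        return buf.array();
    }

    static int sequenceOf(byte[] packet) {
        return ByteBuffer.wrap(packet).getShort(2) & 0xFFFF;
    }

    static long timestampOf(byte[] packet) {
        return ByteBuffer.wrap(packet).getInt(4) & 0xFFFFFFFFL;
    }

    static int ssrcOf(byte[] packet) {
        return ByteBuffer.wrap(packet).getInt(8);
    }

    public static void main(String[] args) {
        SimpleRtpPacketizer p = new SimpleRtpPacketizer(0x1234ABCD, 0, 0);
        for (int i = 0; i < 3; i++) {
            byte[] packet = p.packetize(new byte[SAMPLES_PER_FRAME]);
            System.out.println("seq=" + sequenceOf(packet) + " ts=" + timestampOf(packet));
        }
    }
}
```

File playback through AGI never exposes these fields, which is what the AGI list above means by having no control over RTP pacing.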

Scalability Differences

AGI Scaling Characteristics

  • Each call blocks script execution
  • Heavy process/thread usage under load
  • Limited observability into media layer

VoiceBridge Scaling Characteristics

  • Per-call session object: session/CallSession.java
  • Deterministic RTP port allocation
  • Containerized deployment supported: docker/Dockerfile, docker-compose.yml
  • Kubernetes-ready scaling (horizontal replicas)
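"Deterministic" port allocation can be read as: given the same set of ports in use, the allocator always hands out the same next port. The sketch below is one possible interpretation, not the repository's RtpPortAllocator. It assumes a configured port range, uses only even ports (following the common convention of leaving the odd neighbour for RTCP), and always returns the lowest free one.

```java
import java.util.BitSet;

final class EvenPortAllocator {

    private final int firstPort; // always even
    private final int slots;     // number of even ports available
    private final BitSet used;

    EvenPortAllocator(int rangeStart, int rangeEnd) {
        this.firstPort = (rangeStart % 2 == 0) ? rangeStart : rangeStart + 1;
        this.slots = Math.max(0, (rangeEnd - firstPort + 1) / 2);
        this.used = new BitSet(slots);
    }

    /** Returns the lowest free even port, or throws when the range is exhausted. */
    synchronized int allocate() {
        int slot = used.nextClearBit(0);
        if (slot >= slots) {
            throw new IllegalStateException("RTP port range exhausted");
        }
        used.set(slot);
        return firstPort + 2 * slot;
    }

    /** Frees a port on hangup; ports outside the range or odd ports are ignored. */
    synchronized void release(int port) {
        int offset = port - firstPort;
        if (offset >= 0 && offset % 2 == 0 && offset / 2 < slots) {
            used.clear(offset / 2);
        }
    }

    public static void main(String[] args) {
        EvenPortAllocator ports = new EvenPortAllocator(40000, 40005);
        int first = ports.allocate();
        int second = ports.allocate();
        ports.release(first);
        System.out.println(first + " " + second + " " + ports.allocate());
    }
}
```

With horizontal replicas, each instance would need its own non-overlapping range or its own network namespace, which is a deployment decision outside this sketch.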

Operational Stability

AGI Risks

  • Script crash ends call abruptly
  • Blocking I/O delays entire call flow
  • Difficult RTP-level debugging

VoiceBridge Stability

  • Explicit session lifecycle management
  • Graceful cleanup on hangup
  • Separation of control plane (ARI) and media plane (RTP)
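Explicit lifecycle management and graceful cleanup usually come down to two rules: cleanup runs exactly once even if hangup is signalled several times, and one failing step does not prevent the others from running. The sketch below shows that pattern in isolation. The class and the step names in main are hypothetical and are not taken from session/CallSession.java.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

final class SessionLifecycle implements AutoCloseable {

    private final Deque<Runnable> cleanup = new ArrayDeque<>();
    private final AtomicBoolean closed = new AtomicBoolean(false);

    /** Registers a cleanup step; steps run in reverse order of registration. */
    synchronized void onClose(Runnable step) {
        cleanup.push(step);
    }

    private synchronized Runnable nextStep() {
        return cleanup.poll();
    }

    /** Safe to call from several hangup paths; only the first call does work. */
    @Override
    public void close() {
        if (!closed.compareAndSet(false, true)) {
            return;
        }
        Runnable step;
        while ((step = nextStep()) != null) {
            try {
                step.run();
            } catch (RuntimeException e) {
                // A failed step must not leak the remaining resources.
                System.err.println("cleanup step failed: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        SessionLifecycle session = new SessionLifecycle();
        session.onClose(() -> System.out.println("release RTP port"));
        session.onClose(() -> System.out.println("hang up media channel"));
        session.onClose(() -> System.out.println("destroy bridge"));
        session.close();
        session.close(); // second call is a no-op
    }
}
```

Running the steps in reverse order tears down the most recently created resource first, so a bridge is destroyed before the port its media channel depended on is returned to the pool.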

When AGI Is Still Appropriate

  • Menu-based IVR systems
  • DTMF-driven automation
  • Simple turn-based bots

If your goal is structured prompts and short responses, AGI is perfectly fine.

When VoiceBridge Is Required

  • Natural conversational AI
  • Interruption support (barge-in)
  • Simultaneous speak/listen
  • Low-latency real-time dialog
  • Hundreds of concurrent AI calls

Final Comparison Summary

Feature       | AGI Bot         | VoiceBridge
Duplex Audio  | No (Turn-Based) | Yes (Real-Time)
Barge-In      | Limited         | Immediate
RTP Control   | None            | Full Control
Scalability   | Script-bound    | Service-Oriented
Latency Model | Per-Turn        | Continuous

AGI is ideal for traditional IVR logic. VoiceBridge is built for modern conversational AI where real-time duplex behavior is mandatory.

Repo:
https://github.com/mylinehub/omnichannel-crm/tree/main/mylinehub-voicebridge
