What Is SDP in VoIP? Meaning, Structure, and Why It Matters
Understand what SDP is in VoIP, how it works with SIP and RTP, what information it carries, and why it matters for codec negotiation, media routing, and troubleshooting one-way audio.
Many people first learn SIP, RTP, WebRTC, or Asterisk — and then suddenly start seeing SDP inside SIP traces, browser console logs, PBX debug output, SBC captures, or Wireshark. That is usually the exact moment VoIP starts feeling confusing. 😅
This article fixes that confusion properly.
We go from absolute beginner → practical engineer → advanced troubleshooting mindset, while keeping the explanation friendly and visual.
In this guide, you will understand:
• what SDP is and what it is not
• how SDP compares with SIP, RTP, RTCP, HTTP, and WebRTC
• where SDP appears in a real call flow
• how offer/answer negotiation actually works
• what each SDP line means in plain English
• how codecs, ports, IP addresses, hold state, media directions, and encryption are described
• why issues like one-way audio, no audio, codec mismatch, broken ICE, and re-INVITE surprises often lead back to SDP
🚀 First, the simplest explanation possible
SDP stands for Session Description Protocol.
But the full form alone does not help much.
The practical meaning is this:
SDP is a text-based description of how media should be exchanged in a communication session.
In a call, SDP answers questions like:
• what kind of media is being offered: audio, video, application data
• which codecs are supported: Opus, PCMU, PCMA, G.722, H.264, VP8, and so on
• which IP address and port should receive media
• whether the side can send, receive, both, or neither
• what transport/security profile is expected
• which WebRTC details are needed: ICE, DTLS fingerprint, BUNDLE, RTCP mux, candidates
🧠 One mental model that makes SDP easy
Think of a call like this:
| Thing | Easy meaning | What it actually does |
|---|---|---|
| SIP | Starts the conversation | Creates, changes, or ends the session |
| SDP | Explains how media should work | Lists codecs, ports, IPs, directions, media/security details |
| RTP | Carries the real voice/video | Delivers media packets continuously |
| RTCP | Reports media quality | Shares stats like jitter, packet loss, timing, reports |
The shortest memory trick is: SIP says “let’s talk”, SDP says “here’s how”, RTP says “here is the media”, and RTCP says “here is how well it is going”.
🏗️ Which layer does SDP belong to?
This confuses many beginners because SDP talks about IP addresses, ports, codecs, and media transport, so it looks “low-level”. But SDP itself is not a transport-layer protocol like UDP, and it is not a network-layer protocol like IP.
The clean answer is:
SDP is best understood as an application-layer session description format or payload.
In practical terms:
• IP moves packets between hosts
• UDP/TCP/TLS transport application bytes between endpoints
• SIP / HTTP / WebSocket may carry signaling or session-related data
• SDP is the structured media description used by those systems to describe the session
🔗 Which protocols and systems use SDP?
SDP does not usually work alone. It is commonly carried inside another signaling flow.
| System / Protocol | How SDP is used | Typical place you see it |
|---|---|---|
| SIP | Classic VoIP use case; SDP appears in INVITE, 183, 200 OK, UPDATE, re-INVITE | SIP traces, PBX logs, SBC captures |
| WebRTC | Offer/answer exchanged over app signaling like HTTP, WebSocket, or custom APIs | Browser console, JS app logs, WebRTC debug tools |
| RTSP / streaming systems | Used to describe media streams and playback/session details | Streaming servers and media endpoints |
| SBCs / gateways | Often modify, normalize, or relay SDP across network boundaries | Carrier interconnects, NAT handling, transcoding paths |
Important practical point: SDP is often embedded in another message body. That is why engineers usually say “SIP INVITE containing SDP” instead of “SDP packet” in the strict sense.
📦 Is SDP a packet?
Strictly speaking, SDP is not a packetized media protocol like RTP.
RTP has packet headers, timestamps, sequence numbers, payload types, SSRC values, and actual media payloads. SDP is different. SDP is a structured text document.
In SIP, SDP usually appears as the body after headers like:
```
Content-Type: application/sdp
Content-Length: 245
```
And then the body contains SDP lines like this:
```
v=0
o=- 3747 3747 IN IP4 192.168.1.10
s=VoIP Call
c=IN IP4 192.168.1.10
t=0 0
m=audio 49170 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=sendrecv
```
🔍 Read a real SDP body line by line
```
v=0
o=- 3747 3747 IN IP4 192.168.1.10
s=VoIP Call
c=IN IP4 192.168.1.10
t=0 0
m=audio 49170 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=sendrecv
```
| Line | Meaning | Why it matters |
|---|---|---|
| v=0 | SDP version | Always 0; it is the only version defined |
| o=- 3747 3747 IN IP4 192.168.1.10 | Origin line | Identifies creator, session id, session version, network type, address type, and address |
| s=VoIP Call | Session name | Often not critical in voice calls, but required by the grammar |
| c=IN IP4 192.168.1.10 | Connection line | Tells the remote side which IP address is associated with media |
| t=0 0 | Timing line | 0 0 means the session has no scheduled start or stop time |
| m=audio 49170 RTP/AVP 0 8 101 | Media description | Defines media type, receiving port, transport profile, and payload types |
| a=rtpmap:0 PCMU/8000 | Payload type mapping | Maps PT 0 to G.711 μ-law at 8 kHz |
| a=rtpmap:8 PCMA/8000 | Payload type mapping | Maps PT 8 to G.711 A-law at 8 kHz |
| a=rtpmap:101 telephone-event/8000 | DTMF event mapping | Used for RTP-based keypad tones |
| a=fmtp:101 0-16 | Format parameters | Says payload 101 supports telephone events 0 to 16 (the DTMF digits, *, #, A-D, and flash) |
| a=sendrecv | Direction attribute | This endpoint is willing to both send and receive media |
🧾 The most important SDP fields you will keep seeing
When engineers troubleshoot calls, these are the lines they keep coming back to:
| Prefix | Role | Example | Why engineers care |
|---|---|---|---|
| v= | Version | v=0 | Basic SDP grammar field |
| o= | Origin | o=- 123 456 IN IP4 10.0.0.5 | Session/version changes can matter during re-INVITEs |
| s= | Session name | s=Call | Usually not a major troubleshooting field |
| c= | Connection address | c=IN IP4 203.0.113.10 | Wrong IP here can cause one-way audio or no audio |
| t= | Time | t=0 0 | Normally simple in VoIP calls |
| m= | Media line | m=audio 49170 RTP/AVP 0 8 101 | Defines media type, port, transport profile, payload list |
| a=rtpmap | Codec mapping | a=rtpmap:111 opus/48000/2 | Maps payload number to actual codec |
| a=fmtp | Codec parameters | a=fmtp:111 minptime=10;useinbandfec=1 | Important for codec compatibility |
| a=sendrecv, sendonly, recvonly, inactive | Media direction | a=sendonly | Explains hold state or missing media direction |
| a=rtcp-mux | RTCP multiplexing | a=rtcp-mux | Common in WebRTC |
| a=ice-ufrag, a=ice-pwd, a=candidate | ICE details | a=candidate:... | Critical for NAT traversal in WebRTC |
| a=fingerprint | DTLS fingerprint | a=fingerprint:sha-256 ... | Needed for secure media setup in WebRTC/SRTP flows |
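Because a=rtpmap is the line that turns a bare payload number into a codec, it is often worth extracting on its own. The following is a rough sketch under simple assumptions: `codec_map` is a hypothetical helper, the regular expression covers only the common `name/rate[/channels]` shape, and static payload types that appear without an rtpmap line are not filled in.

```python
import re

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")


def codec_map(sdp):
    """Return {payload_type: (codec_name, clock_rate_hz, channels)}."""
    codecs = {}
    for line in sdp.splitlines():
        match = RTPMAP.match(line.strip())
        if match:
            pt, name, rate, channels = match.groups()
            # Channels default to 1 when the optional third field is absent.
            codecs[int(pt)] = (name, int(rate), int(channels or 1))
    return codecs


EXAMPLE = """\
m=audio 49170 RTP/AVP 111 0 101
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
"""

print(codec_map(EXAMPLE))
```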
🪜 Beginner → intermediate → advanced understanding
Beginner level
SDP is the text that explains how audio or video should happen in a call. It says which codecs can be used and where the media should be sent.
Intermediate level
SIP often carries SDP. SDP lists codecs, media types, directions, ports, and addresses. After both sides agree, RTP starts sending the actual voice based on that negotiated information.
Advanced level
SDP is an application-layer session description payload used inside offer/answer negotiation. It influences codec selection, media routing, hold/resume behavior, NAT traversal, SRTP/DTLS negotiation, ICE connectivity, multiplexing, and mid-call renegotiation in SIP and WebRTC systems.
Expert mindset
When media fails, do not stop at “the SIP call connected”. Inspect the actual negotiated SDP, compare offer and answer, validate codec intersection, confirm reachable candidate/path selection, verify media directions, and confirm that the final IP:port/security profile is really usable end-to-end.
🆚 SDP vs SIP vs RTP vs RTCP vs HTTP
| Protocol / Format | Primary role | Carries live voice? | Human analogy | Typical example |
|---|---|---|---|---|
| HTTP | Request/response web transport | No | Fetching a webpage or API response | GET /users, POST /login |
| SIP | Session signaling | No | Arranging the call | INVITE, 180 Ringing, 200 OK, BYE |
| SDP | Media description | No | The agreement sheet | Codecs, ports, IPs, directions, ICE |
| RTP | Live media transport | Yes | The actual sound/video stream | Voice frames every 20 ms |
| RTCP | Media control/reporting | No | Quality feedback and timing reports | Jitter, packet loss, sender reports |
🛠️ A SIP message carrying SDP
Here is a simplified SIP INVITE showing how SDP appears inside the message body:
```
INVITE sip:1002@pbx.example.com SIP/2.0
Via: SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK-12345
From: "Alice" <sip:1001@pbx.example.com>;tag=abc123
To: <sip:1002@pbx.example.com>
Call-ID: 9f8e7d6c@example.com
CSeq: 1 INVITE
Contact: <sip:1001@192.168.1.10:5060>
Content-Type: application/sdp
Content-Length: 245

v=0
o=- 3747 3747 IN IP4 192.168.1.10
s=VoIP Call
c=IN IP4 192.168.1.10
t=0 0
m=audio 49170 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=sendrecv
```
So when people say “SIP call setup”, they are often really looking at SIP + SDP together. SIP manages the session; SDP describes the media plan inside it.
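To make the "SIP carries SDP" relationship concrete, here is a small sketch that pulls the SDP body out of a raw SIP message. It is an illustration under stated assumptions: `extract_sdp` is a hypothetical helper, the sample message is trimmed to a few headers, and real-world details such as compact header names, folded headers, and multipart bodies are ignored.

```python
# A trimmed SIP INVITE; SIP separates headers from the body with an empty line.
SIP_MESSAGE = "\r\n".join([
    "INVITE sip:1002@pbx.example.com SIP/2.0",
    "Call-ID: 9f8e7d6c@example.com",
    "CSeq: 1 INVITE",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "c=IN IP4 192.168.1.10",
    "m=audio 49170 RTP/AVP 0 8 101",
    "",
])


def extract_sdp(sip_message):
    """Return the SDP body of a SIP message, or None if it carries no SDP."""
    head, sep, body = sip_message.partition("\r\n\r\n")
    if not sep:
        return None
    headers = {}
    # Skip the request/status line, then read "Name: value" headers.
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if content_type != "application/sdp":
        return None
    return body


print(extract_sdp(SIP_MESSAGE))
```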
🎙️ How codec negotiation really works
Codec negotiation is not magic. It is basically a matching problem.
Suppose the offer contains:
```
m=audio 49170 RTP/AVP 111 0 8
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
```
And the remote side only supports:
```
m=audio 52000 RTP/AVP 0
a=rtpmap:0 PCMU/8000
```
Then the negotiated result is usually PCMU, because that is the common codec both sides accept.
Key principle: the final codec is normally chosen from the intersection of supported codecs, not simply from the first codec listed by one side.
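If there is no common codec, the answering side normally rejects the offer (in SIP, typically with 488 Not Acceptable Here). Depending on the system, an SBC or gateway may instead fall back to transcoding, or signaling may complete while media never flows.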
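The matching problem described above can be sketched as a simple intersection. This is a minimal model, not how any particular PBX implements it: `negotiate` is a hypothetical function, codecs are matched by name and clock rate (dynamic payload numbers can differ between sides), and the result keeps the offerer's preference order.

```python
def negotiate(offered, supported):
    """Return the offered codecs the answerer also supports, in offer order.

    offered:   list of (payload_type, codec_name, clock_rate) from the offer
    supported: set of (lowercase_codec_name, clock_rate) the answerer can handle
    """
    return [
        (pt, name, rate)
        for pt, name, rate in offered
        if (name.lower(), rate) in supported
    ]


OFFER = [(111, "opus", 48000), (0, "PCMU", 8000), (8, "PCMA", 8000)]

# The answerer only supports PCMU, so that is the only common codec.
print(negotiate(OFFER, {("pcmu", 8000)}))   # [(0, 'PCMU', 8000)]

# No overlap: an empty result is the case where the offer gets rejected.
print(negotiate(OFFER, {("g722", 8000)}))   # []
```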
🔁 Media directions: sendrecv, sendonly, recvonly, inactive
One of the most important beginner-to-advanced concepts is media direction. The direction attributes decide whether a stream is active for sending, receiving, both, or neither.
| Attribute | Meaning | Common real-world usage |
|---|---|---|
| a=sendrecv | Can send and receive | Normal two-way call |
| a=sendonly | Will send but not receive | Hold scenarios or special media paths |
| a=recvonly | Will receive but not send | Hold scenarios, monitor/listener paths |
| a=inactive | Neither send nor receive | Paused or inactive stream |
This is why a call can “connect” but one side hears nothing: the negotiated direction may not actually allow two-way audio.
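One way to reason about directions is to treat each attribute as a pair of send/receive flags and mirror them for the answer: the answerer may send only if the offerer will receive, and may receive only if the offerer will send. The sketch below models that rule from the offer/answer model; `answer_direction` and `local_preference` are hypothetical names used for illustration.

```python
# Each direction as (will_send, will_receive) from that endpoint's point of view.
FLAGS = {
    "sendrecv": (True, True),
    "sendonly": (True, False),
    "recvonly": (False, True),
    "inactive": (False, False),
}
NAMES = {flags: name for name, flags in FLAGS.items()}


def answer_direction(offered, local_preference="sendrecv"):
    """Pick the answer's direction given the offer and what we would like to do."""
    offer_send, offer_recv = FLAGS[offered]
    want_send, want_recv = FLAGS[local_preference]
    # We can send only if the offerer receives, and receive only if it sends.
    can_send = want_send and offer_recv
    can_recv = want_recv and offer_send
    return NAMES[(can_send, can_recv)]


print(answer_direction("sendonly"))              # recvonly
print(answer_direction("recvonly"))              # sendonly
print(answer_direction("sendonly", "sendonly"))  # inactive
```

The last call shows how two sides that both insist on sending only end up with an inactive stream, which from the user's side looks like a connected call with silence.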
🌐 Why IP address and port details are such a big deal
Many “audio problems” are actually address/port problems.
The SDP might say:
```
c=IN IP4 192.168.1.10
m=audio 49170 RTP/AVP 0 8 101
```
That means: “send audio to 192.168.1.10 port 49170”.
But if the remote party is outside your LAN, and that private IP is not reachable from outside, the remote side may send RTP into a black hole. The SIP call itself may still look fine.
Translation: a perfectly valid SDP line can still be operationally wrong if the IP/port it advertises is not reachable from the other side.
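A quick sanity check for this class of problem is to read the advertised address and ask whether it is globally routable at all. The following is a rough sketch using Python's standard `ipaddress` module; `media_target` and `not_globally_routable` are hypothetical helpers, and the sketch assumes a single c= line and a single m= line. A non-global address is only a hint: a media proxy or SBC on the path may rewrite it.

```python
import ipaddress


def media_target(sdp):
    """Return (address, port) from the first c= and m= lines of an SDP body."""
    address = None
    port = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c=") and address is None:
            # c=<nettype> <addrtype> <connection-address>
            address = line.split()[2]
        elif line.startswith("m=") and port is None:
            # m=<media> <port> <proto> <fmt> ...
            port = int(line.split()[1])
    return address, port


def not_globally_routable(address):
    """True for private, loopback, link-local, and similar special-use addresses."""
    return not ipaddress.ip_address(address).is_global


SDP_SNIPPET = "c=IN IP4 192.168.1.10\nm=audio 49170 RTP/AVP 0 8 101\n"

address, port = media_target(SDP_SNIPPET)
if not_globally_routable(address):
    print(f"warning: {address}:{port} is not reachable from the public internet")
```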
🔐 How SDP becomes more advanced in WebRTC
In classic SIP telephony, SDP may look fairly small. In WebRTC, SDP becomes much richer because the browser needs far more detail to establish secure media across NATs and changing networks.
WebRTC SDP often contains:
• ICE usernames and passwords
• ICE candidates
• DTLS fingerprints
• setup roles like actpass
• rtcp-mux
• BUNDLE group and media IDs
• secure transport profiles like UDP/TLS/RTP/SAVPF
• richer codec and header-extension attributes
```
m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8
a=mid:0
a=sendrecv
a=rtcp-mux
a=ice-ufrag:8hY2
a=ice-pwd:Qm8sLk2jP0mYx2c1
a=fingerprint:sha-256 12:34:56:78:...
a=setup:actpass
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=candidate:1 1 UDP 2130706431 192.168.1.10 54400 typ host
```
So if classic SIP SDP feels like a media address card, WebRTC SDP feels more like a full connection workbook. 📘
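When a WebRTC leg fails, a quick first check is whether the SDP contains the ICE and DTLS attributes listed above at all. The sketch below is one hedged way to do that: `EXPECTED` and `missing_webrtc_attributes` are hypothetical names, the attribute list is taken from this article rather than from a complete specification check, and the fingerprint value is the same elided placeholder used in the example.

```python
# Attributes this article lists as typical for a WebRTC media section.
EXPECTED = ("ice-ufrag", "ice-pwd", "fingerprint", "setup")


def missing_webrtc_attributes(sdp):
    """Return the expected attribute names that do not appear in the SDP."""
    present = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a="):
            # a=<name> or a=<name>:<value>
            present.add(line[2:].split(":", 1)[0])
    return [name for name in EXPECTED if name not in present]


WEBRTC_SDP = """\
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=mid:0
a=rtcp-mux
a=ice-ufrag:8hY2
a=ice-pwd:Qm8sLk2jP0mYx2c1
a=fingerprint:sha-256 12:34:56:78:...
a=setup:actpass
"""

CLASSIC_SDP = """\
m=audio 49170 RTP/AVP 0 8 101
a=sendrecv
"""

print(missing_webrtc_attributes(WEBRTC_SDP))   # []
print(missing_webrtc_attributes(CLASSIC_SDP))  # ['ice-ufrag', 'ice-pwd', 'fingerprint', 'setup']
```

This kind of check is useful at SIP-to-WebRTC gateways, where a plain RTP/AVP offer forwarded to a browser will be rejected for lacking these attributes.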
🧊 SDP and NAT traversal: why WebRTC depends so much on ICE
In simple LAN voice systems, an IP and port may be enough. But on the public internet, endpoints are often behind NAT, firewalls, VPNs, home routers, or mobile carrier networks.
That is where ICE comes in. ICE allows a side to advertise multiple possible network candidates and then test which path actually works.
| ICE concept | Easy meaning | Why it matters |
|---|---|---|
| Host candidate | Local interface address | Works if directly reachable |
| Server reflexive candidate | Public-facing NAT-mapped address | Helps internet peers reach you |
| Relay candidate | TURN-relayed path | Used when direct connectivity fails |
This is why a WebRTC call may “signal fine” but still have no audio: the SDP exchange succeeded, but ICE never found a usable media path.
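The candidate types in the table also explain the large number in an a=candidate line. ICE recommends computing priority as 2^24 × type preference + 2^8 × local preference + (256 − component ID), with suggested type preferences of 126 for host, 100 for server reflexive, and 0 for relay. The sketch below parses the host candidate from the WebRTC example and recomputes its priority; `parse_candidate` and `candidate_priority` are hypothetical helpers, and a local preference of 65535 is an assumption that happens to match the example.

```python
# Recommended type preferences from the ICE specification.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, component_id, local_preference=65535):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)"""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


def parse_candidate(line):
    """Parse the fixed leading fields of an a=candidate line into a dict."""
    fields = line.strip().split(":", 1)[1].split()
    foundation, component, transport, priority, address, port, typ_kw, cand_type = fields[:8]
    if typ_kw != "typ":
        raise ValueError(f"unexpected candidate format: {line!r}")
    return {
        "foundation": foundation,
        "component": int(component),
        "transport": transport.upper(),
        "priority": int(priority),
        "address": address,
        "port": int(port),
        "type": cand_type,
    }


cand = parse_candidate("a=candidate:1 1 UDP 2130706431 192.168.1.10 54400 typ host")
print(cand["type"], cand["address"], cand["port"])
print(cand["priority"] == candidate_priority("host", cand["component"]))  # True
```

Host candidates get the highest priority, so ICE tries direct paths first and only settles on a relay when nothing better passes its connectivity checks.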
🧪 How engineers troubleshoot SDP in real systems
When a call fails, experienced engineers often go through this checklist:
1) Is the codec actually negotiated?
Do both sides have at least one common codec? Is the chosen codec compatible with policy, license, transcoding rules, and media engine capability?
2) Is the IP address correct and reachable?
Look at the c= line or candidate set.
NAT problems often show up as private, stale, or unusable addresses.
3) Is the advertised port reachable?
SDP may advertise the right logical port, but firewall policy, NAT pinholes, or media proxy issues may still block it.
4) Is the media direction wrong?
recvonly, sendonly, or inactive may explain one-way audio or hold behavior.
5) In WebRTC, did ICE actually connect?
A clean SDP exchange does not guarantee media success. The selected ICE candidate pair still has to work.
6) Did re-INVITE or UPDATE change media later?
Mid-call renegotiation can silently change codec, port, direction, or secure media setup after the call already started.
🔄 Re-INVITE, UPDATE, and mid-call changes
A common beginner mistake is thinking SDP only matters at the start of the call. In reality, SDP may change during the session too.
Mid-call SDP changes can be used for:
• hold and resume
• changing codecs or adding/removing streams
• switching media endpoints or proxies
• upgrading or modifying security state
• enabling video after audio already started
That is why “it worked for 20 seconds and then audio died” can still be an SDP problem: the initial negotiation was okay, but the later one was not.
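Comparing the SDP from the initial INVITE with the SDP from a later re-INVITE is often the fastest way to spot such a change. Here is a minimal sketch that summarizes the fields most likely to matter and reports the differences. The helper names are hypothetical, the sketch assumes one audio stream per SDP, and a missing direction attribute is treated as sendrecv, which is the default.

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def media_summary(sdp):
    """Extract address, port, payload types, and direction from a one-stream SDP."""
    summary = {"address": None, "port": None,
               "payload_types": [], "direction": "sendrecv"}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c="):
            summary["address"] = line.split()[-1]
        elif line.startswith("m="):
            parts = line.split()
            summary["port"] = int(parts[1])
            summary["payload_types"] = parts[3:]
        elif line.startswith("a=") and line[2:] in DIRECTIONS:
            summary["direction"] = line[2:]
    return summary


def changed_fields(before, after):
    """Return {field: (old, new)} for every summarized field that differs."""
    old, new = media_summary(before), media_summary(after)
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


INITIAL = "c=IN IP4 192.168.1.10\nm=audio 49170 RTP/AVP 0 8 101\na=sendrecv\n"
REINVITE = "c=IN IP4 192.168.1.10\nm=audio 49170 RTP/AVP 0 8 101\na=sendonly\n"

print(changed_fields(INITIAL, REINVITE))  # {'direction': ('sendrecv', 'sendonly')}
```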
🧩 What SDP commonly describes in real systems
In practical deployments, SDP commonly describes these things:
• 🎙️ Media type: audio / video / application
• 🎚️ Codec support: Opus, PCMU, PCMA, G.722, AMR, H.264, VP8, telephone-event, and more
• 🌐 Media destination: which IP address or candidates are associated with media
• 🚪 Port information: which UDP port is expected for RTP or SRTP
• 🔁 Direction: sendrecv / sendonly / recvonly / inactive
• 🔐 Security information: DTLS fingerprint, SRTP transport profile, setup role, secure media expectations
• 🧊 NAT traversal data: ICE credentials, candidates, bundle groups, mids, multiplexing data
• 📞 Telephone-event / DTMF: how keypad events are transmitted during the call
📚 A practical glossary of important SDP terms
| Term | Meaning | Practical note |
|---|---|---|
| Payload type | Numeric id used in RTP/SDP to refer to a codec or event format | Static or dynamic; often mapped using a=rtpmap |
| rtpmap | Maps payload type to codec name/rate/channels | Example: a=rtpmap:111 opus/48000/2 |
| fmtp | Extra codec parameters | Can affect interoperability even if codec names match |
| ICE | Interactive Connectivity Establishment | Tests candidate network paths, especially in WebRTC |
| Candidate | One possible reachable address/port path | Host, reflexive, or relay |
| Fingerprint | DTLS certificate fingerprint | Needed to validate secure peer identity for media |
| BUNDLE | Multiple media streams share one transport flow | Very common in WebRTC |
| mid | Media identification tag | Helps associate streams in bundled sessions |
🧪 Mini troubleshooting examples
Case 1: Call connects, but no audio
SIP is fine, but SDP advertised a private IP that the far end cannot reach. RTP is sent to an unreachable address.
Case 2: One-way audio
One side can receive media, but the return path is blocked or the reverse SDP answer points to the wrong address/port.
Case 3: Browser says connected, but remote hears nothing
ICE candidates may be wrong, incomplete, or never selected successfully, even though offer/answer exchange finished.
Case 4: Hold/resume behaves strangely
A re-INVITE may have changed media direction to sendonly, recvonly, or inactive.
❓ Quick FAQ
Is SDP the same as SIP?
No. SIP handles session signaling. SDP is the media description usually carried inside SIP or another signaling channel.
Does SDP carry the actual voice?
No. RTP or SRTP carries the real voice packets. SDP only describes how the media should be exchanged.
Why do I see SDP in browser logs?
Because WebRTC uses SDP heavily for offer/answer, codec negotiation, ICE, DTLS, and stream description.
Can a call connect even if SDP is wrong?
Yes. The signaling may complete while the media path still fails. That is why you can get ringing, answer, and call timers — but still have no audio.
📝 Beginner-to-expert summary
Here is the clean progression to remember:
Beginner 🔹
SDP is the text that says how media should happen in a call.
Intermediate 🔹
SIP often carries SDP. SDP lists codecs, IPs, ports, media types, and direction. RTP then sends the actual audio or video using that agreed information.
Advanced 🔹
SDP is an application-layer session description payload used in offer/answer negotiation. It affects codec selection, media routing, hold state, NAT traversal, secure media, and mid-call changes.
Expert 🔹
In real troubleshooting, you must validate the final negotiated SDP, not just the signaling success. The true question is:
what media path was actually negotiated, and was that path truly usable?
✅ Final takeaway
If SIP is the conversation about starting a call, and RTP is the real sound of the call, then SDP is the agreement sheet that tells both sides how that sound should be exchanged.
It is not “just a full form”. It is one of the most practical concepts in VoIP, SIP, WebRTC, Asterisk, FreePBX, SBCs, gateways, cloud calling, and AI voice systems.
Once you truly understand SDP, many telecom problems stop looking random. They become traceable: codec issue, IP issue, port issue, direction issue, ICE issue, DTLS issue, or negotiation issue. 🎯