Most "what is WebRTC" articles start with a metaphor about telephones and end with a perfect-world code snippet that won't run in production. This post tries to be more useful: enough concept to know what's happening, then real code with the parts that bite you in real deployments called out.
What WebRTC Actually Is
WebRTC is a set of browser APIs and underlying protocols that let two endpoints (usually browsers, sometimes mobile apps or native code) exchange audio, video, or arbitrary data directly. The "directly" is the interesting part. Before WebRTC, real-time media on the web meant Flash plugins, Java applets, or proprietary servers in the middle of every call. WebRTC moved the media path off your servers and into the peers themselves.
The W3C and IETF started standardising it around 2011. Google open-sourced its codecs and ICE implementation. By 2017 every major browser shipped it. Today, almost every video call you take in a browser runs on top of WebRTC: Google Meet, Discord on the web, modern customer-support widgets, cheap telehealth tools.
Three Things WebRTC Gives You
Strip away the marketing and there are three actual capabilities:
- getUserMedia(), programmatic access to the camera and microphone, with the user's permission.
- RTCPeerConnection, the pipe that moves media between two endpoints, including all the NAT-traversal and codec-negotiation machinery.
- RTCDataChannel, a bidirectional, low-latency channel for arbitrary data. Built on top of SCTP-over-DTLS-over-UDP. Useful for multiplayer games, file transfer, and anything else that wants P2P delivery without a server round trip.
That's the whole API surface for application developers. Everything else (the codec negotiation, the bandwidth estimation, the encryption) the browser handles for you.
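To make the data-channel bullet a little more concrete, here is a small sketch of how the same API can serve both a game and a file transfer. The `ordered` and `maxRetransmits` fields are standard `createDataChannel` options; the `channelOptions` helper and its `'realtime'` preset name are made up for illustration.

```javascript
// Hypothetical helper: only `ordered` and `maxRetransmits` are real
// RTCDataChannelInit fields, the preset names are illustrative.
function channelOptions(kind) {
  if (kind === 'realtime') {
    // Game-state style traffic: prefer dropping a late packet over
    // stalling the stream, so no ordering and no retransmissions.
    return { ordered: false, maxRetransmits: 0 };
  }
  // Default: reliable, ordered delivery, which is what file transfer wants.
  return { ordered: true };
}

// In a browser you would pass the result straight through:
//   const dc = pc.createDataChannel('state', channelOptions('realtime'));
```

Leaving the options out entirely gives you the reliable, ordered behaviour, which is why most chat-style demos never mention them.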
The One Thing WebRTC Doesn't Give You
Signaling. Two peers cannot find each other without help. WebRTC deliberately leaves the question of "how do I tell peer B about peer A's session description and ICE candidates?" up to you. WebSockets, HTTP polling, a signaling broker: any transport works. You write that part.
This is where most beginner tutorials lose people. They show you createOffer and setRemoteDescription but skim the "send the offer to the other peer" step, because that step is your problem.
For local experimentation you can copy-paste the SDP between two browser tabs by hand. For real apps you need a tiny WebSocket server.
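The core of that tiny server is just "forward this message to that peer". Here is a minimal sketch of the relay logic with the transport left out. The message shape (`type`, `to`, `payload`) and the `createSignalingRoom` name are assumptions for illustration; a "socket" here is anything with a `send(string)` method, which is what a WebSocket connection gives you.

```javascript
// Minimal in-memory signaling relay (a sketch, not a production server).
// Assumed message shape: { type: 'offer'|'answer'|'candidate', to, payload }.
function createSignalingRoom() {
  const peers = new Map(); // peerId -> socket-like object with send()

  return {
    join(peerId, socket) {
      peers.set(peerId, socket);
    },
    leave(peerId) {
      peers.delete(peerId);
    },
    // Forward a raw JSON message to its `to` peer.
    // Returns false when the message is malformed or undeliverable.
    relay(fromId, raw) {
      let msg;
      try {
        msg = JSON.parse(raw);
      } catch {
        return false; // not JSON, ignore
      }
      if (msg === null || typeof msg !== 'object') return false;
      if (!['offer', 'answer', 'candidate'].includes(msg.type)) return false;
      const target = peers.get(msg.to);
      if (!target) return false;
      // Stamp the sender on the server side so a peer cannot spoof `from`.
      target.send(JSON.stringify({ ...msg, from: fromId }));
      return true;
    },
  };
}
```

In a real deployment you would call `join` when a WebSocket connects, `relay` on every incoming message, and `leave` on close. The server never needs to understand the SDP it forwards.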
A Minimal Working Example
Here's a single-tab demo where two peer connections in the same page hand-shake with each other and open a data channel. No signaling server needed because both peers live in the same JavaScript context, but it exercises the full peer-connection lifecycle.
```javascript
// STUN + TURN configuration. The credentials are placeholders: substitute your own.
const turnConfig = {
  iceServers: [
    { urls: 'stun:stun.expressturn.com:3478' },
    {
      urls: [
        'turn:relay1.expressturn.com:3478?transport=udp',
        'turns:relay1.expressturn.com:443?transport=tcp'
      ],
      username: 'YOUR_TURN_USERNAME',
      credential: 'YOUR_TURN_PASSWORD'
    }
  ]
};

async function demo() {
  const a = new RTCPeerConnection(turnConfig);
  const b = new RTCPeerConnection(turnConfig);

  // Wire ICE candidates between the two (this is the job signaling does in a real app)
  a.onicecandidate = e => e.candidate && b.addIceCandidate(e.candidate);
  b.onicecandidate = e => e.candidate && a.addIceCandidate(e.candidate);

  // Data channel from A to B; it must exist before createOffer so the offer includes it
  const dc = a.createDataChannel('chat');
  dc.onopen = () => dc.send('hello from A');
  b.ondatachannel = e => {
    e.channel.onmessage = m => console.log('B received:', m.data);
  };

  // Offer/answer dance
  const offer = await a.createOffer();
  await a.setLocalDescription(offer);
  await b.setRemoteDescription(offer);
  const answer = await b.createAnswer();
  await b.setLocalDescription(answer);
  await a.setRemoteDescription(answer);
}

// Surface negotiation errors instead of leaving an unhandled rejection
demo().catch(console.error);
```
Open the console and you'll see "B received: hello from A". That's WebRTC in seventeen lines.
What Breaks in Production
The example above works in every modern browser. The same handshake between two real users on different networks, with a signaling server relaying the offer, answer, and candidates, fails roughly 20% of the time without a TURN server. That's the part the tutorials gloss over.
The reason: ICE (Interactive Connectivity Establishment) is the algorithm that figures out how to actually move packets between the two peers. It tries a list of candidate addresses in priority order:
- Host candidates (LAN addresses, when both peers are on the same network).
- Server-reflexive candidates (the public IP your NAT exposes, discovered via STUN).
- Relayed candidates (a TURN server forwarding packets between you).
Most of the time, host or server-reflexive candidates work. But for users behind symmetric NATs, carrier-grade NATs, hotel Wi-Fi, or strict corporate firewalls, only the relayed candidate gets through. Without TURN configured, those users see "Connection failed" and bounce.
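You can see which of the three candidate types a given network produces by reading the `typ` field of each candidate string. The helper below is a rough sketch; the function names are made up for illustration, and `prflx` (peer-reflexive, discovered during connectivity checks) is included because it shows up in the same field.

```javascript
// Extract the candidate type from an ICE candidate line, or null if absent.
function candidateType(candidateLine) {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Count how many candidates of each type were gathered.
function summarizeCandidates(lines) {
  const counts = { host: 0, srflx: 0, prflx: 0, relay: 0 };
  for (const line of lines) {
    const type = candidateType(line);
    if (type) counts[type] += 1;
  }
  return counts;
}

// In a browser, collect the lines as they are gathered:
//   const lines = [];
//   pc.onicecandidate = e => e.candidate && lines.push(e.candidate.candidate);
```

If `relay` is still zero once gathering finishes, your TURN server is unreachable or the credentials are wrong. To test the relay path on its own, `RTCPeerConnection` also accepts `iceTransportPolicy: 'relay'` in its configuration, which restricts ICE to relayed candidates.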
Codecs You Don't Have to Choose
The WebRTC specs make Opus mandatory for audio (alongside G.711), and browsers are required to implement both VP8 and H.264 for video. Which codec a given pair of peers actually uses is negotiated through SDP. You almost never need to think about codec selection: the browser picks one that both sides support, sets up the encoder, and keeps adjusting bitrate based on network conditions.
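If you're curious what the browser put on the table, the codecs are listed in the SDP as `a=rtpmap` lines under each `m=` section. This is a small sketch that pulls them out; the `listCodecs` name is made up, and real SDP carries many more attributes than it looks at.

```javascript
// List the codecs advertised in an SDP blob, grouped by media kind.
function listCodecs(sdp) {
  const codecs = [];
  let kind = null; // 'audio', 'video', ... taken from the current m= line
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('m=')) {
      kind = line.slice(2).split(' ')[0];
    } else if (kind && line.startsWith('a=rtpmap:')) {
      // Format: a=rtpmap:<payload type> <name>/<clock rate>[/<channels>]
      const match = /^a=rtpmap:(\d+) ([^/]+)\/(\d+)/.exec(line);
      if (match) {
        codecs.push({
          kind,
          payloadType: Number(match[1]),
          name: match[2],
          clockRate: Number(match[3]),
        });
      }
    }
  }
  return codecs;
}

// In a browser: listCodecs((await pc.createOffer()).sdp)
```

The list you get back will usually include entries such as `rtx` and `red` next to the actual codecs. Those are retransmission and redundancy formats, not extra codecs to choose from.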
The exception: simulcast. If you're building anything more elaborate than a one-on-one call, you'll want each publisher to send several layers (typically three) of the same video at different bitrates so the SFU, the selective forwarding unit that routes media between participants, can forward the right one to each viewer. That's a topic for another post.
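Without going into the details, here is roughly what asking for three layers looks like on the sending side. `rid`, `scaleResolutionDownBy`, and `maxBitrate` are standard encoding parameters; the `simulcastEncodings` helper and the bitrate split are arbitrary choices for illustration, not recommended values.

```javascript
// Build three simulcast layers: quarter, half, and full resolution.
// The 1/10 and 1/3 bitrate ratios are illustrative assumptions.
function simulcastEncodings(maxBitrateBps) {
  return [
    { rid: 'q', scaleResolutionDownBy: 4, maxBitrate: Math.round(maxBitrateBps / 10) },
    { rid: 'h', scaleResolutionDownBy: 2, maxBitrate: Math.round(maxBitrateBps / 3) },
    { rid: 'f', scaleResolutionDownBy: 1, maxBitrate: maxBitrateBps },
  ];
}

// In a browser, the layers are declared when the track is added:
//   pc.addTransceiver(videoTrack, {
//     direction: 'sendonly',
//     sendEncodings: simulcastEncodings(1500000),
//   });
```

Whether the layers are actually used depends on the SFU on the other end, so check its documentation before relying on any particular `rid` naming.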
Encryption Is Mandatory
Every WebRTC media stream is encrypted with DTLS-SRTP, and data channels run over DTLS. There's no "unencrypted" mode. The browsers won't let you turn it off, and a TURN server in the middle of the connection cannot decrypt the relayed packets: it sees opaque encrypted bytes and forwards them.
This matters for compliance reasons (telehealth, finance, anywhere PHI or PII flows) and for general user trust. In a peer-to-peer call the encryption runs end to end, and it has been mandatory from day one. One caveat: an SFU terminates the encryption on each hop, so it can see the media unless you add an end-to-end layer on top.
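One place the encryption is visible to you is the SDP: each peer advertises the fingerprint of the certificate it will use in the DTLS handshake as an `a=fingerprint` line, and the other side checks the handshake against it. A small sketch for reading those lines follows; the `dtlsFingerprints` name is made up for illustration.

```javascript
// Collect the DTLS certificate fingerprints advertised in an SDP blob.
function dtlsFingerprints(sdp) {
  const found = [];
  for (const line of sdp.split(/\r?\n/)) {
    // Format: a=fingerprint:<hash algorithm> <colon-separated hex bytes>
    const match = /^a=fingerprint:(\S+) (\S+)$/.exec(line.trim());
    if (match) {
      found.push({ algorithm: match[1].toLowerCase(), value: match[2] });
    }
  }
  return found;
}

// In a browser: dtlsFingerprints(pc.localDescription.sdp)
```

This is also why the signaling channel deserves TLS of its own: anyone who can rewrite the fingerprint in transit can put themselves in the middle of the call.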
Where to Go Next
If you want to actually ship something, your reading list:
- The MDN RTCPeerConnection docs are surprisingly good and include working examples.
- "WebRTC for the Curious" (free book, webrtcforthecurious.com), best deep dive on the protocol details.
- Pick an SFU if you need more than two peers per call: mediasoup, Janus, and LiveKit are widely used open-source options, and Pion is a Go WebRTC library you can build your own on.
- Pick a TURN provider so your "Connection failed" rate drops close to zero. ExpressTURN has 1 TB/month free if you want to start without commitment.
WebRTC isn't simple, but it's not magic either. The hard parts (NAT traversal, codec negotiation, congestion control) are mostly handled for you. Your job is to wire up the signaling, configure TURN, and stay out of the browser's way.