Best Video Calling APIs (2026)
Embedded Video Is Now Mainstream
Telehealth appointments, virtual classrooms, real-time collaboration, customer video support, live streaming — the use cases for embedded video calling have expanded dramatically. In 2026, adding video to your application doesn't require building WebRTC infrastructure from scratch or managing media servers. Purpose-built video APIs handle the hard parts: TURN/STUN servers, codec negotiation, bandwidth adaptation, recording, and global distribution.
Three platforms lead the developer video API market: Daily (the WebRTC infrastructure specialist with per-minute transparency), Agora (the global real-time communications network optimized for low latency in Asia-Pacific), and Whereby (the easiest to embed for telehealth and non-technical integrations). Twilio Video is notably absent — Twilio deprecated it in 2024, redirecting customers to alternatives.
TL;DR
Daily is the best default for most applications — the most transparent per-minute pricing, 10K free minutes/month, and the strongest developer documentation in the video API market. Agora is the right choice for applications needing ultra-low latency in Asia-Pacific, where Agora's network infrastructure has the highest density of edge nodes. Whereby is the easiest to embed for non-technical use cases — a single <iframe> or npm package deployment for telehealth and meeting rooms.
Key Takeaways
- Twilio Video was deprecated in 2024 — teams on Twilio Video must migrate; Daily and Agora are the most common migration targets.
- Daily offers 10,000 free participant minutes/month — after which video costs $0.004/participant-minute.
- Agora offers 10,000 free minutes/month with video at $0.009/minute and voice at $0.004/minute in North America — pricing varies by region.
- Whereby Embedded starts at $6.99/month — includes 2,000 participant minutes, then $0.004/minute — better for low-volume, simple deployments.
- Participant minutes scale fast — a 60-minute call with 4 participants = 240 participant minutes, burning through free tiers in days for active applications.
- Recording adds cost: Daily charges $0.003 per recorded minute for cloud recording, billed in addition to participant minutes.
- HIPAA compliance is available on Daily and Whereby — critical for telehealth applications.
Pricing Comparison
At 1,000 hours of video per month with an average of 2 participants per call (120,000 participant-minutes). Each estimate applies the listed rate to the minutes beyond the free allowance; the Whereby figure includes the $6.99 plan fee:
| Platform | Free Minutes | After Free | Monthly Cost (1K hours) |
|---|---|---|---|
| Daily | 10K | $0.004/participant-minute | ~$440 |
| Agora | 10K | $0.009/participant-minute (video) | ~$990 |
| Whereby | 2K (plan) | $0.004/participant-minute | ~$479 |
| Vonage Video | Limited | Custom | Custom |
Participant minutes explained: A 30-minute call with 3 participants = 90 participant minutes. At $0.004/minute, that call costs $0.36.
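The participant-minute arithmetic can be sketched in a few lines of Python. This is a minimal illustration using the rates quoted in this article; the function names are made up for the example and are not part of any vendor SDK.

```python
def participant_minutes(call_minutes: int, participants: int) -> int:
    """Every connected participant is billed for every minute of the call."""
    return call_minutes * participants


def usage_cost(total_minutes: int, free_minutes: int, rate: float, base_fee: float = 0.0) -> float:
    """Flat plan fee plus the per-minute rate on minutes beyond the free allowance."""
    billable = max(0, total_minutes - free_minutes)
    return round(base_fee + billable * rate, 2)


# The 30-minute, 3-participant call from the text, billed at $0.004/minute
small_call = participant_minutes(30, 3)
print(small_call, usage_cost(small_call, 0, 0.004))

# A 60-minute call with 4 participants, and how many such calls
# fit inside a 10,000-minute free tier
big_call = participant_minutes(60, 4)
print(big_call, 10_000 // big_call)
```

At roughly 41 one-hour, four-person calls per month, a 10K free tier covers only a little more than one such call per day.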
Daily
Best for: Most applications, transparent pricing, HIPAA-compliant telehealth, WebRTC infrastructure
Daily is a WebRTC infrastructure platform — it provides the underlying media servers, TURN servers, and SDKs, letting you build video experiences with full UI control. Daily's approach is to be as close to raw WebRTC as possible while abstracting away the infrastructure complexity.
Pricing
| Feature | Free | Paid |
|---|---|---|
| Participant minutes | 10K/month | $0.004/minute |
| Recording | — | $0.003/recorded-minute + storage |
| HIPAA BAA | — | Business plan |
| SLA | — | Custom |
| Phone PSTN | — | $0.025/minute |
API Integration
// Daily.js: embed a prebuilt call frame in a React component
import DailyIframe from "@daily-co/daily-js";
import { useEffect, useRef } from "react";
function VideoCall({ roomUrl }) {
  const wrapperRef = useRef(null);
  useEffect(() => {
    // Create the iframe-based call UI inside the wrapper div
    const callFrame = DailyIframe.createFrame(wrapperRef.current, {
      showLeaveButton: true,
      iframeStyle: { width: "100%", height: "100%", border: "none" },
    });
    // Guard so destroy() is not called twice (once on leave, again on unmount)
    const destroyOnce = () => {
      if (!callFrame.isDestroyed()) callFrame.destroy();
    };
    callFrame
      .on("joined-meeting", (event) => console.log("Joined!", event))
      .on("left-meeting", destroyOnce)
      .on("error", (e) => console.error("Call error:", e));
    callFrame.join({ url: roomUrl });
    return destroyOnce;
  }, [roomUrl]);
  return <div ref={wrapperRef} style={{ width: "800px", height: "600px" }} />;
}
Creating Rooms via API
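import os
import time
import requests
# Assumes the API key is stored in an environment variable named DAILY_API_KEY
DAILY_API_KEY = os.environ["DAILY_API_KEY"]
# Create a room
response = requests.post(
    "https://api.daily.co/v1/rooms",
    headers={
        "Authorization": f"Bearer {DAILY_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "name": "patient-consult-12345",
        "properties": {
            "exp": int(time.time()) + 3600,  # Expires in 1 hour
            "enable_recording": "cloud",
            "max_participants": 3,
            "enable_screenshare": True,
        },
    },
)
response.raise_for_status()  # Surface HTTP errors before reading the body
room = response.json()
print(f"Room URL: {room['url']}")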
Meeting Tokens for Authentication
# Create a meeting token for secure room access
# (assumes requests, time, and DAILY_API_KEY are already imported/defined)
token_response = requests.post(
    "https://api.daily.co/v1/meeting-tokens",
    headers={"Authorization": f"Bearer {DAILY_API_KEY}"},
    json={
        "properties": {
            "room_name": "patient-consult-12345",
            "user_name": "Dr. Smith",
            "is_owner": True,  # Host privileges
            "exp": int(time.time()) + 3600,  # Token expires in 1 hour
        }
    },
)
token_response.raise_for_status()  # Surface HTTP errors before reading the body
token = token_response.json()["token"]
# Join URL with token: https://your-domain.daily.co/room-name?t={token}
React SDK with Hooks
import { useEffect } from "react";
import { DailyProvider, DailyVideo, useDaily, useLocalSessionId, useParticipantIds } from "@daily-co/daily-react";
function VideoTile({ sessionId }) {
  return <DailyVideo sessionId={sessionId} type="video" mirror={false} />;
}
function Call() {
  const daily = useDaily();
  const localSessionId = useLocalSessionId();
  const participantIds = useParticipantIds({ filter: "remote" });
  // DailyProvider creates the call object; joining the room is a separate step
  useEffect(() => {
    daily?.join().catch((e) => console.error("Join failed:", e));
  }, [daily]);
  return (
    <div>
      {localSessionId && <VideoTile sessionId={localSessionId} />}
      {participantIds.map((id) => <VideoTile key={id} sessionId={id} />)}
    </div>
  );
}
function App({ roomUrl }) {
  return (
    <DailyProvider url={roomUrl}>
      <Call />
    </DailyProvider>
  );
}
When to Choose Daily
General-purpose video applications, telehealth platforms requiring HIPAA compliance, applications needing full UI control with React hooks, or teams migrating from Twilio Video (Daily is the most common migration target).
Agora
Best for: Asia-Pacific applications, ultra-low latency, interactive live streaming, large-scale events
Agora operates one of the largest real-time communication networks in the world — optimized specifically for low-latency media delivery with particular strength in Asia-Pacific, where Agora has the densest infrastructure. For applications serving users in China, Southeast Asia, or India, Agora's regional network advantage is significant.
Pricing
| Feature | Free | Paid (North America) |
|---|---|---|
| Minutes | 10K/month | — |
| Video | Free (within 10K) | $0.009/minute |
| Voice | Free (within 10K) | $0.004/minute |
| Interactive live | Custom | Custom |
Regional pricing varies — Agora charges differently by region. Asia-Pacific rates may differ from North America. For applications primarily serving Asian markets, get a custom quote.
SDK Integration
import AgoraRTC from "agora-rtc-sdk-ng";
const client = AgoraRTC.createClient({ mode: "rtc", codec: "vp8" });
let localAudioTrack;
let localVideoTrack;
async function startCall(appId, channel, token, uid) {
  await client.join(appId, channel, token, uid);
  // Create local audio and video tracks
  [localAudioTrack, localVideoTrack] = await AgoraRTC.createMicrophoneAndCameraTracks();
  const localContainer = document.getElementById("local-video");
  localVideoTrack.play(localContainer);
  await client.publish([localAudioTrack, localVideoTrack]);
}
// Handle remote users publishing audio or video
client.on("user-published", async (user, mediaType) => {
  await client.subscribe(user, mediaType);
  if (mediaType === "video") {
    // Assumes the page already contains an element with id `remote-<uid>`
    const remoteContainer = document.getElementById(`remote-${user.uid}`);
    user.videoTrack.play(remoteContainer);
  }
  if (mediaType === "audio") {
    // Remote audio stays silent until its track is played
    user.audioTrack.play();
  }
});
// Leave call: release the camera and microphone, then leave the channel
async function leaveCall() {
  localAudioTrack?.close();
  localVideoTrack?.close();
  await client.leave();
}
Interactive Live Streaming
Agora's strength beyond video calls is interactive live streaming — supporting thousands of audience members with specific hosts:
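// Host setup for live streaming
// (top-level await: run inside an ES module or an async function)
const hostClient = AgoraRTC.createClient({ mode: "live", codec: "h264" });
await hostClient.setClientRole("host");
await hostClient.join(appId, channel, token, uid);
// Audience member: same channel, audience role, receives streams without publishing
const audienceClient = AgoraRTC.createClient({ mode: "live", codec: "h264" });
await audienceClient.setClientRole("audience");
// audienceToken is a placeholder for the viewer's own token; a null uid lets Agora assign one
await audienceClient.join(appId, channel, audienceToken, null);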
When to Choose Agora
Applications with significant Asia-Pacific user base, interactive live streaming at scale (thousands of viewers), gaming and social apps needing ultra-low latency (sub-400ms), or applications in China where Agora has CDN infrastructure competitors lack.
Whereby
Best for: Telehealth, simple embedding, no-code deployment, branded video rooms
Whereby Embedded is the easiest video API to integrate: a single <iframe> pointing to a Whereby room URL gives you a fully functional, branded video call. No media server management, no WebRTC configuration. Whereby handles everything on its own infrastructure; you embed the result.
Pricing
| Plan | Cost | Minutes | Max Participants |
|---|---|---|---|
| Free | $0 | 2,000/month | 4 |
| Starter | $6.99/month | 2,000/month | 4 |
| Business | $59.99/month | Custom | 50 |
| Enterprise | Custom | Custom | Custom |
After included minutes: $0.004/participant-minute.
Simplest Possible Integration
<!-- The simplest video embed imaginable -->
<iframe
  src="https://whereby.com/my-room?embed&floatSelf"
  allow="camera; microphone; fullscreen; speaker; display-capture"
  style="height: 100vh; width: 100%;"
></iframe>
Whereby React Component
import { useRef, useEffect } from "react";
// Registers the <whereby-embed> custom element
import "@whereby.com/browser-sdk/embed";
function VideoRoom({ roomUrl }) {
  const wherebyRef = useRef(null);
  useEffect(() => {
    if (!wherebyRef.current) return;
    const elm = wherebyRef.current;
    // Attribute names mirror the embed URL parameters (displayName, floatSelf)
    elm.setAttribute("room", roomUrl);
    elm.setAttribute("displayName", "Patient");
    elm.setAttribute("floatSelf", "on");
  }, [roomUrl]);
  return (
    <whereby-embed
      ref={wherebyRef}
      style={{ width: "100%", height: "600px" }}
    />
  );
}
Create Rooms via API
import os
import requests
# Assumes the API key is stored in an environment variable named WHEREBY_API_KEY
WHEREBY_API_KEY = os.environ["WHEREBY_API_KEY"]
response = requests.post(
    "https://api.whereby.dev/v1/meetings",
    headers={
        "Authorization": f"Bearer {WHEREBY_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "endDate": "2026-03-08T23:59:59Z",
        "fields": ["hostRoomUrl"],
        "roomNamePrefix": "dr-smith-clinic",
        "roomMode": "normal",  # or "group"
    },
)
response.raise_for_status()  # Surface HTTP errors before reading the body
meeting = response.json()
print(f"Room URL: {meeting['roomUrl']}")
print(f"Host URL: {meeting['hostRoomUrl']}")  # Host gets special controls
HIPAA Compliance
Whereby is HIPAA-eligible on Business and Enterprise plans, making it a natural fit for telehealth applications that need video without managing WebRTC infrastructure.
When to Choose Whereby
Telehealth platforms, educational applications, teams that want video without WebRTC complexity, non-technical teams deploying video into HTML pages, or any use case where <iframe> embedding is sufficient.
Platform Comparison
| Feature | Daily | Agora | Whereby |
|---|---|---|---|
| Free minutes | 10K/month | 10K/month | 2K/month |
| Video pricing | $0.004/min | $0.009/min (N. America) | $0.004/min |
| HIPAA compliant | Yes | With BAA | Yes (Business+) |
| Embedding simplicity | Medium (SDK) | Medium (SDK) | Very easy (iframe) |
| UI control | Full | Full | Limited |
| Recording | Yes | Yes | Yes (cloud recording not on standard plans) |
| Interactive streaming | Limited | Yes (core feature) | No |
| Asia-Pacific optimization | Good | Best | Good |
| Screen sharing | Yes | Yes | Yes |
| Twilio Video migration | Best documented | Good | Limited |
WebRTC Architecture and What Video APIs Abstract Away
Building video calling on raw WebRTC requires solving a set of infrastructure problems that are genuinely difficult: STUN/TURN server operation, signaling server implementation, peer connection negotiation, codec selection and transcoding, bandwidth adaptation under network constraints, and SFU (Selective Forwarding Unit) design for multi-party calls.
STUN (Session Traversal Utilities for NAT) servers help peers discover their public IP addresses for direct peer-to-peer connection. TURN (Traversal Using Relays around NAT) servers relay media traffic when direct peer-to-peer is impossible — behind symmetric NATs, corporate firewalls, or mobile networks. Approximately 15-20% of connections require TURN relay; without TURN servers, those calls fail silently. Running your own TURN servers means managing global infrastructure in every region where users exist, configuring TLS, and paying bandwidth costs for relayed traffic.
An SFU (Selective Forwarding Unit) is the server architecture that makes group calls with more than 2 participants practical. Without an SFU, each participant sends a separate copy of its video to every other participant (mesh topology), so a call with n participants carries n(n−1) streams and total bandwidth scales as O(n²). An SFU receives one stream from each participant and selectively forwards the appropriate streams to the others, so each participant uploads only once and the traffic arriving at the server scales as O(n). For calls with 4 or more participants, an SFU is effectively required, because mesh upload and encoding load grow with every added participant. Video calling APIs provide SFU infrastructure globally; building your own means managing SFU capacity planning, geographic distribution, and failover.
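To make the scaling difference concrete, here is a small Python sketch that counts media streams in each topology. It is a simplified model that assumes one video stream per participant and ignores simulcast layers and audio; the function names are illustrative.

```python
def mesh_streams(n: int) -> dict:
    """Full mesh: every participant sends a separate copy to each of the other n - 1."""
    return {
        "uploads_per_client": n - 1,
        "downloads_per_client": n - 1,
        "total_streams": n * (n - 1),
    }


def sfu_streams(n: int) -> dict:
    """SFU: every participant sends one copy; the server forwards it to the others."""
    return {
        "uploads_per_client": 1,
        "downloads_per_client": n - 1,
        "server_inbound_streams": n,
    }


for n in (2, 4, 8):
    print(n, mesh_streams(n), sfu_streams(n))
```

The client-side saving is on upload: in a mesh an 8-person call requires each client to encode and send 7 copies, while with an SFU it sends one. Each client still receives up to n − 1 streams, which is where the SFU's selective forwarding (dropping or downscaling streams that are not visible) matters.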
Daily, Agora, and Whereby abstract all of this away. Their SDKs handle STUN/TURN negotiation, signaling, codec selection (VP8/VP9/H.264/AV1), bandwidth adaptation, and SFU routing transparently. The per-minute pricing includes the infrastructure cost — TURN bandwidth, SFU compute, media relay — that would otherwise be separate line items in your cloud bill.
The remaining application logic you still write: room management (creating/destroying rooms, access tokens), UI components (participant video tiles, mute controls, device selection), and integration with your authentication system. Daily's Prebuilt UI and Whereby's iframe embed handle even the UI layer if you want it — you trade customization control for integration speed.
Recording, Transcription, and Post-Processing
Call recording adds significant value for customer support, healthcare consultations, education, and compliance — and adds complexity to the implementation.
Cloud recording (server-side) and local recording (client-side) are architecturally different. Cloud recording runs on the API provider's servers: they receive all media streams, composite them into a single recording, and deliver a video file. This works regardless of whether a participant loses connection mid-call. Local recording uses the browser's MediaRecorder API to record from the participant's perspective. It is simpler to implement, but it loses segments when connections drop and requires a file upload after the call.
Daily's cloud recording composites participant video into a single MP4 at $0.003/recorded-minute (in addition to participant-minute costs). Agora's cloud recording costs $0.018/minute (audio and video) and includes layout templates (side-by-side, spotlight, grid). Whereby doesn't offer cloud recording on standard plans.
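As a rough illustration of how recording stacks on top of participation, the sketch below prices one recorded call at the Daily rates quoted in this article. It assumes the composite recording is billed once per minute of the call rather than once per participant, and it ignores storage and free-tier minutes; check the provider's current pricing before relying on these numbers.

```python
PARTICIPANT_RATE = 0.004  # USD per participant-minute (rate quoted in this article)
RECORDING_RATE = 0.003    # USD per recorded minute (rate quoted in this article)


def recorded_call_cost(call_minutes: int, participants: int) -> float:
    """Participation cost plus one composite recording for the length of the call."""
    participation = call_minutes * participants * PARTICIPANT_RATE
    recording = call_minutes * RECORDING_RATE  # assumption: billed once, not per participant
    return round(participation + recording, 2)


# A 60-minute, two-person consultation with cloud recording enabled
print(recorded_call_cost(60, 2))
```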
Transcription requires a separate step after recording. The generated MP4 must be passed to a speech-to-text API (AssemblyAI, Deepgram, Whisper via OpenAI or local deployment) to produce a transcript. AssemblyAI processes video files directly; Deepgram accepts streaming audio for real-time transcription during the call via a WebSocket connection. Real-time transcription (transcribing during the call rather than after) requires additional WebRTC integration to route audio to the transcription service in parallel with the call.
Meeting summaries and action items (extracting decisions and tasks from a transcript) are a common downstream use case for AI summarization. The typical pipeline: recording ends → cloud recording webhook fires → MP4 uploaded to S3 → transcription job submitted → transcript stored → LLM summarization triggered → summary delivered to participants. This pipeline takes 2-5 minutes after call end for a 1-hour meeting, which is typically acceptable. Real-time extraction (live AI note-taking during the call) requires streaming transcription and is supported by Deepgram's real-time API with sub-second latency.
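The post-call pipeline described above can be sketched as a plain function that chains the steps. Everything here is hypothetical scaffolding: `run_post_call_pipeline`, the injected `transcribe`, `summarize`, and `deliver` callables, and the example S3 path stand in for whatever webhook handler, speech-to-text client, and LLM call an application actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    recording_url: str
    transcript: str
    summary: str
    steps: List[str] = field(default_factory=list)


def run_post_call_pipeline(
    recording_url: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    deliver: Callable[[str], None],
) -> PipelineResult:
    """Run the steps in order once the recording-ready webhook has fired."""
    steps = ["recording_ready"]
    transcript = transcribe(recording_url)  # e.g. submit the MP4 to a speech-to-text API
    steps.append("transcribed")
    summary = summarize(transcript)         # e.g. an LLM summarization call
    steps.append("summarized")
    deliver(summary)                        # e.g. email or post to participants
    steps.append("delivered")
    return PipelineResult(recording_url, transcript, summary, steps)


# Stand-in implementations so the sketch runs without any external service
sent: List[str] = []
result = run_post_call_pipeline(
    "s3://example-bucket/call-123.mp4",
    transcribe=lambda url: f"transcript of {url}",
    summarize=lambda text: text.upper(),
    deliver=sent.append,
)
print(result.steps)
```

Passing the steps in as callables keeps the orchestration testable with fakes; in production each step would more likely be a queued job so that a slow transcription does not block the webhook response.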
HIPAA compliance for telehealth recordings requires Business Associate Agreements (BAA) with both the video API provider and any transcription or storage services that handle PHI. Daily offers a HIPAA-compatible plan with a BAA. Most general-purpose cloud storage (AWS S3, GCS) also has HIPAA compliance available under BAA — the configuration of encryption, access controls, and audit logging is your responsibility.
Decision Framework
| Scenario | Recommended |
|---|---|
| Telehealth (HIPAA) | Daily or Whereby |
| Asia-Pacific primary market | Agora |
| Simplest possible embed | Whereby |
| Full UI control via SDK | Daily or Agora |
| Interactive live streaming | Agora |
| Migrating from Twilio Video | Daily |
| < 2K minutes/month | Whereby (free) |
| Social/gaming low-latency | Agora |
Verdict
Daily is the right default for most video calling applications in 2026. The transparent $0.004/participant-minute pricing, 10K free minutes, comprehensive React SDK, and HIPAA compliance make it the best starting point for both telehealth and general video applications. The migration documentation from Twilio Video is also the most complete in the market.
Agora wins when Asia-Pacific latency is the primary constraint or when you need interactive live streaming at scale. The regional network advantage in Asia is measurable and significant for user experience.
Whereby is right when simplicity trumps flexibility. An <iframe> integration that works in 15 minutes with no media server knowledge required is genuinely valuable for teams that want video calling without becoming WebRTC experts.
One migration note worth emphasizing: Twilio Video's 2024 deprecation left a meaningful number of applications needing to migrate. Daily has the most complete Twilio Video migration documentation and the closest API design to what Twilio Video offered. Teams migrating from Twilio should treat the migration as an opportunity to evaluate all three options rather than defaulting to the nearest-seeming replacement: Agora's pricing and latency profile are significantly different from Twilio's, and Whereby's simplicity may cover use cases where Twilio's flexibility was never needed.