One substrate · every modality

Transform your transport.

The same bare-metal QUIC substrate that runs realtime voice, your robots, your browser, your game, your phone call.

one box · 8 cores
voice · inference · webrtc · sip · robotics · games
01 · The substrate

It starts with one packet.

A single unit of your data, on the wire.

One live connection.

That packet is one of thousands on a single connection.

One core.

Every core runs its own connections, full tilt.

0

messages/second on one 8-core box — and improving. 36,732 concurrent connections at 0.4% CPU.

One substrate.

Every box, every modality, on the same transport.

02 · Voice AI

It starts with one sample.

A scrap of sound, captured the instant it's spoken.

Which becomes speech.

Words on the wire, in real time.

A live call.

Caller, model, agent — one pipe, both directions.

It hears the whole room.

The line tells the caller from the TV, a coworker, or background noise — and listens to the right one.

Thousands of calls at once.

On every provider, with first-class barge-in.

OpenAI Realtime · Gemini Live · ElevenLabs · Anthropic · Deepgram
Explore in detail →
03 · Inference

It starts with one token.

A single piece of the answer, fresh off the model.

A stream off the GPU.

Tokens leave the GPU and start arriving immediately.

Straight to your client.

Point your OpenAI-compatible client at the line — nothing in your app changes.

First token, fast.

Same SSE wire your client already speaks — the moment a token exists, your app has it.
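
As a rough sketch of what reading that SSE wire looks like on the client side: this assumes the endpoint emits OpenAI-style chat-completion chunks (`data: {json}` lines ending with `data: [DONE]`). The function name `iter_tokens` and the sample payload are illustrative only and are not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from an OpenAI-style SSE stream, one per chunk."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text  # hand each token to the app as soon as it is parsed


# Hypothetical sample of what arrives on the wire.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_tokens(sample)))
```

Because the parser is a generator, each token reaches the caller as soon as its line is read, rather than after the whole response has been buffered.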

Explore in detail →
04 · Browser · WebRTC

Audio, video, text and control.

Four modalities, in one place.

Each on its own lane.

A stall in one never blocks another — no head-of-line blocking.

All on one connection.

One line, nothing to glue together.

On your transport.

Your path end to end — no plugins, no native app.

Explore in detail →
05 · SIP · Telephony

It starts with one call.

A carrier hands you an inbound call.

Then straight to your agent.

Media rides QUIC direct to the agent — no extra hops in the path.

Pull a human onto the line.

Listen, whisper, or take over — over the same connection, without rerouting the call.

One call, the whole wallboard.

Fan it out to every supervisor and dashboard, live.

Twilio · Vapi · LiveKit · Daily · Chime · Tata IMS · Audiocodes
Explore in detail →
06 · Robotics

One robot, in control.

Onboard autonomy stays on its own network.

Its data, in realtime.

A live mirror of everything it senses.

Up to the cloud.

Reaches the cloud over QUIC — observe from anywhere.

500

live tracks per shard at 50 fps, zero loss.

ROS 2 · Zenoh · MQTT · DDS
Explore in detail →
07 · Games

A player speaks.

Input on its own fast lane.

Two lanes, no waiting.

Fast input and rich voice, each tuned to what it needs.

The NPC reacts.

Voice and lip-sync arrive together — and it can be interrupted.

Every character, live.

Per-track priority across the whole scene.

Explore in detail →
Six modalities converge on one substrate: voice · inference · webrtc · sip · robotics · games
08 · One substrate

Six modalities. One substrate.

Every branch flows back into the same transport line.

0

payload types, one transport. Add a modality, not a stack.

Drop-in SDKs

Ship on the substrate in an afternoon.

Idiomatic clients for the languages your stack already speaks.

py

Python

pip install telequick — async client, drop-in for the realtime API.

ts

TypeScript

Browser + Node, with a helper for browser media over your transport.

++

C++ / Rust

Bare-metal bindings to the core. Zero-copy frame handoff.

# realtime voice in a few lines
import asyncio
from telequick import Session

async def main():
    s = Session(provider="openai-realtime")
    async for ev in s.connect("+1..."):
        if ev.bargein:
            s.cancel()   # instant
        play(ev.audio)   # play() is your own audio sink

asyncio.run(main())

Coming from another transport? The migration guide maps your existing calls onto the substrate one site at a time.

Read the migration guide
Pricing

Custom-quoted to your workload.

Usage-based on egress — per GB / month — with volume tiers and committed-use discounts. We quote against the actual shape of your traffic, not a list-price grid.

New engagements start with a free production trial: ship real traffic onto the substrate, see your own histograms, then we size a contract to it. No procurement gate to start measuring.

FAQ

Questions builders ask.

No. The substrate exposes the realtime + telephony interfaces your code already speaks. You point your existing client at our endpoint and keep the rest. There's a drop-in shim for OpenAI Realtime, a SIP listener for carrier traffic, and idiomatic SDKs in 9 languages when you want native types.

Interruption detection lives in the transport, not in each provider's SDK. The moment caller audio crosses the speech-gate threshold we hold the agent's output stream and emit a cancel — same code path whether the model behind it is OpenAI Realtime, Gemini Live, or a self-hosted llama.cpp. The tail you hear after you start talking is one packet, not a sentence.
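
A minimal sketch of a speech gate of this general kind, assuming 16-bit little-endian PCM frames and a plain RMS energy threshold. The names `rms` and `barge_in_events`, the threshold value, and the `min_frames` debounce knob are all hypothetical; the actual gate in the transport may work quite differently.

```python
import math
import struct
from typing import Iterable, Iterator, Tuple


def rms(frame: bytes) -> float:
    """Root-mean-square level of a little-endian 16-bit PCM frame."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def barge_in_events(
    frames: Iterable[bytes], threshold: float = 1000.0, min_frames: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (frame_index, "cancel") once per burst of caller speech."""
    loud = 0          # consecutive frames at or above the threshold
    speaking = False  # True after a cancel was emitted for the current burst
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            loud += 1
            if loud >= min_frames and not speaking:
                speaking = True
                yield (i, "cancel")
        else:
            loud = 0
            speaking = False


# Synthetic frames for illustration: four samples each.
quiet = struct.pack("<4h", 10, -10, 10, -10)
loud_frame = struct.pack("<4h", 5000, -5000, 5000, -5000)
print(list(barge_in_events([quiet, loud_frame, loud_frame, quiet, loud_frame])))
```

The gate emits one cancel per burst rather than one per loud frame, so the consumer only has to stop the agent's output stream once each time the caller starts talking.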

WebSocket sits on TCP, so a single dropped packet head-of-line-blocks every subsequent frame on the same socket — your agent's audio stalls behind your text. QUIC streams are independent: a lost packet only delays its own stream. On a 3% loss link in our DevTools benchmarks, the inter-arrival p95 widens by ~6× on WS and ~1.4× on QUIC. The histograms are on the demo page if you want to look. (For a deeper dive on QUIC's loss-recovery story, see the Google QUIC team's writeups.)
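
To make the head-of-line effect concrete, here is a toy model (not the benchmark itself). Each packet has a stream name and an arrival time, and one text packet arrives late because it was lost and retransmitted. All timings and the function names are invented for illustration.

```python
from typing import Dict, List, Tuple

Packet = Tuple[str, float]  # (stream name, arrival time in ms), listed in send order


def deliver_ordered(packets: List[Packet]) -> List[float]:
    """One ordered byte stream (TCP-like): nothing is delivered past a gap."""
    out: List[float] = []
    latest = 0.0
    for _, arrival in packets:
        latest = max(latest, arrival)  # must wait for every earlier packet
        out.append(latest)
    return out


def deliver_per_stream(packets: List[Packet]) -> List[float]:
    """Independent streams (QUIC-like): a gap holds back only its own stream."""
    latest: Dict[str, float] = {}
    out: List[float] = []
    for stream, arrival in packets:
        latest[stream] = max(latest.get(stream, 0.0), arrival)
        out.append(latest[stream])
    return out


packets: List[Packet] = [
    ("text", 0.0),
    ("audio", 20.0),
    ("text", 140.0),   # lost on first send; the retransmit arrives at 140 ms
    ("audio", 60.0),
    ("audio", 80.0),
    ("text", 100.0),
]
print("ordered   :", deliver_ordered(packets))
print("per-stream:", deliver_per_stream(packets))
```

In the ordered model every packet after the loss waits until 140 ms. In the per-stream model the audio packets are delivered at 60 and 80 ms as they arrive, and only the later text packet waits behind the retransmit.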

WebTransport is shipped in Chrome and Edge today. Safari and Firefox still trail, and some corporate proxies block UDP. The SDK transparently falls back to WebSocket-over-HTTPS using the same agent and the same fast-cancel barge-in path — you just lose the QUIC head-of-line-blocking benefit on that one client. No code change.
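
The fallback idea can be sketched as an ordered list of dialers tried in turn. This is an illustration only: `connect_with_fallback`, `TransportUnavailable`, and the two stub dialers are hypothetical names, and the real SDK handles this internally.

```python
from typing import Callable, List, Sequence, Tuple


class TransportUnavailable(Exception):
    """Raised by a dialer when its transport cannot be established."""


def connect_with_fallback(
    dialers: Sequence[Tuple[str, Callable[[], object]]]
) -> Tuple[str, object]:
    """Try each dialer in order; return (name, connection) for the first success."""
    errors: List[str] = []
    for name, dial in dialers:
        try:
            return name, dial()
        except TransportUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why, then try the next one
    raise TransportUnavailable("; ".join(errors))


# Stub dialers standing in for real network code.
def dial_webtransport() -> object:
    raise TransportUnavailable("UDP blocked by proxy")


def dial_websocket() -> object:
    return "ws-connection"


name, conn = connect_with_fallback(
    [("webtransport", dial_webtransport), ("websocket", dial_websocket)]
)
print(name)
```

The caller receives a connection either way, which is why application code does not need to change when a client ends up on the fallback path.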

Usage-based on egress (per-GB / month), quoted against your actual workload — not a public list-price grid. Volume tiers and committed-use discounts apply at scale. The infra floor sets the math: one 8-core box carries ~5,000 concurrent sessions at ~$200/month bare-metal, and pricing scales above that with the value added per minute. New engagements start with a free production trial so you size the contract to real numbers, not a forecast.
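
For a sense of scale, the infra floor implied by those two approximate figures works out as follows. This is back-of-envelope arithmetic on the numbers quoted above, not a price.

```python
box_cost_per_month = 200.0  # USD, approximate bare-metal cost quoted above
sessions_per_box = 5_000    # approximate concurrent sessions per 8-core box

# Hardware cost attributable to one concurrent session slot for a month.
floor_per_session = box_cost_per_month / sessions_per_box
print(f"${floor_per_session:.2f} per concurrent session per month")
```

That comes to roughly four cents of hardware per concurrent session slot per month, before egress and everything layered above the box.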

Yes. The core is a single statically linked binary plus a Redis instance. It runs in your own racks, including air-gapped deployments where the model worker is also local, and there's a hard-enforced licensing path for that. Ask us about it during your pilot; we keep our hosted tenants and self-host tenants on the exact same wire so behaviour matches.

LiveKit, Daily, and Chime are SFUs built for many-party video. Their per-participant-minute pricing assumes a meeting room. Twilio Media Streams is PCMU over WebSocket, which gives you carrier reach but not codec choice or sub-200ms barge-in. We're a single-tenant agent transport: one caller talking to one (or many) AI providers, with codec/transport/perception all owned end-to-end. Different shape, different price.

Anything that speaks a streaming-audio or streaming-text API. We ship verified adapters for OpenAI Realtime, Gemini Live, ElevenLabs Conversational AI, and a local llama.cpp worker over QUIC. For STT-only or TTS-only pipelines you can mix providers — Deepgram on input, Cartesia on output, GPT-4o in between — and the substrate handles the joins.

Python, TypeScript, Go, Rust, Java, C#, Swift, Kotlin, and C++: same API shape, idiomatic types per language. They all wrap one C++ core via FFI, so a wire-protocol fix in the core ships to every SDK with a rebuild. The Python and TypeScript clients see most of the dogfooding; the others are tested against a polyglot smoke matrix on every release.

On the public internet to OpenAI Realtime via our pool, first-token-after-barge-in lands in ~120–180 ms p50 from the moment the caller's first speech frame arrives, with the long tail dominated by the upstream model, not the transport. On localhost the transport itself adds <2 ms over raw UDP. Detailed histograms are exposed in the demo page's post-call summary.

SOC 2 Type II is in audit. HIPAA-eligible BAAs are available on enterprise contracts. Regional residency is supported via single-region deployments (us-east, eu-west, ap-south today) and on-prem covers everything else. We never train on customer audio. Recording is opt-in per tenant, encrypted at rest, with caller-side disclosure macros built into the dialplan.

Self-serve: minutes. Sign up, install the SDK, point it at a phone number we provision, run a call. White-glove for production traffic: a 30-minute call to walk through your existing stack, then a single-region pilot with your real numbers usually inside 48 hours. Migration off your current vendor (Twilio, Vapi, LiveKit) is typically one PR per call site.