Telequick — Transform Your Transport

Drop-in SDKs

Ship on the substrate in an afternoon.

Idiomatic clients for the languages your stack already speaks.

Python

pip install telequick — async client, drop-in for the realtime API.

TypeScript

Browser + Node, with a helper for browser media over your transport.

C++ / Rust

Bare-metal bindings to the core. Zero-copy frame handoff.

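# realtime voice in a few lines
import asyncio
from telequick import Session

async def main():
    s = Session(provider="openai-realtime")
    async for ev in s.connect("+1..."):
        if ev.bargein:
            s.cancel()   # instant
        play(ev.audio)   # play() is your own audio-output function

asyncio.run(main())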

Coming from another transport? The migration guide maps your existing calls onto the substrate one site at a time.

Read the migration guide →

FAQ

Questions builders ask.

No. The substrate exposes the realtime + telephony interfaces your code already speaks. You point your existing client at our endpoint and keep the rest. There's a drop-in shim for OpenAI Realtime, a SIP listener for carrier traffic, and idiomatic SDKs in 9 languages when you want native types.

Interruption detection lives in the transport, not in each provider's SDK. The moment caller audio crosses the speech-gate threshold we hold the agent's output stream and emit a cancel — same code path whether the model behind it is OpenAI Realtime, Gemini Live, or a self-hosted llama.cpp. The tail you hear after you start talking is one packet, not a sentence.
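As a rough illustration of the idea, here is a minimal sketch of a speech gate, assuming 16-bit little-endian PCM frames and a plain RMS energy threshold. The names `SpeechGate`, `threshold`, and `hold_frames` are hypothetical and not part of the Telequick SDK; the production detector may use a different signal entirely.

```python
import math
import struct


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian 16-bit PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


class SpeechGate:
    """Hypothetical gate: fires once when caller audio stays above a threshold."""

    def __init__(self, threshold: float = 500.0, hold_frames: int = 2):
        self.threshold = threshold      # assumed RMS level that counts as speech
        self.hold_frames = hold_frames  # consecutive loud frames required
        self._loud = 0

    def feed(self, frame: bytes) -> bool:
        """Return True on the frame where a cancel should be emitted."""
        if rms(frame) >= self.threshold:
            self._loud += 1
        else:
            self._loud = 0
        return self._loud == self.hold_frames


def pcm(level: int, n: int = 160) -> bytes:
    """Build a constant-amplitude test frame of n samples."""
    return struct.pack(f"<{n}h", *([level] * n))


gate = SpeechGate()
frames = [pcm(10), pcm(2000), pcm(2000), pcm(2000)]
fired = [gate.feed(f) for f in frames]  # cancel fires on the second loud frame
```

Because the gate only looks at caller audio, the same check applies regardless of which model produces the agent's output.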

WebSocket sits on TCP, so a single dropped packet head-of-line-blocks every subsequent frame on the same socket — your agent's audio stalls behind your text. QUIC streams are independent: a lost packet only delays its own stream. On a 3% loss link in our DevTools benchmarks, the inter-arrival p95 widens by ~6× on WS and ~1.4× on QUIC. The histograms are on the demo page if you want to look. (For a deeper dive on QUIC's loss-recovery story, see the Google QUIC team's writeups.)
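The difference can be sketched with a toy delivery model. This is an illustration under simplifying assumptions (zero network latency, one retransmission per lost packet, a fixed retransmission delay), not a reproduction of the benchmark; `delivery_times` and `rto_ms` are hypothetical names.

```python
def delivery_times(packets, rto_ms, shared_order):
    """packets: list of (send_ms, stream, lost). Returns delivery time per packet.

    A lost packet arrives rto_ms late (one retransmission). With
    shared_order=True every packet waits for all earlier ones (TCP-like);
    otherwise it waits only for earlier packets on its own stream (QUIC-like).
    """
    last = {}  # ordering key -> latest delivery time so far
    out = []
    for send_ms, stream, lost in packets:
        arrival = send_ms + (rto_ms if lost else 0)
        key = "all" if shared_order else stream
        delivered = max(arrival, last.get(key, 0))
        last[key] = delivered
        out.append(delivered)
    return out


# One lost text packet at t=0, followed by audio frames every 20 ms.
packets = [(0, "text", True), (20, "audio", False), (40, "audio", False)]
tcp_like = delivery_times(packets, rto_ms=200, shared_order=True)
quic_like = delivery_times(packets, rto_ms=200, shared_order=False)
```

In the shared-order case both audio frames wait for the retransmitted text packet; with per-stream ordering they are delivered at their own send times.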

WebTransport is shipped in Chrome and Edge today. Safari and Firefox still trail, and some corporate proxies block UDP. The SDK transparently falls back to WebSocket-over-HTTPS, using the same agent and the same fast-cancel barge-in path. On that one client you lose QUIC's independent streams, so head-of-line blocking can return; everything else behaves the same. No code change.
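The fallback pattern itself is simple. A minimal sketch, assuming each transport is exposed as an async connect function; `connect_with_fallback` and the two stand-in connectors are hypothetical and are not SDK calls.

```python
import asyncio


async def connect_with_fallback(connectors, timeout_s=1.0):
    """Try each (name, connect) pair in order; return the first that connects."""
    errors = []
    for name, connect in connectors:
        try:
            conn = await asyncio.wait_for(connect(), timeout_s)
            return name, conn
        except (OSError, asyncio.TimeoutError) as exc:
            errors.append((name, exc))
    raise ConnectionError(f"all transports failed: {errors}")


# Stand-ins for real transports (hypothetical).
async def webtransport():
    raise OSError("UDP blocked by proxy")


async def websocket():
    return "ws-connection"


chosen = asyncio.run(connect_with_fallback([
    ("webtransport", webtransport),
    ("websocket", websocket),
]))
```

The caller receives a connection either way, which is what lets the application code stay unchanged.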

Usage-based on egress (per-GB / month), quoted against your actual workload — not a public list-price grid. Volume tiers and committed-use discounts apply at scale. The infra floor sets the math: one 8-core box carries ~5,000 concurrent sessions at ~$200/month bare-metal, and pricing scales above that with the value added per minute. New engagements start with a free production trial so you size the contract to real numbers, not a forecast.
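The infra floor is easy to work out from the two figures above. A back-of-envelope sketch, assuming one box per 5,000 concurrent sessions and ignoring bandwidth, redundancy, and margin; the function names are illustrative only.

```python
BOX_COST_USD_PER_MONTH = 200      # from the text: one 8-core bare-metal box
SESSIONS_PER_BOX = 5_000          # from the text: approximate concurrency

# Hardware cost of one concurrent-session slot per month.
cost_per_session_slot = BOX_COST_USD_PER_MONTH / SESSIONS_PER_BOX


def boxes_needed(peak_sessions: int) -> int:
    """Whole boxes required to carry a given peak concurrency."""
    return -(-peak_sessions // SESSIONS_PER_BOX)  # ceiling division


def infra_floor_usd(peak_sessions: int) -> int:
    """Monthly bare-metal floor for a given peak concurrency."""
    return boxes_needed(peak_sessions) * BOX_COST_USD_PER_MONTH
```

That works out to roughly $0.04 per concurrent-session slot per month, so a peak of 12,000 concurrent sessions would need three boxes, or about $600/month of raw hardware before anything else is counted.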

Yes. The core is a single statically-linked binary plus a Redis instance. It runs in your own racks, including air-gapped deployments where the model worker is also local, and there's a hard-enforced licensing path for that. Ask us about it during your pilot; we keep our hosted tenants and self-host tenants on the exact same wire protocol so behaviour matches.

LiveKit, Daily, and Chime are SFUs built for many-party video. Their per-participant-minute pricing assumes a meeting room. Twilio Media Streams is PCMU over WebSocket, which gives you carrier reach but not codec choice or sub-200ms barge-in. We're a single-tenant agent transport: one caller talking to one (or many) AI providers, with codec/transport/perception all owned end-to-end. Different shape, different price.

Anything that speaks a streaming-audio or streaming-text API. We ship verified adapters for OpenAI Realtime, Gemini Live, ElevenLabs Conversational AI, and a local llama.cpp worker over QUIC. For STT-only or TTS-only pipelines you can mix providers — Deepgram on input, Cartesia on output, GPT-4o in between — and the substrate handles the joins.
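A mixed-provider pipeline is, in shape, a chain of streaming stages. The sketch below uses stub stages in place of real providers to show that shape; `stt`, `llm`, `tts`, and `mic` are placeholders, and none of them call a real provider or the Telequick SDK.

```python
import asyncio


async def stt(audio_frames):
    """Stand-in speech-to-text stage: one transcript per audio frame."""
    async for frame in audio_frames:
        yield frame.decode()


async def llm(transcripts):
    """Stand-in language-model stage: one reply per transcript."""
    async for text in transcripts:
        yield f"reply to {text}"


async def tts(replies):
    """Stand-in text-to-speech stage: one audio chunk per reply."""
    async for reply in replies:
        yield reply.encode()


async def mic():
    """Fake caller audio source."""
    for frame in (b"hello", b"bye"):
        yield frame


async def run():
    # Each stage consumes the previous stage's stream as it is produced.
    return [chunk async for chunk in tts(llm(stt(mic())))]


out = asyncio.run(run())
```

Swapping a provider means replacing one stage; the stages on either side only see a stream.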

Python, TypeScript, Go, Rust, Java, C#, Swift, Kotlin, and C++: same API shape, idiomatic types per language. They all wrap one C++ core via FFI, so a wire-protocol fix in the core ships to every SDK with a rebuild. The Python and TypeScript clients see most of the dogfooding; the others are tested against a polyglot smoke matrix on every release.

On the public internet to OpenAI Realtime via our pool, first-token-after-barge-in lands at a p50 of roughly 120 to 180 ms, measured from the moment the caller's first speech frame arrives. The long tail is dominated by the upstream model, not the transport. On localhost the transport itself adds <2 ms over raw UDP. Detailed histograms are exposed in the demo page's post-call summary.

SOC 2 Type II is in audit. HIPAA-eligible BAAs are available on enterprise contracts. Regional residency is supported via single-region deployments (us-east, eu-west, ap-south today) and on-prem covers everything else. We never train on customer audio. Recording is opt-in per tenant, encrypted at rest, with caller-side disclosure macros built into the dialplan.

Self-serve: minutes. Sign up, install the SDK, point it at a phone number we provision, run a call. White-glove for production traffic: a 30-minute call to walk through your existing stack, then a single-region pilot with your real numbers usually inside 48 hours. Migration off your current vendor (Twilio, Vapi, LiveKit) is typically one PR per call site.

Transform your transport.

It starts with one packet.

One live connection.

One core.

One substrate.

It starts with one sample.

Which becomes speech.

A live call.

It hears the whole room.

Thousands of calls at once.

It starts with one token.

A stream off the GPU.

Straight to your client.

First token, fast.

Audio, video, text and control.

Each on its own lane.

All on one connection.

On your transport.

It starts with one call.

Then straight to your agent.

Pull a human onto the line.

One call, the whole wallboard.

One robot, in control.

Its data, in realtime.

Up to the cloud.

A player speaks.

Two lanes, no waiting.

The NPC reacts.

Every character, live.

Six modalities. One substrate.
