How it works

From a ringing phone to a resolved conversation

Follow a single call through the system — the real-time turn loop, the AI Brain and the workers that pick up after hang-up.

The turn loop

Six stages, mostly in parallel

A turn is everything from a caller finishing a sentence to the agent starting its reply. Here's what happens in those few hundred milliseconds.

LISTEN

Twilio Media Streams pushes 20ms μ-law frames over a WebSocket. VAD + streaming Deepgram STT produce interim transcripts as the caller speaks.
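
To make the LISTEN stage concrete, here is a minimal sketch of decoding one media message into 16-bit PCM samples before it reaches VAD or STT. It assumes the standard Twilio Media Streams message shape (a JSON object with a base64 μ-law payload at 8 kHz, so a 20ms frame is 160 bytes). The function names are illustrative, not part of the product.

```python
import base64
import json


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def decode_media_message(raw: str) -> list:
    """Return PCM samples for a 'media' message, or [] for other events."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return []
    payload = base64.b64decode(msg["media"]["payload"])
    return [ulaw_to_pcm16(b) for b in payload]


# A 20ms frame of digital silence: 160 bytes of 0xFF.
silent_frame = json.dumps({
    "event": "media",
    "media": {"payload": base64.b64encode(b"\xff" * 160).decode()},
})
samples = decode_media_message(silent_frame)
```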

ENDPOINT

When VAD detects silence, a Deepgram utterance-end event — or a one-token "are they done?" classifier — confirms the caller has finished.
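
One way to read the ENDPOINT rule is as a small decision function: silence alone is not enough, and a second signal has to agree. The sketch below is an assumption about how those signals might be combined. The 300ms threshold and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointSignals:
    silence_ms: int                    # how long VAD has reported silence
    utterance_end: bool                # STT utterance-end event received
    classifier_done: Optional[bool]    # None if the classifier was not asked


def caller_finished(s: EndpointSignals, min_silence_ms: int = 300) -> bool:
    """True when silence is confirmed by either the STT event or the classifier."""
    if s.silence_ms < min_silence_ms:
        return False                   # VAD has not seen enough silence yet
    return s.utterance_end or bool(s.classifier_done)
```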

THINK

RAG retrieval, caller-profile lookup and intent classification run in parallel, then the smart-routed LLM streams its reply token by token.
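
The parallel fan-out in THINK can be sketched with asyncio.gather. The three coroutines below are stand-ins with made-up delays and return values. The point is only that the turn waits for the slowest lookup rather than the sum of all three.

```python
import asyncio


async def retrieve_context(query: str) -> list:
    await asyncio.sleep(0.03)          # stands in for a vector search
    return [f"doc about {query}"]


async def load_profile(caller_id: str) -> dict:
    await asyncio.sleep(0.02)          # stands in for a cache or DB read
    return {"caller_id": caller_id}


async def classify_intent(query: str) -> str:
    await asyncio.sleep(0.01)          # stands in for a small classifier
    return "booking" if "book" in query.lower() else "other"


async def gather_turn_context(transcript: str, caller_id: str) -> dict:
    """Run the three lookups concurrently and bundle the results for the LLM."""
    docs, profile, intent = await asyncio.gather(
        retrieve_context(transcript),
        load_profile(caller_id),
        classify_intent(transcript),
    )
    return {"docs": docs, "profile": profile, "intent": intent}


context = asyncio.run(gather_turn_context("Can I book a demo?", "caller-1"))
```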

SPEAK

Complete clauses are split off and streamed straight into TTS — audio flows back to Twilio before the model finishes the sentence.
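
A rough sketch of the clause splitting in SPEAK: buffer streamed tokens and release text as soon as a clause boundary appears, so TTS can start on the first clause while the model is still generating. The boundary rule here (punctuation followed by whitespace) is an assumption chosen for brevity; a production splitter would likely be more careful.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary: clause punctuation followed by whitespace.
CLAUSE_END = re.compile(r"[.!?;:,]\s")


def clauses(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete clauses from a token stream as soon as they close."""
    buf = ""
    for tok in tokens:
        buf += tok
        while True:
            m = CLAUSE_END.search(buf)
            if m is None:
                break
            yield buf[:m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():
        yield buf.strip()              # flush whatever is left at end of stream


streamed = ["Sure", ",", " I can", " book", " that", ".", " What", " time", "?"]
spoken = list(clauses(streamed))
```

Each yielded clause would be handed to TTS immediately rather than waiting for the full reply.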

ACT

If the model emits a tool call, the dispatcher invokes your API; sensitive actions pause for one-click human approval.
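
The ACT stage can be pictured as a dispatcher with an approval gate. This is a sketch under stated assumptions: the tool names, the shape of the tool call and the "pending_approval" status are all hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical set of tools that need a human to approve them first.
SENSITIVE_TOOLS = {"issue_refund"}


def dispatch(tool_call: Dict[str, Any],
             handlers: Dict[str, Callable[..., Any]],
             approved: bool = False) -> Dict[str, Any]:
    """Run a tool call, pausing sensitive tools until they are approved."""
    name = tool_call["name"]
    args = tool_call.get("arguments", {})
    if name not in handlers:
        return {"status": "error", "error": f"unknown tool: {name}"}
    if name in SENSITIVE_TOOLS and not approved:
        return {"status": "pending_approval", "tool": name}
    return {"status": "ok", "result": handlers[name](**args)}


handlers = {
    "lookup_order": lambda order_id: {"order_id": order_id, "state": "shipped"},
    "issue_refund": lambda order_id: {"order_id": order_id, "refunded": True},
}
```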

REMEMBER

The turn is broadcast to live viewers and folded into rolling memory. On hang-up, the caller profile is updated for next time.

Parallel timeline
[Timeline diagram: VAD, STT, RAG, Intent, LLM and TTS lanes, running from "caller stops" to "first audio out"]

RAG and intent overlap STT; TTS starts before the LLM finishes. That overlap is the whole trick.

System architecture

One app, three API surfaces, durable workers

A FastAPI core handles real-time calls; Celery workers handle everything that can happen after hang-up.

[Architecture diagram]

  • External callers: your CRM · 3rd-party apps · Postman
  • Twilio: telephony · media streams
  • FastAPI app (Uvicorn): External · Dashboard · Twilio APIs
  • Brain: personas · RAG · actions
  • Voice pipeline (≤600ms): VAD ↔ STT ↔ LLM ↔ TTS
  • Smart router: cost · health
  • Redis cache + pub/sub
  • Postgres + pgvector
  • LLM / STT / TTS providers
  • Celery workers + beat: summaries · webhooks · ingestion · retries

External API

X-API-Key auth. Integrators place calls and read results over REST.

Dashboard API

JWT auth for your team — agents, knowledge, analytics, billing, audit.

Twilio webhooks

Signature-validated telephony events drive the live media stream.
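
For readers curious what signature validation involves, here is a standard-library sketch of Twilio's documented scheme for form-encoded webhooks: HMAC-SHA1 over the full request URL followed by each POST parameter name and value in sorted order, base64-encoded and compared with the X-Twilio-Signature header. It illustrates the idea and is not necessarily how this product implements it. In production, the RequestValidator class from Twilio's official Python helper library is the safer choice.

```python
import base64
import hashlib
import hmac
from typing import Dict


def twilio_signature(auth_token: str, url: str, params: Dict[str, str]) -> str:
    """Compute the expected signature for a form-encoded Twilio webhook."""
    data = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode(), data.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def is_valid_twilio_request(auth_token: str, url: str,
                            params: Dict[str, str], header_sig: str) -> bool:
    """Constant-time comparison against the X-Twilio-Signature header."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_sig)
```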

After the hang-up

The work that happens once the call ends

Heavy lifting moves to durable workers so the live path stays fast and the call's aftermath is reliable.

  • A summary, sentiment and outcome are generated immediately.
  • call.completed fans out to your workflows.
  • Signed webhooks fire — with retries on 5xx/timeout.
  • Recordings are pulled and attached to the call record.
  • Usage events roll up for accurate, real-time billing.

Webhook payload
POST https://your-app.com/hook
X-AURA-Event:     call.completed
X-AURA-Signature: sha256=…

{
  "event": "call.completed",
  "data": {
    "call_sid": "CA…",
    "outcome":  "answered",
    "duration": 142,
    "summary":  "Booked demo…"
  }
}
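
On the receiving side, a handler would normally check the signature before trusting the payload. The sketch below assumes X-AURA-Signature carries a hex HMAC-SHA256 of the raw request body, keyed with a shared webhook secret. That scheme is a common convention and an assumption here, so confirm the exact algorithm and secret handling before relying on it.

```python
import hashlib
import hmac
import json


def verify_aura_signature(secret: bytes, raw_body: bytes, header: str) -> bool:
    """Check a 'sha256=<hex>' header against an HMAC of the raw body (assumed scheme)."""
    scheme, _, received = header.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received)


def handle_webhook(secret: bytes, raw_body: bytes, header: str) -> dict:
    """Return the event data only if the signature checks out."""
    if not verify_aura_signature(secret, raw_body, header):
        raise PermissionError("bad webhook signature")
    event = json.loads(raw_body)
    return event["data"]
```

Verifying against the raw bytes, before any JSON parsing, avoids mismatches caused by re-serialization.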

Built to stay up

Resilient by design

The real-time path degrades gracefully, never abruptly — a provider hiccup is invisible to your caller.

Smart routing

Each turn picks the best LLM/STT/TTS by capability, cost and health — automatically.
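
As an illustration of routing by capability, cost and health, the sketch below filters providers to the healthy ones that support what the turn needs, then takes the cheapest. Provider names, capability labels and prices are invented for the example, and the real router may weigh these factors differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Provider:
    name: str
    capabilities: FrozenSet[str]
    cost_per_minute: float
    healthy: bool


def pick_provider(providers: Iterable[Provider], needed: FrozenSet[str]) -> Provider:
    """Cheapest healthy provider that covers every needed capability."""
    candidates = [p for p in providers
                  if p.healthy and needed <= p.capabilities]
    if not candidates:
        raise LookupError("no healthy provider supports the requested capabilities")
    return min(candidates, key=lambda p: p.cost_per_minute)


pool = [
    Provider("llm-a", frozenset({"chat", "tools"}), 0.030, True),
    Provider("llm-b", frozenset({"chat"}), 0.010, True),
    Provider("llm-c", frozenset({"chat", "tools"}), 0.020, False),
]
```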

Circuit breakers

A provider that errors or slows down is bypassed mid-call to a healthy fallback.
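
A circuit breaker in its simplest form counts consecutive failures, stops sending traffic once a threshold is hit, and lets a trial request through after a cool-down. The thresholds and class below are a generic sketch, not the product's implementation.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None   # None means the circuit is closed

    def allow(self) -> bool:
        """True if a request may be sent to this provider right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()        # open (or re-open) the circuit
```

When allow() returns False, the caller would route the request to a fallback provider instead.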

Retried delivery

Webhooks and post-call jobs retry with backoff and survive worker restarts.
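
Retry with backoff can be sketched in a few lines. The version below retries on timeouts and 5xx responses only and doubles the wait between attempts up to a cap. The specific delays and attempt count are assumptions, and surviving worker restarts would come from the task queue rather than from this loop.

```python
import time
from typing import Callable, List, Optional


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 4) -> List[float]:
    """Exponential delays in seconds, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def should_retry(status: Optional[int]) -> bool:
    """Retry on timeout (None) or any 5xx response; 2xx-4xx are final."""
    return status is None or 500 <= status <= 599


def deliver(send: Callable[[], Optional[int]], max_attempts: int = 5,
            sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Call `send` until it returns a final status or attempts run out."""
    delays = backoff_delays(attempts=max_attempts - 1)
    status: Optional[int] = None
    for attempt in range(max_attempts):
        status = send()
        if not should_retry(status):
            return status
        if attempt < max_attempts - 1:
            sleep(delays[attempt])     # wait before the next attempt
    return status                      # still failing after the last attempt
```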

Full observability

Prometheus metrics, OpenTelemetry traces and per-stage latency on every call.

Want the deep technical walkthrough?

Our engineers will take your team through the architecture and a live call.

Book a technical demo