Platform

Everything between the ring and the hang-up

Aura owns the entire stack — perception, reasoning and action — so your agents are fast, grounded, and able to actually get things done on a call.

01 · Real-time

A voice pipeline tuned to the millisecond

Streaming everything. The caller hears the first syllable of the reply before the model has finished generating it — that's where the sub-600ms feel comes from.

Voice activity detection

Silero VAD decides the exact instant a caller stops speaking — no awkward two-second wait.

Streaming STT & TTS

Deepgram interim transcripts in, ElevenLabs μ-law audio out — both streamed, never buffered to completion.

Barge-in & backchannels

Callers can interrupt naturally; the bot drops what it was saying. "Mm-hmm" backchannels signal it's still listening.
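As a rough illustration of barge-in, the sketch below cancels an in-flight playback task the moment the caller starts talking. It is a minimal sketch, not Aura's implementation: `speak`, `run_turn`, and the timings are hypothetical stand-ins, and the "caller speaks" signal is simulated with a sleep instead of a real VAD event.

```python
import asyncio

async def speak(chunks, played, chunk_seconds=0.05):
    """Hypothetical stand-in for streaming TTS playback: one chunk per time slice."""
    for chunk in chunks:
        await asyncio.sleep(chunk_seconds)
        played.append(chunk)

async def run_turn(chunks, caller_speaks_at):
    played = []
    playback = asyncio.create_task(speak(chunks, played))
    # Assumption: a real pipeline would wait on a VAD event here, not a timer.
    await asyncio.sleep(caller_speaks_at)
    playback.cancel()  # barge-in: drop the rest of the reply (no-op if already finished)
    try:
        await playback
    except asyncio.CancelledError:
        pass
    return played

reply = ["Sure,", "your", "order", "ships", "Friday."]
played = asyncio.run(run_turn(reply, caller_speaks_at=0.12))
print(played)  # a prefix of `reply`; the remainder was never played
```

Cancelling the task, rather than letting it finish and muting the output, is what keeps the bot from talking over the caller.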

Latency budget per turn

VAD endpoint: 80–150 ms

STT final: 50–120 ms

RAG (parallel): 60–150 ms

LLM first token: 150–250 ms

TTS first chunk: 80–200 ms

Target: <600ms user-stop → first audio byte
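To make the budget arithmetic concrete, the sketch below sums the stage ranges listed above. It assumes the remaining stages run strictly one after another and that RAG overlaps fully with them (it is marked "parallel"), so it is left off the critical path. The names `BUDGET_MS` and `critical_path` are illustrative only.

```python
# Per-turn latency ranges in milliseconds, copied from the budget above.
BUDGET_MS = {
    "vad_endpoint": (80, 150),
    "stt_final": (50, 120),
    "rag_parallel": (60, 150),
    "llm_first_token": (150, 250),
    "tts_first_chunk": (80, 200),
}

# Assumption: RAG runs alongside the other stages and never dominates them.
PARALLEL = {"rag_parallel"}

def critical_path(budget, parallel=PARALLEL):
    """Return (best, worst) totals for the stages that run in sequence."""
    serial = [rng for name, rng in budget.items() if name not in parallel]
    best = sum(lo for lo, _ in serial)
    worst = sum(hi for _, hi in serial)
    return best, worst

best, worst = critical_path(BUDGET_MS)
midpoint = (best + worst) // 2
print(best, midpoint, worst)  # 360 540 720
```

Under these assumptions the midpoint (540 ms) sits inside the 600 ms target, while the all-upper-bounds sum (720 ms) does not. The target therefore depends on stages not all hitting their upper bounds on the same turn, or on streaming overlap between them.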

02 · The Brain

Personas that know your business and act on it

Above the raw STT-LLM-TTS plumbing sits the Brain: persona, intent, retrieval, tools, memory and workflows working as one.

Configurable personas

Each agent has a system prompt, greeting, voice, language and allowed tools — version-managed, hot-editable.

RAG over pgvector

Per-agent and tenant-wide knowledge with cosine top-k retrieval, speculative pre-fetch and Redis caching for follow-ups.
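For readers unfamiliar with cosine top-k retrieval, here is a small in-memory sketch of the idea. In a pgvector deployment the ranking would normally happen in SQL (pgvector exposes cosine distance as the `<=>` operator). The toy vectors, document ids, and function names below are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query, docs, k=2):
    """Return the ids of the k documents most similar to the query embedding."""
    scored = sorted(((cosine(query, vec), doc_id) for doc_id, vec in docs.items()),
                    reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
docs = {
    "hours": [1.0, 0.0, 0.0],
    "refunds": [0.6, 0.8, 0.0],
    "shipping": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], docs))  # ['hours', 'refunds']
```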

Tools with human approval

Agents call your APIs mid-call. Sensitive actions queue an approval card for one-click accept/reject in the dashboard.

Caller & within-call memory

Cross-call profiles greet returning callers warmly; rolling summaries keep long calls inside the context window.

03 · The full stack

More than 30 capabilities, production-ready

Each is battle-tested in live, high-volume production deployments.

Inbound & outbound

Accept calls on assigned numbers routed by DID, or fire outbound campaigns over REST — same persona, same pipeline.

Live transcripts

Every turn streams to the dashboard over SSE/WebSocket. Watch, flag and QA calls as they happen.

Post-call summaries

Automatic summaries, sentiment and outcomes generated the moment a call ends — ready for your CRM.

40+ languages

Mid-call code-switch detection hot-swaps STT, LLM and TTS to the caller's language without dropping the thread.

Custom voice clones

Upload reference samples to clone a brand voice, or pick from a catalog of provider voices per agent.

Smart routing & fallback

The router picks the best LLM/STT/TTS per turn by capability, cost and health — circuit-breaking to fallbacks mid-call.
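One way to picture circuit-breaking to a fallback is the sketch below. It covers only the health dimension and deliberately omits capability scoring, cost scoring, and the cool-down that lets a tripped provider recover. `Breaker`, `call_with_fallback`, and the two fake providers are hypothetical names, not part of any Aura API.

```python
class Breaker:
    """Counts consecutive failures; 'open' means the provider is skipped."""
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def record(self, ok):
        self.failures = 0 if ok else self.failures + 1

def call_with_fallback(providers, breakers, request):
    """Try providers in priority order, skipping any whose breaker is open."""
    for name, fn in providers:
        breaker = breakers[name]
        if breaker.open:
            continue
        try:
            result = fn(request)
        except Exception:
            breaker.record(False)
            continue
        breaker.record(True)
        return name, result
    raise RuntimeError("all providers unavailable")

calls = {"primary": 0}

def primary(request):
    # Simulated unhealthy provider: every call times out.
    calls["primary"] += 1
    raise TimeoutError("primary is down")

def backup(request):
    return f"echo:{request}"

providers = [("primary", primary), ("backup", backup)]
breakers = {name: Breaker() for name, _ in providers}
results = [call_with_fallback(providers, breakers, "hi") for _ in range(4)]
print(results[-1], calls["primary"])  # ('backup', 'echo:hi') 3
```

After three consecutive failures the primary's breaker opens, so the fourth request goes straight to the backup without paying for another timeout.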

Warm escalation

When turn counts cross a threshold, agents warm-transfer the live call to a human escalation number.

Automated voice tests

Define test suites and replay scenarios to catch regressions before they reach a real caller.

Embeddable web widget

Drop a script tag for browser-side WebRTC calls — let visitors talk to an agent right from your site.

04 · Trust & control

Enterprise security, multi-tenant by design

Field-level encryption

Twilio credentials and tool secrets are Fernet-encrypted at rest.

Three auth schemes

JWT users, scoped X-API-Key integrators, and validated Twilio signatures.
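For context on the third scheme, the sketch below follows Twilio's documented signing procedure for form-encoded webhooks: append the POST parameters, sorted by key, to the full request URL, compute HMAC-SHA1 with the account auth token, and base64-encode the result. It is an illustration using only the standard library. Production code would normally rely on the `RequestValidator` in Twilio's helper library, and the token, URL, and parameters here are made up.

```python
import base64
import hashlib
import hmac

def twilio_signature(auth_token, url, params):
    """Compute the expected X-Twilio-Signature value for a form-encoded POST."""
    payload = url + "".join(key + value for key, value in sorted(params.items()))
    digest = hmac.new(auth_token.encode(), payload.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

def is_valid(auth_token, url, params, header_signature):
    """Constant-time comparison against the signature sent in the request header."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature)

# Hypothetical values for illustration only.
token = "test-auth-token"
url = "https://example.com/voice/inbound"
params = {"CallSid": "CA123", "From": "+15550100", "To": "+15550199"}

signature = twilio_signature(token, url, params)
print(is_valid(token, url, params, signature))                        # True
print(is_valid(token, url, {**params, "From": "+15550111"}, signature))  # False
```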

GDPR export & erase

One-click data export and right-to-erasure, with an append-only audit trail.

Observability

Prometheus metrics, OpenTelemetry traces and Sentry with PII scrubbing.

See the architecture →

See the platform on a real call

We'll build an agent on your knowledge base and call you live in the demo.

Book a demo

View pricing
