Skip to content
// LESSON 01 · FREE PREVIEWUNLOCKED

What is a voice agent, actually?

Strip away the hype — what a voice agent is, what it can do, and where it falls down. Pick a fight with the right problem before you start building.

// LOADING SIGNED URL

The actual definition

A voice agent is software that answers phone calls (or makes them) using an LLM as the conversation brain. Three pieces always fit together:

  1. ASR — Automatic Speech Recognition. Turns the caller's audio into text. Deepgram, Whisper, AssemblyAI.
  2. LLM — A language model. GPT-4o, Claude, etc. Reads the text and produces a response.
  3. TTS — Text-to-speech. Turns the response back into audio. ElevenLabs, Cartesia, OpenAI TTS.

A voice-agent platform (Vapi, Retell, Bland) is the orchestrator that wires these together with a phone provider (Twilio).

The three things voice agents are actually good at

  • Answering after-hours. The bar to beat is "voicemail." A halfway-decent voice agent that captures the caller's intent and the basics is already a 10× win.
  • Qualifying leads. "What service are you calling about, when do you need it, what's your zip?" Three questions, structured output, off to the CRM.
  • Booking appointments. Read available slots from a calendar, suggest two, confirm one, send the calendar invite.

The three things they're bad at

  • Free-form complaints. Anything emotional, anything requiring real empathy, anything where the caller is angry. Route to a human.
  • High-stakes diagnostics. "My AC is making a weird noise" — the agent can capture the symptom, but it should not be diagnosing or quoting.
  • Anything legally regulated. Medical advice, legal advice, financial advice. Hard stop.

Where this course is going

Over the next 11 lessons you'll build a real voice agent on Vapi that:

  • Answers calls on a real phone number
  • Qualifies inbound leads with a 4-question script
  • Books appointments against your Cal.com calendar
  • Pushes transcripts into a Supabase table
  • Triggers an SMS follow-up after every call
// NEXT LESSONPick Your Platform