Most inbound leads hit a contact form and wait. By the time you reply, they've already talked to someone else. We wanted to fix that for ourselves — and eventually offer it to clients.
The idea: a voice AI agent that picks up inbound calls, runs a structured intake conversation, and delivers a qualified brief before the first human meeting. Not a chatbot. Not an IVR tree. An actual conversational agent that listens, asks follow-ups, and summarizes what it learns.
Here's how we evaluated the platforms, designed the conversation, and planned the build.
Platform Comparison
We looked at three developer-focused voice AI platforms: Vapi, Bland.ai, and Retell AI. For our use case — low volume, inbound-triggered, context-gathering calls — the key differentiators were latency, setup complexity, LLM flexibility, and true cost per minute.
| | Vapi | Bland.ai | Retell AI |
|---|---|---|---|
| True Cost/Min | ~$0.13–$0.31 | ~$0.09–$0.15 | ~$0.07–$0.20 |
| Pricing Model | BYO everything — platform + STT + TTS + LLM + telephony billed separately | Plan-based tiers + per-minute usage | Pay-as-you-go, bundled voice + LLM + telephony |
| Latency | Sub-500ms | ~800ms | Sub-600ms |
| LLM Support | BYO: OpenAI, Claude, Gemini, custom | Built-in, limited choice | BYO: OpenAI, Claude, custom |
| Setup Complexity | High — full stack assembly | Medium — API-first | Low-Medium — visual builder + API |
| Best For | Max flexibility, deep eng resources | High-volume outbound campaigns | Fastest path to production inbound agent |
Our pick: Retell AI
For a single inbound intake agent, Retell is the best fit. The bundled pricing means no surprise bills from five different vendors. The visual builder plus API means a working agent in days, not weeks. And it supports Claude as the LLM backend, which we already know well.
At low volume — say 30 calls a month averaging 5 minutes each — total cost lands around $17–32/month. That's essentially free for a system that qualifies every inbound lead before you ever pick up the phone.
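As a rough check on the usage side of that estimate, here is a minimal sketch that multiplies monthly minutes by the per-minute ranges from the comparison table. The function name and structure are illustrative only. For Retell, usage alone works out to about $10.50 to $30 for 150 minutes, so the estimate above presumably also includes fixed charges such as a phone number; that is an assumption, not something the per-minute figures show.

```python
def monthly_cost(calls_per_month, avg_minutes, rate_low, rate_high):
    """Return (low, high) usage cost in dollars for one month of calls."""
    minutes = calls_per_month * avg_minutes
    return round(minutes * rate_low, 2), round(minutes * rate_high, 2)


# Per-minute ranges copied from the comparison table above.
# Fixed fees (phone numbers, plan tiers) are NOT modeled here.
RATES = {
    "Vapi": (0.13, 0.31),
    "Bland.ai": (0.09, 0.15),
    "Retell AI": (0.07, 0.20),
}

if __name__ == "__main__":
    for platform, (low, high) in RATES.items():
        lo_cost, hi_cost = monthly_cost(30, 5, low, high)
        print(f"{platform}: ${lo_cost:.2f} to ${hi_cost:.2f} per month")
```

Changing the call count or average duration shows how quickly the gap between platforms widens as volume grows.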
Runner-up: Vapi. If you later need maximum customization or want to offer white-label voice agents to clients, Vapi's flexibility becomes more valuable at scale.
Conversation Flow Design
The goal is to gather enough context so that by the time you get on the real call, you already know what they need, what their environment looks like, and what level of engagement they're expecting. The tone should be warm and curious — not interrogative.
Start with transparency that this is an AI, set a time expectation, and get verbal consent to proceed. If they want a human instead, route straight to calendar booking.
Let them talk. The agent uses active listening cues and follows up naturally. If they mention a specific problem, dig deeper. If they're vague, offer structure: "Are you more focused on automating internal operations, or is this customer-facing?"
When asking about their current tech stack, remember that not everyone will be technical. If they seem unsure, pivot: "No worries on the technical details — we can dig into that together. Do you have an internal IT team, or would you be looking for end-to-end support?"
Follow up on deadline drivers, who else is involved in the decision, and whether they have a budget range in mind. The budget question is optional — only ask if the conversation naturally goes there.
Once the caller confirms the agent's recap of what it heard, offer to book a call directly or send a scheduling link. The agent generates a structured brief and attaches it to the calendar invite.
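One way to keep a flow like this explicit is to write the stages down as data before building anything in a platform's visual editor. The sketch below is illustrative only: the stage names (`opening`, `discovery`, and so on) and the `next_stage` helper are hypothetical labels for the steps described above, not part of any platform's API. It also assumes the caller can ask for a human at any stage, which goes slightly beyond the opening-only handoff described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str                    # hypothetical label for this step
    goal: str                    # what the agent should accomplish here
    optional_topics: tuple = ()  # only raised if the conversation goes there


# Stage names are illustrative; the goals paraphrase the steps above.
FLOW = (
    Stage("opening", "Disclose the AI, set a time expectation, get verbal consent"),
    Stage("discovery", "Let the caller talk; dig deeper or offer structure"),
    Stage("environment", "Ask about stack and IT team; ease off if they seem unsure"),
    Stage("timeline", "Deadline drivers and decision makers", ("budget",)),
    Stage("wrap_up", "Confirm the recap, then book a call or send a scheduling link"),
)


def next_stage(current, wants_human=False):
    """Return the next stage name, 'calendar_booking' on handoff, or None at the end."""
    if wants_human:
        # Assumption: a request for a human is honored at any stage.
        return "calendar_booking"
    names = [stage.name for stage in FLOW]
    idx = names.index(current)
    return names[idx + 1] if idx + 1 < len(names) else None
```

Keeping the flow as plain data makes it easy to review the stage order and the optional topics with non-engineers before translating it into prompts.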
Post-Call Data Pipeline
This is where the real operational value lives. After each call, the agent generates a structured brief that lands in your system before the meeting happens.
1. Transcript
Full call transcript stored automatically. Searchable and referenceable.
2. Structured Summary
AI-generated brief: name, company, pain points, tech stack, timeline, decision makers, budget signals.
3. Delivery
Pushed via webhook to Google Doc, CRM, email, or Slack — wherever you need it before the real call.
4. Calendar
Meeting booked with the structured summary attached to the calendar invite description.
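The four steps above could share a single record type. The following is a minimal sketch, assuming the brief carries exactly the fields listed in the structured summary step. The class name, field names, and `transcript_url` pointer are hypothetical, and the placeholder values in the usage lines are invented for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class IntakeBrief:
    name: str
    company: str
    pain_points: list
    tech_stack: list
    timeline: str
    decision_makers: list
    budget_signals: str = ""   # optional: empty if the topic never came up
    transcript_url: str = ""   # hypothetical pointer to the stored transcript

    def to_payload(self) -> str:
        """Serialize the brief as JSON, e.g. for a webhook body."""
        return json.dumps(asdict(self), indent=2)

    def to_calendar_description(self) -> str:
        """Render the brief as plain text for a calendar invite description."""
        lines = [
            f"Contact: {self.name}, {self.company}",
            "What they need: " + "; ".join(self.pain_points),
            "Current stack: " + ", ".join(self.tech_stack),
            f"Timeline: {self.timeline}",
            "Decision makers: " + ", ".join(self.decision_makers),
        ]
        if self.budget_signals:
            lines.append(f"Budget signals: {self.budget_signals}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values, purely illustrative.
    brief = IntakeBrief(
        name="Jane Doe",
        company="Example Co",
        pain_points=["Manual intake calls"],
        tech_stack=["Example CRM"],
        timeline="Proof of concept next quarter",
        decision_makers=["CTO"],
    )
    print(brief.to_payload())
    print(brief.to_calendar_description())
```

Generating the JSON payload and the invite text from the same record keeps the Slack, CRM, and calendar copies of the brief from drifting apart.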
Example: What You'd See Before Your Call
Contact: Sarah Chen, VP of Operations, Greenfield Health (regional healthcare network, ~200 employees)
What they need: Automate patient intake and appointment scheduling calls. Currently handling ~500 calls/day with a 12-person team. Wants to reduce staffing costs while improving after-hours availability.
Current stack: Epic EHR, AWS-based infrastructure, Twilio for existing phone system. Internal IT team of 3.
Timeline: Board presentation in Q2. Wants a proof of concept by mid-April. Decision involves CTO and CFO.
Budget signals: Current call center costs "north of $40K/month." Open to phased approach.
Suggested talking points: HIPAA compliance approach, Epic integration feasibility, ROI model comparing current staffing vs. AI agent deployment, phased rollout starting with after-hours calls.