Voice Agents quickstart | Speechify API

Get your API key

Sign up at console.speechify.ai
Go to API Keys
Copy your default API key (or create a new one)

$ export SPEECHIFY_API_KEY="your-api-key-here"

Install the SDK

The official Python and TypeScript SDKs auto-generate against the same OpenAPI spec — every method below is type-checked and version-pinned. Both read SPEECHIFY_API_KEY from the environment automatically.

Python

TypeScript

cURL

$ pip install speechify-api

Create an agent

An agent bundles a prompt, a voice, and a default LLM. Voice IDs come from the regular /v1/voices catalog — anything that works for TTS works for Voice Agents, including your cloned voices.

Python

TypeScript

cURL

1 from speechify import Speechify
2 
3 client = Speechify()
4 
5 agent = client.tts.agents.create(
6     name="Support Bot",
7     prompt="You are a friendly support agent for a SaaS product. "
8            "Greet callers, answer questions about billing and account "
9            "settings, and transfer to a human if you cannot help.",
10     first_message="Hi, this is Sabrina with support. How can I help today?",
11     voice_id="sabrina",
12     language="en",
13     temperature=0.7,
14 )
15 print(agent.id)

Start a conversation

POST /v1/agents/{id}/conversations provisions a realtime voice session, dispatches the agent, and returns a short-lived access token. The caller connects directly to the session with that token — audio never flows through our server.

Python

TypeScript

cURL

1 session = client.tts.agents.create_conversation(id=agent.id)
2 print(session.url, session.token)  # pass these to your browser/SDK

The response shape:

1 {
2   "conversation": { "id": "…", "agent_id": "…", "status": "pending", "transport": "web", "…": "…" },
3   "room":  "conv_<agent>_<user>_<ts>",
4   "token": "eyJhbGc…",
5   "url":   "wss://…"
6 }

Embed it on your site

The fastest way to hear your agent on an actual web page is the drop-in web component:

1 <script src="https://api.speechify.ai/v1/widget/agents.js"></script>
2 <speechify-agent agent-id="<agent.id>"></speechify-agent>

That’s the whole integration for a public agent. Enable the Public toggle on the agent’s Embed tab in the console, add the site’s origin to the allowlist, and the widget works unauthenticated — your API key stays on the server.

For private agents, mint a session token on your backend and pass it to the widget via session-token + session-url attributes. Full details in Embed.

Test it from the console

The quickest path to hearing the agent without writing any integration at all: open the agent in the console, click Test Call, and talk.

Inspecting conversations

Every turn is streamed to the control plane and persisted with timestamps.

Python

TypeScript

cURL

1 # List recent conversations for this account
2 convs = client.tts.conversations.list()
3 
4 # Fetch one, plus its transcript and post-call evaluation
5 conv = client.tts.conversations.get(conv_id)
6 messages = client.tts.conversations.list_messages(conv_id)
7 evals = client.tts.conversations.list_evaluations(conv_id)

Next steps

Attach tools

Give the agent access to your backend, the caller’s device, or built-in actions like end_call.

Listen for events

Receive conversation.started, conversation.ended, message.created webhooks.

Clone your own voice

Use a custom voice with your agents.

API Reference

Full schemas for /v1/agents, /v1/tools, /v1/conversations.