Text to Speech API

Lifelike speech in 50+ languages from a single API call. Stream long-form audio, clone any voice from a 10-30 second sample, and control delivery with SSML.

Your first request

Python
from speechify import Speechify

client = Speechify()  # reads SPEECHIFY_API_KEY from the environment
response = client.tts.audio.speech(
    input="Welcome to Speechify.",
    voice_id="george",
    audio_format="mp3",
)

with open("welcome.mp3", "wb") as f:
    f.write(response.audio_data)

Grab a key at console.speechify.ai/api-keys and set SPEECHIFY_API_KEY in your environment. Then walk through the Quickstart for the full five-minute tour.

Set up

Install an SDK

Run pip install speechify-api for Python or npm install @speechify/api for TypeScript. Both SDKs read SPEECHIFY_API_KEY from the environment automatically.

Authenticate

A single Authorization: Bearer key works for every endpoint. Manage and rotate keys in the console.
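
To make the header concrete, here is a minimal sketch that builds (but does not send) a raw HTTP request with the Python standard library. The endpoint URL is a placeholder and the JSON field names simply mirror the SDK arguments shown earlier; both are assumptions, so check the API Reference for the real values.

```python
import json
import os
import urllib.request

# Assumption: placeholder endpoint for illustration only; see the API Reference.
SPEECH_URL = "https://api.example.com/v1/audio/speech"


def build_speech_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the Bearer key."""
    # Assumption: the body fields mirror the SDK keyword arguments.
    body = json.dumps({"input": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        SPEECH_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_speech_request(
    "Welcome to Speechify.",
    "george",
    os.environ.get("SPEECHIFY_API_KEY", "YOUR_API_KEY"),
)
print(request.get_header("Authorization").split()[0])  # prints "Bearer"
```

Keeping the key in an environment variable rather than in source code makes rotation in the console a configuration change instead of a code change.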

Build with TTS

Streaming

Start playback before the full audio is generated. Up to 20,000 characters per request.
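
For text longer than the per-request limit, one possible approach is to split it at sentence boundaries before sending each piece. The sketch below is an illustration, not part of the SDK: split_for_tts is a hypothetical helper, and it assumes the limit is counted the same way Python counts string length.

```python
import re

MAX_CHARS = 20_000  # per-request limit stated above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # The sentence does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two! Three?", limit=10))  # prints ['One. Two!', 'Three?']
```

Each chunk can then be sent as its own request and the resulting audio played or concatenated in order.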

Voice cloning

Clone any voice from a 10-30 second sample. Cloned voices work across every supported language.
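
Before uploading a sample, it can help to check its length locally. The sketch below does this for WAV files using only the standard library. The helper names are hypothetical, and WAV is assumed only because Python can read it without extra packages; this page does not say which formats the API accepts.

```python
import io
import wave

MIN_SECONDS, MAX_SECONDS = 10.0, 30.0  # sample length range stated above


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_clone_sample(source) -> bool:
    """Check that the sample length falls inside the 10-30 second range."""
    return MIN_SECONDS <= wav_duration_seconds(source) <= MAX_SECONDS


# Usage: a 12-second silent mono clip built in memory stands in for a real recording.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as clip:
    clip.setnchannels(1)
    clip.setsampwidth(2)  # 16-bit samples
    clip.setframerate(8000)
    clip.writeframes(b"\x00\x00" * 8000 * 12)
buffer.seek(0)
print(is_valid_clone_sample(buffer))  # prints True
```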

SSML and emotion

Fine-grained control over pitch, rate, pauses, emphasis, and 13 emotion presets.
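
As a rough illustration of what SSML input can look like, the sketch below assembles a small document from standard W3C SSML elements (speak, prosody, break, emphasis). The helper functions are hypothetical, and which elements and attribute values the API accepts is an assumption to verify against the SSML guide. Emotion presets are left out because their syntax is not shown on this page.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text: str, rate: str = "medium", pitch: str = "medium") -> str:
    """Wrap text in a <prosody> element controlling rate and pitch."""
    return f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>{escape(text)}</prosody>"


def pause(milliseconds: int) -> str:
    """Insert a pause of the given length."""
    return f'<break time="{milliseconds}ms"/>'


def emphasis(text: str, level: str = "moderate") -> str:
    """Wrap text in an <emphasis> element."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def speak(*parts: str) -> str:
    """Join already-built SSML fragments under a single <speak> root."""
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    prosody("Welcome to Speechify.", rate="slow", pitch="+10%"),
    pause(500),
    emphasis("Let's get started."),
)
print(ssml)
```

Escaping the text with xml.sax.saxutils.escape keeps characters such as & and < from breaking the markup.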

Speech marks

Word-level timestamps for highlighting, captions, and audio-text sync.
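
One way word-level timestamps could feed captions is sketched below: group the marks and emit SubRip (SRT) text. The field names value, start_time, and end_time (in milliseconds) and the sample timings are made up for illustration; the actual response shape is documented in the API Reference.

```python
def to_srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def speech_marks_to_srt(marks: list[dict], words_per_caption: int = 4) -> str:
    """Group word-level marks into numbered SRT captions."""
    captions = []
    for index in range(0, len(marks), words_per_caption):
        group = marks[index:index + words_per_caption]
        start = to_srt_timestamp(group[0]["start_time"])
        end = to_srt_timestamp(group[-1]["end_time"])
        text = " ".join(mark["value"] for mark in group)
        captions.append(f"{len(captions) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(captions)


# Hypothetical marks with invented timings, for illustration only.
marks = [
    {"value": "Welcome", "start_time": 0, "end_time": 420},
    {"value": "to", "start_time": 420, "end_time": 560},
    {"value": "Speechify.", "start_time": 560, "end_time": 1310},
]
print(speech_marks_to_srt(marks, words_per_caption=2))
```

The same start and end times can drive word highlighting in a reader by comparing them with the audio player's current position.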

Models and languages

Two models cover every use case. simba-english is the flagship English model: highest quality, lowest streaming latency, and full SSML and emotion control. simba-multilingual handles 50+ languages, including mixed-language input. The same voice IDs work across every language, so no separate cloning is required.

See Models and Language Support for the full matrix.
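
The choice between the two models can be expressed as a small rule. The sketch below assumes languages are identified by BCP 47-style tags such as "en-US"; pick_model is a hypothetical helper, and how the model name is passed to the SDK is not shown on this page.

```python
def pick_model(language_codes: list[str]) -> str:
    """Return simba-english for English-only input, simba-multilingual otherwise.

    An empty list is treated as English-only.
    """
    english_only = all(code.lower().split("-")[0] == "en" for code in language_codes)
    return "simba-english" if english_only else "simba-multilingual"


print(pick_model(["en-US"]))        # prints simba-english
print(pick_model(["en", "fr-FR"]))  # prints simba-multilingual
```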

Resources

API Reference

Full endpoint schemas, parameters, and response shapes.

Examples

End-to-end demo projects on GitHub.

Console

Manage API keys, voices, and billing.