Getting started
Learn how to set up and make your first request with the Speechify AI API
Getting started quickly with the Speechify AI API involves:
- Making the first API request
- Determining your use-case and selecting the proper authentication mechanism
- Calling the API directly or through our official SDKs
Quickstart
Register your account
- Go to the Speechify Console and sign up with your email and password or a social auth service.
- Navigate to the API Keys section and confirm that a default API key has been created for you. If not, create one yourself.
In the Interactive Playground, you can experiment with different voices and audio generation settings to get a sense of what the API is capable of. You can also make a digital clone of your own voice.
Make your first API request
To start talking to our API from code you will need:
- The API base URL: https://api.sws.speechify.com/
- Your API key
- An HTTP client of your choice (e.g. curl for shell scripts, fetch for Node.js)
Check the API Reference for more information about the available endpoints.
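As a minimal sketch of how these three pieces fit together, the helper below assembles a request description from the base URL and an API key. The endpoint path, the voice_id field, the placeholder voice ID, and the Bearer scheme are assumptions not stated in this guide; confirm them in the API Reference before relying on them.

```typescript
// Base URL as given in this guide.
const BASE_URL = "https://api.sws.speechify.com/";

// Assumption: the path of the speech endpoint. Verify it in the API Reference.
const SPEECH_PATH = "v1/audio/speech";

// A plain description of the HTTP request, independent of any HTTP client.
interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildSpeechRequest(apiKey: string, input: string, voiceId: string): SpeechRequest {
  return {
    url: BASE_URL + SPEECH_PATH,
    method: "POST",
    headers: {
      // Assumption: the API key is sent with the Bearer scheme.
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // `input` is the parameter named in this guide; `voice_id` is an assumed field name.
    body: JSON.stringify({ input, voice_id: voiceId }),
  };
}

// "george" is a placeholder voice ID used only for illustration.
const firstRequest = buildSpeechRequest("YOUR_API_KEY", "Hello, world!", "george");
```

The resulting object can be handed to an HTTP client, for example fetch(firstRequest.url, firstRequest) in Node.js 18 or later.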
Authentication mechanisms
Select the authentication method that matches your application type, following the client types defined in RFC 6749.
Refer to our Authentication guide for a detailed explanation of the authentication mechanisms.
Confidential Clients
Public Clients
Server applications and other confidential clients can use API Keys.
Examples include:
- Server-only web applications
- Voice agent applications
- Internal CLI tools
Implementation options
Direct HTTP calls
Official SDK
If you’re a seasoned web developer familiar with HTTP APIs, refer to our full OpenAPI documentation.
No matter which authentication mechanism you’re using, you’ll pass the API Key or Access Token in the Authorization header of each request.
Requests without a valid header are rejected with a 401 Unauthorized status.
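The header and the 401 case can be sketched as two small helpers. This assumes the Bearer scheme applies to both API keys and access tokens, which the Authentication guide should confirm; the function names are hypothetical and not part of any official SDK.

```typescript
// Assumption: API keys and access tokens are both sent with the Bearer scheme.
function authorizationHeader(credential: string): Record<string, string> {
  const trimmed = credential.trim();
  if (trimmed === "") {
    // Fail early instead of sending a request that is certain to be rejected.
    throw new Error("Missing API key or access token");
  }
  return { Authorization: `Bearer ${trimmed}` };
}

// Turns a 401 status into an actionable error; every other status passes through.
function ensureAuthorized(status: number): void {
  if (status === 401) {
    throw new Error("401 Unauthorized: check the Authorization header value");
  }
}
```

A caller would merge authorizationHeader(...) into the request headers and call ensureAuthorized(response.status) before reading the response body.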
Working with SSML
The input parameter of the audio generation endpoints (speech, stream) supports both plain text and SSML.
Plain text (simple use cases)
For basic use cases, you can send plain text:
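A plain-text request body might look like the sketch below. Only the input parameter comes from this guide; the voice_id field and its value are assumed placeholders.

```typescript
// `input` carries the text to synthesize, sent as-is with no markup.
// Assumption: `voice_id` selects the voice; "george" is a placeholder value.
const plainTextBody = JSON.stringify({
  input: "Hello! This sentence is sent as plain text.",
  voice_id: "george",
});
```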
This works but doesn’t provide fine-grained control over speech synthesis.
SSML (advanced control)
For more control, wrap your input in Speech Synthesis Markup Language (SSML):
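One way to do the wrapping is sketched below. It relies on the standard SSML root element, speak, and escapes XML special characters so that user-supplied text cannot break the markup. The helper names are hypothetical, and the SSML documentation remains the authority on what the API accepts.

```typescript
// Escape the five XML special characters so arbitrary text is safe inside SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wrap plain text in the SSML root element.
function wrapInSpeak(text: string): string {
  return `<speak>${escapeXml(text)}</speak>`;
}

const ssmlInput = wrapInSpeak("Tom & Jerry");
// ssmlInput is "<speak>Tom &amp; Jerry</speak>"
```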
SSML offers precise control over tone, emphasis, and emotional delivery using tags like <prosody>, <break>, and <emphasis>.
For example, to change speech rate:
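A hedged sketch: the helper below wraps text in a prosody element with a rate attribute. The rate keywords come from the SSML specification; whether the API honors every one of them is an assumption to check against the SSML documentation.

```typescript
// Rate keywords defined by the SSML specification.
type SpeechRate = "x-slow" | "slow" | "medium" | "fast" | "x-fast";

// Builds an SSML document that speaks `text` at the given rate.
// `text` is assumed to be already XML-escaped.
function withRate(text: string, rate: SpeechRate): string {
  return `<speak><prosody rate="${rate}">${text}</prosody></speak>`;
}

const slowSsml = withRate("Take your time with this sentence.", "slow");
```

The resulting string is sent as the value of the input parameter, in place of plain text.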
For an in-depth exploration of SSML capabilities, visit our SSML documentation.
Cloned voices
This is an advanced feature only available to paying customers.
Speechify provides both standard voices and the ability to create a digitized version of any human voice, including your own.
Experiment with cloned voices directly in your browser using the Speechify Playground.
Upload or record a voice sample, and the new entry will appear in the voice selection menu.
Create custom voices programmatically via an API call.
Use the resulting voice IDs for speech synthesis.
Happy building with Speechify’s Text-to-Speech API!