Getting started | Speechify API

Getting started quickly with the Speechify AI API involves:

Making the first API request
Determining your use-case and selecting the proper authentication mechanism
Calling the API directly or through our official SDKs

Quickstart

Register your account

Go to Speechify Console and sign up with your email/password or a social auth service.
Navigate to the API Keys section and confirm that a default API key has been created for you. If not, create one yourself.

In the Interactive Playground, you can experiment with different voices and audio generation settings to get a sense of what the API is capable of. You can also make a digital clone of your own voice.

Make your first API request

To start talking to our API from code you will need:

The API base URL: https://api.sws.speechify.com/
Your API key
An HTTP client of your choice (e.g. curl for shell scripts, fetch for Node.js)

Check the API Reference for more information about the available endpoints.
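As a minimal sketch of a first request, the Python snippet below builds a POST request with the standard library only. The base URL, the `input` parameter, and the Bearer header format come from this guide. The `/v1/audio/speech` path, the `voice_id` field name, and the `example-voice-id` value are assumptions for illustration, so confirm them against the API Reference.

```python
import json
import urllib.request

BASE_URL = "https://api.sws.speechify.com"  # base URL from this guide
SPEECH_PATH = "/v1/audio/speech"  # assumed path; check the API Reference


def build_speech_request(api_key: str, text: str, voice_id: str) -> urllib.request.Request:
    """Build (but do not send) a speech synthesis request."""
    payload = {
        "input": text,         # plain text or SSML
        "voice_id": voice_id,  # assumed field name
    }
    return urllib.request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speech_request(
    "YOUR_API_KEY", "Hello, this is Speechify API", "example-voice-id"
)
# To send it, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network traffic happens.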

Authentication mechanisms

Select the authentication method that fits your application type, following the client types defined in RFC 6749.

Please refer to our Authentication guide for a detailed explanation of the authentication mechanisms.

Confidential Clients

Server applications and other confidential clients can use API Keys.

Examples include:

Server-only web applications
Voice agent applications
Internal CLI tools

Implementation options

Direct HTTP calls

If you’re a seasoned web developer familiar with HTTP APIs, refer to our full OpenAPI documentation.

No matter which authentication mechanism you’re using, you’ll pass the API Key or Access Token in the Authorization header of each request:

Authorization: Bearer YOUR_API_KEY_OR_ACCESS_TOKEN

Requests without a valid Authorization header are rejected with a 401 Unauthorized status.
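The header format and the 401 behaviour can be sketched in Python as follows. The helper names `auth_headers` and `send` are hypothetical and not part of any Speechify SDK. The sketch assumes only that a missing or invalid credential produces an HTTP 401, as stated above.

```python
import urllib.error
import urllib.request


def auth_headers(credential: str) -> dict:
    """Return the Authorization header for an API key or access token."""
    return {"Authorization": f"Bearer {credential}"}


def send(request, opener=urllib.request.urlopen):
    """Send a request, turning a 401 response into a clearer error."""
    try:
        return opener(request)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            raise PermissionError(
                "401 Unauthorized: check the API key or access token"
            ) from err
        raise  # other HTTP errors are passed through unchanged
```

The `opener` parameter exists only so the function can be exercised without network access. In normal use, call `send(request)` with a `urllib.request.Request` that carries the headers from `auth_headers`.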

Working with SSML

The input parameter of the audio generation endpoints (speech, stream) supports both plain text and SSML.

Plain text (simple use cases)

For basic use cases, you can send plain text:

"Hello, this is Speechify API"

This works but doesn’t provide fine-grained control over speech synthesis.

SSML (advanced control)

For more control, wrap your input in Speech Synthesis Markup Language (SSML):

<speak>Your content to be synthesized here</speak>

SSML offers precise control over tone, emphasis, and emotional delivery using tags like <prosody>, <break>, and <emphasis>.

For example, to change speech rate:

<speak>
  <prosody rate="slow">This text will be spoken slowly.</prosody>
  <prosody rate="fast">This text will be spoken quickly.</prosody>
</speak>

For an in-depth exploration of SSML capabilities, visit our SSML documentation.
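When SSML is assembled from arbitrary text, characters such as `&` and `<` need escaping because SSML is XML. The following Python sketch wraps text in `<speak>` and optionally in a `<prosody>` element. The helper name `to_ssml` is hypothetical, and the snippet uses only the standard library.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, rate: Optional[str] = None) -> str:
    """Wrap text in <speak>, optionally inside a <prosody rate=...> element."""
    body = escape(text)  # escapes &, < and > so the markup stays valid XML
    if rate is not None:
        body = f"<prosody rate={quoteattr(rate)}>{body}</prosody>"
    return f"<speak>{body}</speak>"


print(to_ssml("Tom & Jerry", rate="slow"))
# <speak><prosody rate="slow">Tom &amp; Jerry</prosody></speak>
```

The resulting string can be passed as the `input` parameter in place of plain text.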

Cloned voices

This is an advanced feature only available to paying customers.

Speechify provides both standard voices and the ability to create a digitized version of any human voice, including your own.

Browser interface

Experiment with cloned voices directly in your browser using the Speechify Playground.

Upload or record a voice sample, and the new entry will appear in the voice selection menu.

API integration

Create custom voices programmatically via an API call.

Use the resulting voice IDs for speech synthesis.

Happy building with Speechify’s Text-to-Speech API!