Instant Voice Cloning | Speechify API

Overview

Instant voice cloning is one of Speechify API’s core features. It lets you create a voice from a short audio sample and then use that cloned voice in speech synthesis requests. There is no limit to the number of voices you can clone.

Voice Sample Requirements

Optimal Voice Sample

Duration: 10-30 seconds (keep under one minute)
File size: Below 5MB for pre-recorded samples
Quality: Clear speech with minimal background noise
Language: Must be in one of the supported languages
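
The size and duration limits above can be checked locally before uploading. The following is a minimal sketch, not part of the Speechify API: the function names are hypothetical, "5MB" is assumed to mean 5 * 1024 * 1024 bytes, and the file helper assumes an uncompressed WAV sample so that Python's standard `wave` module can read its length.

```python
import os
import wave
from typing import List

MAX_BYTES = 5 * 1024 * 1024    # "below 5MB"; binary megabytes are an assumption
IDEAL_SECONDS = (10.0, 30.0)   # recommended duration range from the list above
MAX_SECONDS = 60.0             # "keep under one minute"


def check_sample(size_bytes: int, duration_s: float) -> List[str]:
    """Return a list of problems; an empty list means the sample looks fine."""
    problems = []
    if size_bytes >= MAX_BYTES:
        problems.append("file is 5MB or larger")
    if duration_s >= MAX_SECONDS:
        problems.append("sample is one minute or longer")
    elif not IDEAL_SECONDS[0] <= duration_s <= IDEAL_SECONDS[1]:
        problems.append("duration is outside the recommended 10-30 seconds")
    return problems


def check_wav_file(path: str) -> List[str]:
    """Measure a WAV file on disk and run the same checks (WAV only)."""
    with wave.open(path, "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()
    return check_sample(os.path.getsize(path), duration_s)
```

Audio quality and language cannot be verified this way; those still need a listen before you submit the sample.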

The cloned voice will capture many aspects of the sample, including accent, speaking style, and tone. Using a good microphone is recommended for optimal results.

If you need sample text for recording, try reading this:

Listening is like riding a storytelling rollercoaster, where you can lean back and enjoy the ride without having to steer.

The speaker’s voice becomes your trusty guide, leading you through twists and turns. It’s like having a personal audiobook adventure just for you!

So, buckle up, and let the fun begin!

API Implementation

Create a cloned voice

Send a POST request to https://api.sws.speechify.com/v1/voices with the audio data, voice name, and consent confirmation.

You must provide consent that the voice belongs to you or to someone you represent.
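
As a rough illustration of the request described above, the sketch below builds (but does not send) a multipart POST using only the Python standard library. Only the URL comes from this page. The form field names (`name`, `sample`, `consent`), the shape of the consent JSON, and the Bearer authorization header are assumptions; confirm them against the API reference before relying on them.

```python
import json
import uuid
import urllib.request

VOICES_URL = "https://api.sws.speechify.com/v1/voices"


def build_create_voice_request(api_key, name, audio, filename, full_name, email):
    """Build a multipart/form-data POST carrying the sample, name, and consent."""
    boundary = uuid.uuid4().hex
    # Assumed consent format: a JSON string naming the person giving consent.
    consent = json.dumps({"fullName": full_name, "email": email})

    parts = []
    for field, value in (("name", name), ("consent", consent)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The audio bytes go in a file part; "sample" is an assumed field name.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="sample"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        VOICES_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The response presumably includes the new voice's ID, but its exact field name is not stated on this page.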


Use the cloned voice

The voice will appear in the voices list alongside shared voices provided by Speechify.

Use this voice’s ID as the voice_id parameter in requests to the speech generation endpoints.
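
A speech request with a cloned voice might look like the sketch below. The `voice_id` parameter is named on this page; the endpoint path `/v1/audio/speech`, the `input` field for the text, and the Bearer header are assumptions made for illustration. The function only builds the request so it can be inspected before sending.

```python
import json
import urllib.request

# Assumed path; this page only says "speech generation endpoints".
SPEECH_URL = "https://api.sws.speechify.com/v1/audio/speech"


def build_speech_request(api_key, voice_id, text):
    """Build a JSON POST that asks for `text` spoken in the cloned voice."""
    payload = {
        "input": text,         # assumed name of the text field
        "voice_id": voice_id,  # ID of the cloned voice
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```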

Delete the voice (optional)

You can remove personal voices using the voices deletion API.
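
Deletion could be sketched as follows, assuming the API follows the common REST pattern of `DELETE` on the voice's own URL under `/v1/voices`. That path shape and the Bearer header are assumptions, not confirmed by this page.

```python
import urllib.parse
import urllib.request

VOICES_URL = "https://api.sws.speechify.com/v1/voices"


def build_delete_voice_request(api_key, voice_id):
    """Build a DELETE request for one personal voice (assumed URL shape)."""
    # Percent-encode the ID so unusual characters cannot alter the path.
    url = f"{VOICES_URL}/{urllib.parse.quote(voice_id, safe='')}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
```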

Using the Speechify Console

You can clone voices directly through the Speechify Console UI. The console provides two options:

Upload Sample

Import an existing audio file containing the voice sample.

Record Sample

Record a new voice sample directly from your browser.

Once created, the voice will appear in your voices list, allowing you to test it immediately.