Voice Cloning

Create custom voices from short audio samples

Overview

Clone any voice from a 10-30 second audio sample. The cloned voice captures accent, speaking style, and tone. There’s no limit to the number of voices you can create.

Requirements

Parameter    Requirement
Duration     10-30 seconds (under 1 minute)
File size    Under 5MB
Quality      Clear speech, minimal background noise
Language     Any supported language

Use a good microphone and a quiet room. The clone quality depends directly on sample quality.
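The limits in the table can be checked locally before uploading. The following is a minimal sketch, not part of the Speechify SDK: the function names are hypothetical, "5MB" is assumed to mean 5 * 1024 * 1024 bytes, and the duration helper only handles uncompressed WAV files because it relies on the Python standard library.

```python
import os
import wave

MAX_BYTES = 5 * 1024 * 1024   # assumption: "Under 5MB" means under 5 MiB
MAX_SECONDS = 60.0            # "under 1 minute"
RECOMMENDED = (10.0, 30.0)    # the 10-30 second window from the table


def check_sample(size_bytes, duration_seconds=None):
    """Return a list of problems; an empty list means the sample looks OK."""
    problems = []
    if size_bytes >= MAX_BYTES:
        problems.append("file is 5MB or larger")
    if duration_seconds is not None:
        if duration_seconds >= MAX_SECONDS:
            problems.append("sample is 1 minute or longer")
        elif not RECOMMENDED[0] <= duration_seconds <= RECOMMENDED[1]:
            problems.append("sample is outside the recommended 10-30 seconds")
    return problems


def wav_duration(path):
    """Duration in seconds of an uncompressed WAV file (stdlib only)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_wav_file(path):
    """Run both checks against a WAV file on disk."""
    return check_sample(os.path.getsize(path), wav_duration(path))
```

For MP3 or other compressed formats, the duration would have to come from an audio library; passing only the file size still checks the size limit.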

Create a cloned voice

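# Assumes `client` is an already-initialized Speechify SDK client.
with open("voice-sample.mp3", "rb") as sample:
    voice = client.tts.voices.create(
        name="My Voice",
        sample=sample,
        consent=True,  # confirms you have the right to clone this voice
    )

print(f"Created voice: {voice.id}")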

The consent parameter is required: it confirms that the voice belongs to you or to someone you represent.

Use the cloned voice

Pass the voice ID from the creation response to any speech endpoint:

response = client.tts.audio.speech(
    input="This is my cloned voice speaking.",
    voice_id=voice.id,
    audio_format="mp3",
)

Cloned voices also appear in the voices list alongside Speechify’s built-in voices.
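To confirm that a newly cloned voice shows up in that list, a lookup by ID is enough. This is a minimal sketch using stand-in data: the helper name, the sample IDs, and the `name` attribute are hypothetical. With the SDK the list would come from a list call (possibly `client.tts.voices.list()`, which this page does not confirm).

```python
from types import SimpleNamespace


def find_voice(voices, voice_id):
    """Return the first voice whose id matches, or None if it is absent."""
    return next((v for v in voices if v.id == voice_id), None)


# Stand-in objects for illustration only; real entries would come from the API.
voices = [
    SimpleNamespace(id="builtin-1", name="Built-in"),
    SimpleNamespace(id="my-voice-123", name="My Voice"),
]

match = find_voice(voices, "my-voice-123")
print(match.name if match else "not found")
```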

Delete a voice

Remove voices you no longer need:

client.tts.voices.delete(voice.id)
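The create, use, and delete calls on this page can be combined so that a short-lived clone is always cleaned up, even when synthesis fails. The sketch below uses only the three calls shown above. The function name and the "Temporary Voice" label are hypothetical, and `client` is assumed to be an initialized SDK client passed in by the caller.

```python
def speak_with_temporary_clone(client, sample_path, text):
    """Clone a voice, synthesize one utterance, then always delete the clone."""
    # Close the sample file as soon as the upload call returns.
    with open(sample_path, "rb") as sample:
        voice = client.tts.voices.create(
            name="Temporary Voice",
            sample=sample,
            consent=True,
        )
    try:
        return client.tts.audio.speech(
            input=text,
            voice_id=voice.id,
            audio_format="mp3",
        )
    finally:
        # Runs whether or not the speech call raised an exception.
        client.tts.voices.delete(voice.id)
```

The `try`/`finally` keeps one-off clones from accumulating in the voices list. For a voice you plan to reuse, skip the delete and store `voice.id` instead.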

Console UI

You can also clone voices through the Speechify Console without writing code:

  • Upload a sample: import an existing audio file containing the voice
  • Record a sample: record directly from your browser

Sample recording tips

If you need text to read while recording, try this:

Listening is like riding a storytelling rollercoaster, where you can lean back and enjoy the ride without having to steer. The speaker’s voice becomes your trusty guide, leading you through twists and turns. It’s like having a personal audiobook adventure just for you! So, buckle up, and let the fun begin!

For best results:

  • Speak naturally at a consistent pace
  • Avoid whispering or shouting
  • Minimize pauses longer than 2 seconds
  • Record in a quiet environment without echo
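The pause guideline can be checked roughly before uploading a recording. The sketch below is a simple illustration, not a Speechify feature: it treats any run of low-amplitude samples as a pause. The threshold of 500 (on a 16-bit scale) is an assumed value that would need tuning for a given microphone, and the loader only accepts mono, 16-bit PCM WAV files.

```python
import array
import sys
import wave


def longest_pause_seconds(samples, sample_rate, threshold=500):
    """Longest run of consecutive samples with |amplitude| < threshold, in seconds."""
    longest = current = 0
    for s in samples:
        if abs(s) < threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest / sample_rate


def load_mono_16bit_wav(path):
    """Read a mono, 16-bit PCM WAV file into (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
        rate = wav.getframerate()
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples, rate
```

A result above 2.0 from `longest_pause_seconds(*load_mono_16bit_wav(path))` suggests trimming the silence or re-recording.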