Instant Voice Cloning
Create custom voices from short audio samples for your text-to-speech applications
Overview
Instant voice cloning is one of Speechify API’s core features. Our system lets you create a voice cloned from a short audio sample and use this cloned voice for speech synthesis requests. There is no limit to the number of voices you can clone.
Voice Sample Requirements
- Duration: 10-30 seconds recommended; keep the sample under one minute
- File size: Under 5 MB for pre-recorded samples
- Quality: Clear speech with minimal background noise
- Language: Must be in one of the supported languages
The cloned voice will capture many aspects of the sample, including accent, speaking style, and tone. Using a good microphone is recommended for optimal results.
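The duration and file-size limits above can be checked locally before uploading. The following is a minimal sketch, not part of the Speechify API: the function name `check_voice_sample` is hypothetical, it handles WAV files only (it relies on Python's standard `wave` module), and it assumes "5MB" means 5 × 1024 × 1024 bytes.

```python
import os
import wave

MAX_BYTES = 5 * 1024 * 1024          # assumption: 1 MB = 1024 * 1024 bytes
RECOMMENDED_SECONDS = (10.0, 30.0)   # recommended range from the requirements
HARD_MAX_SECONDS = 60.0              # "keep under one minute"


def check_voice_sample(path: str) -> list:
    """Return a list of problems found; an empty list means the sample passes."""
    problems = []

    # File size check.
    size = os.path.getsize(path)
    if size >= MAX_BYTES:
        problems.append(f"file is {size} bytes; expected under {MAX_BYTES}")

    # Duration check: number of frames divided by frames per second.
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()

    low, high = RECOMMENDED_SECONDS
    if duration >= HARD_MAX_SECONDS:
        problems.append(f"sample is {duration:.1f}s; keep it under one minute")
    elif not (low <= duration <= high):
        problems.append(f"sample is {duration:.1f}s; 10-30 seconds is recommended")

    return problems
```

Audio quality and language cannot be verified this way, so it is still worth listening to the sample before submitting it.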
If you need sample text for recording, try reading this:
Listening is like riding a storytelling rollercoaster, where you can lean back and enjoy the ride without having to steer.
The speaker’s voice becomes your trusty guide, leading you through twists and turns. It’s like having a personal audiobook adventure just for you!
So, buckle up, and let the fun begin!
API Implementation
Create a cloned voice
Send a POST request to https://api.sws.speechify.com/v1/voices with the audio data, voice name, and consent confirmation.
You must provide consent that the voice belongs to you or to someone you represent.
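As an illustration, the request could be assembled like this using only the Python standard library. This is a sketch under stated assumptions: the endpoint URL comes from the text above, but the multipart form field names (`name`, `sample`, `consent`), the shape of the consent object, and the `Authorization: Bearer` header are assumptions that should be checked against the API reference. The helper name `build_create_voice_request` is hypothetical.

```python
import json
import uuid
import urllib.request

VOICES_URL = "https://api.sws.speechify.com/v1/voices"  # from the documentation


def build_create_voice_request(api_key, name, audio, filename, consent):
    """Build (but do not send) a multipart POST request for voice creation.

    Field names "name", "sample", and "consent" are assumptions.
    """
    boundary = uuid.uuid4().hex

    def text_part(field, value):
        # One plain-text form field in multipart/form-data encoding.
        return (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="sample"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")

    body = b"".join([
        text_part("name", name),
        text_part("consent", json.dumps(consent)),  # assumed to be sent as JSON
        file_header,
        audio,
        b"\r\n",
        f"--{boundary}--\r\n".encode("utf-8"),
    ])

    return urllib.request.Request(
        VOICES_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body as JSON. Building the request separately from sending it makes the payload easy to inspect before any audio leaves your machine.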
Use the cloned voice
The voice will appear in the voices list alongside shared voices provided by Speechify.
Use this voice’s ID as the voice_id parameter in speech generation endpoints.
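A speech generation request with a cloned voice might look like the sketch below. Only the `voice_id` parameter name comes from this page; the endpoint path `/v1/audio/speech`, the `input` field for the text, and the bearer-token header are assumptions, and `build_speech_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed endpoint path; confirm it in the API reference.
SPEECH_URL = "https://api.sws.speechify.com/v1/audio/speech"


def build_speech_request(api_key, text, voice_id):
    """Build (but do not send) a JSON POST request for speech synthesis."""
    payload = {
        "input": text,         # assumed field name for the text to speak
        "voice_id": voice_id,  # ID of the cloned voice, as described above
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```

The same `voice_id` value works whether the voice is a cloned one or one of the shared voices, since both appear in the same voices list.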
Using the Speechify Console
You can clone voices directly through the Speechify Console UI. The console provides two options:
- Import an existing audio file containing the voice sample.
- Record a new voice sample directly from your browser.
Once created, the voice will appear in your voices list, allowing you to test it immediately.