Language Support
Speechify Text-to-Speech models currently support speech synthesis for the following input languages:
- English (en)
- French (fr-FR)
- German (de-DE)
- Spanish (es-ES)
- Portuguese (pt-BR)
- Portuguese (pt-PT)
The following languages are currently live in beta (we're actively improving them and welcome feedback):
- Arabic (ar-AE)
- Danish (da-DK)
- Dutch (nl-NL)
- Estonian (et-EE)
- Finnish (fi-FI)
- Greek (el-GR)
- Hebrew (he-IL)
- Hindi (hi-IN)
- Italian (it-IT)
- Japanese (ja-JP)
- Norwegian (nb-NO)
- Polish (pl-PL)
- Russian (ru-RU)
- Swedish (sv-SE)
- Turkish (tr-TR)
- Ukrainian (uk-UA)
- Vietnamese (vi-VN)
We will soon have these languages live as well:
- Belarusian (be-BY)
- Bengali (bn-IN)
- Bulgarian (bg-BG)
- Cantonese (zh-HK)
- Catalan (ca-ES)
- Croatian (hr-HR)
- Czech (cs-CZ)
- Filipino (fil-PH)
- Georgian (ka-GE)
- Gujarati (gu-IN)
- Hungarian (hu-HU)
- Indonesian (id-ID)
- Korean (ko-KR)
- Malay (ms-MY)
- Mandarin (zh-CN)
- Marathi (mr-IN)
- Nepali (ne-NP)
- Persian (fa-IR)
- Romanian (ro-RO)
- Serbian (sr-RS)
- Slovak (sk-SK)
- Tamil (ta-IN)
- Telugu (te-IN)
- Thai (th-TH)
- Urdu (ur-PK)
We're actively working on expanding the list, and will update this document as new languages are added to the platform.
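The fully supported and beta lists above can be captured in a small lookup, which is handy for warning users before they submit text in a language that is not live yet. This is only a sketch: `language_status` is a hypothetical helper, not part of any Speechify SDK, the sets are a snapshot of this page, and treating regional English variants such as en-US as covered by the `en` entry is an assumption.

```python
# Locale codes copied from the lists above. This is a snapshot and will
# drift as languages are added or promoted out of beta.
FULLY_SUPPORTED = {"en", "fr-FR", "de-DE", "es-ES", "pt-BR", "pt-PT"}
BETA = {
    "ar-AE", "da-DK", "nl-NL", "et-EE", "fi-FI", "el-GR", "he-IL", "hi-IN",
    "it-IT", "ja-JP", "nb-NO", "pl-PL", "ru-RU", "sv-SE", "tr-TR", "uk-UA",
    "vi-VN",
}


def language_status(locale: str) -> str:
    """Return 'supported', 'beta', or 'unavailable' for a locale code.

    Hypothetical helper. A code matches either exactly or through its base
    language when the list carries only the base (assumed for 'en').
    """
    base = locale.split("-")[0]
    if locale in FULLY_SUPPORTED or base in FULLY_SUPPORTED:
        return "supported"
    if locale in BETA or base in BETA:
        return "beta"
    return "unavailable"
```

For example, `language_status("ja-JP")` returns `"beta"`, while a code that is not on either list returns `"unavailable"`.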
Speechify can handle both text written in a single language and mixed-language input.
API support for the language param
Our speech synthesis endpoints (/v1/audio/speech and /v1/audio/stream) support the optional language parameter, which, if specified, should follow the locale naming standard, e.g. en-US or fr-FR.
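A client may want to catch obviously malformed values before sending a request. The sketch below checks only the shape of the value (a lowercase language code, a hyphen, an uppercase region code); `is_locale_formatted` is a hypothetical helper, and the exact validation the API itself performs is not described here, so treat the pattern as an assumption.

```python
import re

# Two- or three-letter lowercase language code, a hyphen, then a two-letter
# uppercase region code, e.g. en-US, fr-FR, fil-PH. Shape check only.
LOCALE_PATTERN = re.compile(r"[a-z]{2,3}-[A-Z]{2}")


def is_locale_formatted(value: str) -> bool:
    """Return True if value looks like a language-REGION locale code."""
    return LOCALE_PATTERN.fullmatch(value) is not None
```

A value such as `en_US` or `english` fails this check, which gives the caller a chance to correct it or to drop the parameter.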
The recommendation for using this parameter depends on whether you know the input language for certain.
If you know the input language and the entire text is written in that language, you should generally provide the language parameter, as it results in better audio quality.
If, on the contrary, you are not sure of the input language (e.g. the text comes from a user submission), or the text mixes languages (e.g. a book full of foreign quotes), then we recommend omitting the language param and letting the Speechify models infer the language(s) from the input itself.
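The guidance above comes down to one rule: include language only when it is known and the text is in a single language, otherwise leave it out entirely. A minimal sketch of that rule follows. `build_speech_payload` is a hypothetical helper, and the `input` and `voice_id` field names are assumptions made for illustration; only the `language` parameter is described in this document.

```python
from typing import Optional


def build_speech_payload(
    text: str, voice_id: str, language: Optional[str] = None
) -> dict:
    """Build a request body for a speech synthesis call.

    The 'input' and 'voice_id' keys are assumed names. The 'language' key
    is added only when the caller knows the language for certain.
    """
    payload = {"input": text, "voice_id": voice_id}
    if language is not None:
        # Known, single-language text: pass the locale for better quality.
        payload["language"] = language
    # Unknown or mixed-language text: omit the key so the models can infer
    # the language(s) from the input itself.
    return payload
```

Calling `build_speech_payload("Bonjour", "some-voice", "fr-FR")` includes the locale, while calling it without the third argument produces a body with no language key at all, rather than an empty or placeholder value.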
Voice Cloning
There are no restrictions on the language of the voice sample used for voice cloning. Speechify should be able to produce a high-quality cloned voice from a short sample (we recommend about 1 minute of speech), and you can then use that voice to synthesize speech from input in any supported language.