API: Streaming endpoint no longer returns WAV audio

Streaming audio is usually a latency-sensitive operation, and the WAV format is not naturally suited to streaming: its file header records the total size of the audio data up front, which is not known while the audio is still being generated. We have therefore decided to remove the WAV audio format from the streaming endpoint. This will allow us to focus on the more popular and streaming-friendly audio formats, such as MP3, OGG, and AAC.
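To illustrate the header issue, here is a minimal sketch that builds the canonical 44-byte PCM WAV header with Python's standard struct module. The helper name wav_header and the default sample rate, channel count, and bit depth are illustrative assumptions, not values used by the Speechify AI API. The point is that two header fields depend on data_size, the total length of the audio payload.

```python
import struct


def wav_header(data_size: int, sample_rate: int = 24000,
               channels: int = 1, bits_per_sample: int = 16) -> bytes:
    """Build a canonical 44-byte PCM WAV header (hypothetical helper).

    The audio parameters are placeholder defaults chosen for illustration.
    """
    block_align = channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF",
        36 + data_size,   # RIFF chunk size: depends on the total audio length
        b"WAVE",
        b"fmt ",
        16,               # size of the fmt chunk for plain PCM
        1,                # audio format 1 = uncompressed PCM
        channels,
        sample_rate,
        byte_rate,
        block_align,
        bits_per_sample,
        b"data",
        data_size,        # data chunk size: also depends on the total length
    )


# One second of 16-bit mono audio at 24 kHz is 48000 bytes.
header = wav_header(data_size=48000)
print(len(header))
```

Because both size fields come before the first audio sample, a server that streams audio as it is synthesized would have to either buffer the full result or send placeholder sizes, which is why frame-based formats such as MP3 are a better fit here.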

API: loudness normalization option

By default, the Speechify AI API normalizes the loudness of the synthesized audio across different models and voices. While this can be a valuable feature for multi-voice apps, it inevitably adds a slight delay to the audio generation process.

To give you control over this trade-off, we’re introducing a new options parameter for the /v1/audio/speech and /v1/audio/stream APIs. It currently has a single nested property, loudness_normalization (boolean).

The options parameter may be expanded in the future to allow more fine-grained control over the audio generation process.
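As a rough sketch, a request body that turns normalization off could be assembled as shown here. Only the options object and its loudness_normalization property come from this changelog entry; the helper name build_speech_request and the input and voice_id fields are assumptions made for illustration, so check the API reference for the exact request schema.

```python
import json


def build_speech_request(text: str, voice_id: str,
                         loudness_normalization: bool = True) -> dict:
    """Assemble a JSON body for /v1/audio/speech or /v1/audio/stream.

    Hypothetical helper: `input` and `voice_id` are assumed field names.
    """
    return {
        "input": text,          # assumed field name
        "voice_id": voice_id,   # assumed field name
        # Described in this changelog: a nested boolean under `options`.
        "options": {"loudness_normalization": loudness_normalization},
    }


# Skip normalization to avoid the extra delay it adds.
payload = build_speech_request("Hello, world!", "example-voice",
                               loudness_normalization=False)
print(json.dumps(payload, indent=2))
```

Leaving the flag at its default keeps the existing behavior, which is likely the better choice for apps that mix several voices and need consistent volume.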

UI: Ongoing redesign

As our product kept evolving, we realized that the Speechify AI API dashboard needed a more consistent design. We have been working on a redesign of the user interface that brings a cleaner, more modern look along with better accessibility and usability. Please let us know if you have any feedback or suggestions for further improvements.

We have finished redesigning the major parts of the dashboard: the navigation menu, the text-to-speech page, the voice cloning page (both previously part of the single Playground), and the API key management page. We are currently working on the remaining parts of the dashboard, such as the billing section.