On the Artificial Analysis leaderboard, the model reaches an Elo score of 1,211 and is rated as offering a particularly strong price-to-quality ratio. In overall quality, it ranks ahead of ElevenLabs v3 and just behind Inworld 1.5 Max.

Gemini 3.1 Flash TTS includes a free tier, in which Google may use submitted data for product improvement. In the paid tier, data is not used for product improvement; text input costs $1.00 per million tokens and audio output costs $20.00 per million tokens. In batch mode, pricing drops to $0.50 per million tokens for text input and $10.00 per million tokens for audio output.
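To make the pricing concrete, the following is a minimal sketch of the cost arithmetic using only the per-million-token prices quoted above. The function name `estimate_cost`, the `PRICES` table, and the token counts in the usage example are hypothetical illustrations, not part of any Google SDK, and the article does not say how many audio tokens correspond to a second of speech.

```python
# Paid-tier prices in USD per million tokens, as quoted in the article.
# The dictionary layout and key names are assumptions made for this sketch.
PRICES = {
    "standard": {"text_in": 1.00, "audio_out": 20.00},
    "batch": {"text_in": 0.50, "audio_out": 10.00},
}


def estimate_cost(text_tokens: int, audio_tokens: int, batch: bool = False) -> float:
    """Return the estimated cost in USD for one workload.

    text_tokens  -- number of text input tokens
    audio_tokens -- number of audio output tokens
    batch        -- True to apply the batch-mode prices
    """
    if text_tokens < 0 or audio_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = PRICES["batch" if batch else "standard"]
    total = text_tokens * tier["text_in"] + audio_tokens * tier["audio_out"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens producing 50,000 audio tokens.
    print(f"standard: ${estimate_cost(2_000, 50_000):.4f}")
    print(f"batch:    ${estimate_cost(2_000, 50_000, batch=True):.4f}")
```

At these prices audio output dominates the bill: in the hypothetical workload above, the text input contributes a fraction of a cent, while the audio output accounts for almost the entire total. Batch mode halves both components.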

Gemini 3.1 Flash TTS is available now in preview through the Gemini API, Vertex AI for enterprise customers, and Google Vids for Workspace users. It can also be tested for free in Google AI Studio. All generated audio files are marked with Google’s SynthID watermark to help identify AI-generated content.

Google is positioning Gemini 3.1 Flash TTS as a strong developer-focused alternative in the AI voice market by combining expressive output, controllable speech parameters, and aggressive pricing. Its multilingual support, multi-speaker capability, and built-in watermarking make it especially relevant for scalable enterprise and media workflows.