Date: 2026-03-12

We're thrilled to unveil Voxtral Transcribe 2, a pair of speech-to-text models that combine high transcription quality, speaker identification, and fast response times. The headline addition is Voxtral Realtime, a model designed specifically for live transcription, with latency that can go below 200 ms. That opens up new possibilities for voice-first applications. And the best part? It's available as open source under the Apache 2.0 license, giving developers the freedom to innovate.

Let's dive into the highlights. Voxtral Mini Transcribe V2 offers state-of-the-art transcription with speaker identification, context-aware spelling, and precise word timestamps across 13 languages. Meanwhile, Voxtral Realtime targets live applications, delivering transcriptions with minimal delay for real-time voice experiences.

It gets better: Voxtral Mini Transcribe V2 achieves industry-leading accuracy at a very low cost, setting a new standard for price-performance in transcription APIs. And because Voxtral Realtime is open source, it can be deployed on edge devices, keeping audio local for privacy-sensitive applications.

Voxtral Realtime's performance is exceptional. On the FLEURS transcription benchmark, at a delay of 2.4 seconds, its word error rate matches that of offline models, and it can still be run at much lower latency for live use. Its multilingual capabilities are equally impressive: it supports 13 languages and delivers strong transcription performance across them.
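
For readers new to the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. It is a ratio, usually quoted as a percentage. Here is a minimal Python sketch, assuming plain lowercasing and whitespace tokenization. Benchmark evaluations typically apply more careful text normalization, and this is not the evaluation code behind any figure in this post.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercase + whitespace split stands in for real normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {word_error_rate(ref, hyp):.1%}")
```

One substituted word out of six reference words gives a WER of about 16.7% in this toy case.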

Voxtral Mini Transcribe V2 introduces a range of powerful features. Speaker diarization labels who is speaking when, making it ideal for meeting transcription and interview analysis. Context biasing helps the model spell names, technical terms, and domain-specific vocabulary correctly; it is optimized for English, with experimental support for other languages. Word-level timestamps enable applications like subtitle generation and audio search, while expanded language support and noise robustness round out the release.
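
To make the subtitle use case concrete, here is a minimal Python sketch that turns word-level timestamps and speaker labels into SRT subtitle cues. The `Word` structure and its field names are hypothetical and do not reflect the actual Voxtral API response format. The cue-splitting thresholds (`max_words`, `max_gap`) are arbitrary illustrative defaults.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    # Hypothetical shape for one transcribed word; not the real API schema.
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: Optional[str] = None  # diarization label, if available


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7, max_gap: float = 0.8) -> str:
    """Group words into cues, splitting on length, long pauses, or speaker change."""
    cues: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current:
            last = current[-1]
            too_long = len(current) >= max_words
            long_pause = word.start - last.end > max_gap
            new_speaker = word.speaker != last.speaker
            if too_long or long_pause or new_speaker:
                cues.append(current)
                current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        if cue[0].speaker is not None:
            text = f"[{cue[0].speaker}] {text}"
        timing = f"{srt_timestamp(cue[0].start)} --> {srt_timestamp(cue[-1].end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        Word("Hello", 0.0, 0.4, "A"),
        Word("there", 0.5, 0.9, "A"),
        Word("Hi", 1.0, 1.3, "B"),
    ]
    print(words_to_srt(demo))
```

Splitting on speaker change keeps each cue attributable to a single speaker, which is usually what you want for interviews and meetings.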

The audio playground in Mistral Studio lets you test Voxtral Transcribe 2 instantly, with diarization and timestamps. Upload your audio files, choose your settings, and see the magic unfold.

Voxtral's impact on voice applications is transformative. From meeting intelligence and voice agents to contact center automation and media broadcasting, Voxtral powers a diverse range of industries. With its cost-efficiency and accuracy, Voxtral is set to become the go-to solution for businesses looking to enhance their voice workflows.

One more thing worth noting: Voxtral's models support GDPR- and HIPAA-compliant deployments, helping keep data private and secure.

So, are you ready to unlock the potential of your voice applications? Voxtral Mini Transcribe V2 is available now via API, while Voxtral Realtime is accessible via API and as open-source weights on Hugging Face. Explore the documentation and join the revolution! We can't wait to see what you build.