Giga-AM v2 Is Now Available on EveryScribe — The Best Open-Source Russian ASR Model

What Is Giga-AM v2?

Why a Dedicated Russian Model Matters

Accuracy: 56% Better Than Whisper Large-v3

Architecture: HuBERT-CTC with Dynamic Chunk Sampling

225 MB, MIT License, Runs in Your Browser

When to Choose Giga-AM v2

Get Started

Giga-AM v2 — SberDevices' dedicated Russian speech recognition model — is now available in EveryScribe's Private Transcriber. At just 225 MB, it delivers best-in-class accuracy for Russian transcription, running entirely inside your browser with no audio uploads required.

What Is Giga-AM v2?

Giga-AM is a family of foundational speech recognition models developed by SberDevices, the AI research division of Sber (Russia's largest financial institution and AI company). The v2 release is a major architectural upgrade over the original, using a HuBERT-CTC self-supervised learning framework on top of a Conformer-based encoder with approximately 220–240 million parameters.

The model was pre-trained on 50,000 hours of diverse Russian speech data, covering a wide range of speakers, accents, recording conditions, and domain vocabulary.

Why a Dedicated Russian Model Matters

General-purpose multilingual models like Whisper do support Russian — but they weren't built specifically for it. Russian presents unique ASR challenges:

Rich morphology: Russian has six grammatical cases, complex agreement patterns, and highly variable word forms. A model not specifically trained on Russian text will produce more errors here than in morphologically simpler languages.
Diverse accent landscape: Russian is spoken across eleven time zones. Regional accent variation is significant. Giga-AM was trained on data collected across this full range.
Domain vocabulary: Technical, medical, legal, and financial Russian has specialized vocabulary that general models handle poorly. Giga-AM's 50,000-hour training corpus includes domain-specific recordings.

Accuracy: 56% Better Than Whisper Large-v3

The headline benchmark: Giga-AM's strongest variants reduce Word Error Rate (WER) by 56% relative to Whisper Large-v3 on Russian speech.

For the CTC variant we use on EveryScribe, v2 represents a 15% WER reduction over the previous Giga-AM v1 — a significant improvement in an already strong model.
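To make the percentages above concrete, here is a minimal sketch of how WER and a relative WER reduction are usually computed. It assumes the 56% and 15% figures are relative reductions, which the article does not state outright. The function names and the sample WER values are hypothetical and are not benchmark results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# One substituted word out of three reference words.
print(word_error_rate("мама мыла раму", "мама мыла рамы"))

# Illustrative numbers only: a baseline WER of 25% and a new WER of 11%
# would correspond to a 56% relative reduction.
print(relative_wer_reduction(0.25, 0.11))
```

Under this reading, a 56% improvement means the error rate is cut to a little under half of the baseline. It does not mean the error rate drops by 56 percentage points.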

The training approach, HuBERT-CTC self-supervised learning on Russian-specific data, lets the model devote its full capacity to Russian phonetics, whereas a multilingual model has to share that capacity across dozens of languages.

Architecture: HuBERT-CTC with Dynamic Chunk Sampling

Giga-AM v2 introduces a novel training approach: dynamic chunk size sampling within a single pre-training run. This allows the same pre-trained model to be fine-tuned for both:

Full-context ASR — for maximum accuracy on pre-recorded audio
Streaming ASR — for lower-latency real-time transcription
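The idea of sampling a chunk size during training can be sketched with a chunked attention mask. This is a generic illustration of the technique and not SberDevices' implementation: the function names, the candidate chunk sizes, the full-context probability, and the choice of unlimited left context are all assumptions made for the example.

```python
import random


def sample_chunk_size(choices, full_context_prob, rng):
    """Return a chunk size in frames, or None to mean full context.

    Hypothetical sampling rule: with probability full_context_prob the
    training step uses full context, otherwise one of the chunk sizes.
    """
    if rng.random() < full_context_prob:
        return None
    return rng.choice(choices)


def chunk_attention_mask(num_frames, chunk_size):
    """Build mask[i][j], True when frame i may attend to frame j.

    Assumption: each frame sees all earlier frames plus the rest of its
    own chunk, and nothing beyond the end of that chunk.
    """
    if chunk_size is None:
        return [[True] * num_frames for _ in range(num_frames)]
    mask = []
    for i in range(num_frames):
        visible_end = min((i // chunk_size + 1) * chunk_size, num_frames)
        mask.append([j < visible_end for j in range(num_frames)])
    return mask


rng = random.Random(0)
chunk = sample_chunk_size(choices=[2, 4, 8], full_context_prob=0.5, rng=rng)
for row in chunk_attention_mask(num_frames=6, chunk_size=chunk):
    print("".join("x" if allowed else "." for allowed in row))
```

Because the encoder sees both restricted and unrestricted masks during training, the same weights can later be run with a small chunk for streaming or with full context for offline files.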

On EveryScribe, we use the full-context CTC variant, which gives the highest accuracy for offline files and recordings.
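For readers unfamiliar with CTC, the sketch below shows greedy CTC decoding, the simplest way to turn per-frame scores into text: take the best token per frame, collapse consecutive repeats, then drop the blank token. The three-token vocabulary and the blank index of 0 are assumptions for illustration. The real model's vocabulary, blank position, and decoding method may differ.

```python
def ctc_greedy_decode(frame_scores, vocabulary, blank_id=0):
    """Greedy CTC decoding over a list of per-frame score lists.

    blank_id=0 is an assumption for this sketch, not a property of
    the Giga-AM export.
    """
    # Pick the highest-scoring token index for every frame.
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    decoded = []
    previous = None
    for token_id in best_ids:
        # Emit a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            decoded.append(vocabulary[token_id])
        previous = token_id
    return "".join(decoded)


# Toy vocabulary (hypothetical): index 0 is the blank token.
vocabulary = ["<blank>", "д", "а"]
frame_scores = [
    [0.1, 0.8, 0.1],  # "д"
    [0.2, 0.7, 0.1],  # "д" again, collapsed with the previous frame
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],  # "а"
    [0.1, 0.2, 0.7],  # "а" again, collapsed
]
print(ctc_greedy_decode(frame_scores, vocabulary))
```

The blank token is what lets CTC represent a genuinely doubled letter: two identical tokens separated by a blank are kept as two characters, while two adjacent identical tokens are merged into one.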

225 MB, MIT License, Runs in Your Browser

The version we ship is a quantized INT8 ONNX export of Giga-AM v2, reduced to 225 MB — the smallest dedicated-language model in our lineup. It runs via WebAssembly in your browser, with no server infrastructure involved. Your audio never leaves your device.
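A quick back-of-the-envelope check shows why INT8 quantization lands near 225 MB. The sketch assumes 230 million parameters (the midpoint of the 220 to 240 million range quoted earlier), one byte per INT8 weight, 1 MB = 1,000,000 bytes, and 32-bit floats for the unquantized weights. It ignores file metadata and any tensors left unquantized, so it is an estimate and not a description of the actual export.

```python
def estimated_weight_size_mb(num_parameters: int, bits_per_weight: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1_000_000


# Assumed parameter count: midpoint of the article's 220-240 million range.
assumed_parameters = 230_000_000

print(estimated_weight_size_mb(assumed_parameters, bits_per_weight=32))  # FP32 estimate
print(estimated_weight_size_mb(assumed_parameters, bits_per_weight=8))   # INT8 estimate
```

Under these assumptions, INT8 storage is roughly a quarter of the FP32 size, which is consistent with a download of about 225 MB.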

Giga-AM v2 is released under the MIT license, meaning it's free to use for any purpose, including commercial applications.

When to Choose Giga-AM v2

This is the model for you if:

Your audio is primarily or entirely in Russian
You need the highest possible accuracy for Russian transcription
You're transcribing professional, technical, or domain-specific Russian content
You're working with speakers from diverse regional backgrounds in Russia or the post-Soviet world
You want a compact, fast model (225 MB) without sacrificing Russian-specific accuracy

If you need Russian alongside 24 other European languages, Parakeet TDT 0.6B is also a strong option.

Get Started

Go to everyscribe.com/dashboard/offline-transcriber, select Giga-AM v2 from the ASR model dropdown, download it once, and transcribe Russian audio with state-of-the-art accuracy — all locally, all privately.

Giga-AM v2 is open source: the code is available on GitHub and the model weights are on Hugging Face. We thank the SberDevices team for their commitment to releasing world-class Russian ASR technology under an open license.
