Parakeet TDT 0.6B — NVIDIA's state-of-the-art ASR model from the NeMo team — is now available in EveryScribe's Private Transcriber. It's the highest-accuracy model we offer for English and European languages, and it runs entirely inside your browser with no data leaving your device.
What Is Parakeet TDT 0.6B?
Parakeet is a family of ASR models developed by NVIDIA's NeMo research team; the 0.6B models have roughly 600 million parameters. The "TDT" in the name stands for Token-and-Duration Transducer, a decoder architecture that jointly predicts each output token and its duration (the number of audio frames it spans). Because the decoder can jump ahead by the predicted duration instead of stepping through every frame, it skips frames during inference and decodes significantly faster than a standard transducer.
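To make the frame-skipping idea concrete, here is a toy sketch of greedy TDT-style decoding in Python. It is not NeMo's implementation: the `joint` callback, the blank id, the stall guard, and the hand-written prediction schedule are all assumptions made up for illustration. The only point is that the time index advances by the predicted duration rather than by one frame per step.

```python
from typing import Callable, List, Tuple

BLANK = 0  # assumed id of the blank token in this toy example


def tdt_greedy_decode(
    num_frames: int,
    joint: Callable[[int, List[int]], Tuple[int, int]],
    max_symbols_per_frame: int = 10,
) -> Tuple[List[int], int]:
    """Toy greedy decoder.

    `joint(t, history)` is a hypothetical stand-in for the joint network:
    it returns (token_id, duration_in_frames) for encoder frame t.
    Returns the emitted tokens and the number of decoding steps taken.
    """
    tokens: List[int] = []
    t = 0
    steps = 0
    symbols_here = 0  # non-blank tokens emitted without advancing time
    while t < num_frames:
        token, duration = joint(t, tokens)
        steps += 1
        if token != BLANK:
            tokens.append(token)
            symbols_here += 1
        # Stall guard: a blank with duration 0, or too many tokens on one
        # frame, would loop forever, so force the decoder forward one frame.
        if duration == 0 and (token == BLANK or symbols_here >= max_symbols_per_frame):
            duration = 1
        if duration > 0:
            symbols_here = 0
        t += duration  # the key TDT step: jump ahead by the predicted duration
    return tokens, steps


def fake_joint(t: int, history: List[int]) -> Tuple[int, int]:
    # Hand-written predictions for a 20-frame utterance (illustrative only).
    schedule = {0: (5, 4), 4: (BLANK, 4), 8: (7, 4), 12: (BLANK, 4), 16: (BLANK, 4)}
    return schedule.get(t, (BLANK, 1))


tokens, steps = tdt_greedy_decode(20, fake_joint)
print(tokens, steps)  # [5, 7] 5
```

In this toy run the decoder covers 20 frames in 5 steps, whereas a decoder that advances one frame at a time would need at least 20.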
The variant we ship — Parakeet TDT 0.6B v3 — is the multilingual version, supporting 25 European languages including English, with strong performance across all of them.
Benchmark Performance
Parakeet TDT 0.6B v3 sets a high bar for accuracy on European language ASR:
| Benchmark | Metric | Score |
|---|---|---|
| LibriSpeech test-clean | WER | 1.93% |
| LibriSpeech test-other | WER | 3.59% |
| MLS Average | WER | 7.83% |
| FLEURS Average | WER | 11.97% |
| 10 dB SNR (noisy conditions) | WER | ~7.12% |
A 1.93% WER on LibriSpeech test-clean, roughly two errors per 100 words, places it among the most accurate publicly available ASR models on that benchmark.
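For readers unfamiliar with the metric, WER (word error rate) is the number of word substitutions, deletions, and insertions needed to turn the model's output into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance. It assumes plain whitespace tokenization; published benchmark scores usually apply text normalization first, which is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the quick brown fox jumps", "the quick brown box jumps")
print(f"{wer:.1%}")  # 20.0%
```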
25 European Languages with Automatic Detection
Parakeet v3 supports 25 European languages out of the box:
Bulgarian · Croatian · Czech · Danish · Dutch · English · Estonian · Finnish · French · German · Greek · Hungarian · Italian · Latvian · Lithuanian · Maltese · Polish · Portuguese · Romanian · Russian · Slovak · Slovenian · Spanish · Swedish · Ukrainian
Language detection is automatic — you don't need to specify which language your audio is in. Parakeet identifies it internally and adjusts its output accordingly.
What Sets Parakeet Apart
Native punctuation and capitalization. Most ASR models output raw, unpunctuated text that requires a separate post-processing step to become readable. Parakeet produces properly formatted output — punctuation, capitalization, and all — directly from the acoustic model. No extra step needed.
Word-level timestamps. Parakeet provides accurate timestamps at the word, character, and segment level. This is essential for video captioning, legal transcription, and any workflow where you need to know exactly when something was said.
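As an illustration of what word-level timestamps make possible, the sketch below groups timed words into SRT caption cues. The input shape (a list of dicts with `word`, `start`, and `end` in seconds) and the grouping thresholds are hypothetical choices for this example, not a documented EveryScribe or NeMo export format.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7, max_gap: float = 0.8) -> str:
    """Group words into cues, starting a new cue after a long pause
    or once the current cue reaches max_words."""
    cues: List[List[Word]] = []
    current: List[Word] = []
    for w in words:
        if current and (
            len(current) >= max_words or w["start"] - current[-1]["end"] > max_gap
        ):
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(str(w["word"]) for w in cue)
        span = f"{srt_time(cue[0]['start'])} --> {srt_time(cue[-1]['end'])}"
        blocks.append(f"{index}\n{span}\n{text}")
    return "\n\n".join(blocks) + "\n"


example_words = [
    {"word": "Hello,", "start": 0.0, "end": 0.4},
    {"word": "world.", "start": 0.5, "end": 0.9},
    {"word": "Next", "start": 2.5, "end": 2.8},
]
srt_text = words_to_srt(example_words)
print(srt_text)
```

The 1.6-second pause before "Next" exceeds `max_gap`, so the example produces two cues.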
Noise robustness. The model was evaluated at multiple signal-to-noise ratios. At 10 dB SNR, the kind of background noise you'd encounter in a café or open office, it still achieves ~7.12% WER. At 5 dB SNR (loud noise), WER rises to ~8.23%.
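For context on what those SNR figures mean: SNR in decibels is 10·log10(signal power / noise power), so 10 dB means the speech carries ten times the power of the noise, and 5 dB means only about 3.2 times. The sketch below scales a noise track so that a mix hits a target SNR. It uses that standard definition only; how the noisy evaluation sets behind the numbers above were built is not described here, and the sine "speech" and uniform noise are stand-ins.

```python
import math
import random
from typing import List, Sequence, Tuple


def power(samples: Sequence[float]) -> float:
    """Mean squared amplitude."""
    return sum(v * v for v in samples) / len(samples)


def snr_db(signal: Sequence[float], noise: Sequence[float]) -> float:
    return 10 * math.log10(power(signal) / power(noise))


def mix_at_snr(
    signal: Sequence[float], noise: Sequence[float], target_snr_db: float
) -> Tuple[List[float], List[float]]:
    """Return (mixture, scaled_noise) where scaled_noise sits at target_snr_db
    relative to signal."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    # Solve power(signal) / (gain^2 * power(noise)) = 10^(snr/10) for gain.
    gain = math.sqrt(power(signal) / (power(noise) * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled)]
    return mixture, scaled


rng = random.Random(0)
signal = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(1600)]
mixture, scaled_noise = mix_at_snr(signal, noise, target_snr_db=10.0)
print(round(snr_db(signal, scaled_noise), 6))  # 10.0
```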
FastConformer encoder. Parakeet uses an optimized Conformer architecture with 8× downsampling built from depthwise-separable convolutions. This shortens the sequence the encoder has to process, which reduces the compute required per second of audio without sacrificing accuracy.
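A quick back-of-the-envelope calculation shows why the downsampling factor matters. The sketch assumes a 10 ms feature hop, a common choice for speech front ends; that value is an assumption here, not a figure from this post. The 4× case is included for comparison because the original Conformer uses 4× subsampling.

```python
def encoder_frames(audio_seconds: float, hop_ms: float = 10.0, downsample: int = 8) -> int:
    """Number of frames the encoder layers see after downsampling.

    hop_ms is the assumed spacing between input feature frames.
    """
    feature_frames = int(audio_seconds * 1000 / hop_ms)
    return feature_frames // downsample


frames_8x = encoder_frames(60.0, downsample=8)
frames_4x = encoder_frames(60.0, downsample=4)
print(frames_8x, frames_4x)  # 750 1500

# Self-attention compares every frame with every other frame, so its cost
# grows with the square of the sequence length.
attention_ratio = frames_4x**2 / frames_8x**2
print(attention_ratio)  # 4.0
```

Under these assumptions, one minute of audio becomes 750 encoder frames (one per 80 ms) instead of 1,500, and the attention layers do about a quarter of the pairwise work.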
CC-BY-4.0 license. Both v2 and v3 are released under Creative Commons Attribution 4.0, which permits commercial use as long as attribution is given.
640 MB, Runs in Your Browser
The model we ship is a quantized INT8 ONNX version of Parakeet TDT 0.6B v3, totaling 640 MB. It's one of the largest models in our lineup (only Omnilingual and FunASR Nano are larger), which reflects its parameter count and the breadth of language coverage.
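The download size follows roughly from the parameter count and the precision of each weight. The sketch below is simple arithmetic using a round 600 million parameters (the "0.6B" in the name); it is an estimate of raw weight storage, not a measurement of the shipped files.

```python
def weights_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate raw weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bits_per_weight / 8 / 1_000_000


PARAMS = 600_000_000  # rounded from "0.6B"; the exact count may differ

for label, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8)]:
    print(f"{label}: {weights_size_mb(PARAMS, bits):.0f} MB")
# FP32: 2400 MB
# FP16: 1200 MB
# INT8: 600 MB
```

The INT8 estimate lands close to the 640 MB download. The gap could plausibly come from a parameter count slightly above the rounded figure, tensors kept at higher precision, or quantization metadata, though none of that is confirmed here.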
Like all models in EveryScribe's Private Transcriber, it runs via WebAssembly in your browser. After the one-time download, transcription happens locally — no audio ever leaves your device.
When to Choose Parakeet TDT 0.6B
Parakeet is the model for you if:
- You need the highest accuracy for English transcription
- Your audio is in a European language — Spanish, French, German, Italian, Polish, Portuguese, etc.
- You're doing professional transcription where punctuation and timestamps matter
- You're working with noisy audio from meetings, events, or outdoor recordings
- You need word-level timestamps for caption or subtitle work
For East Asian languages, SenseVoice is the better choice. For coverage of 1,600+ languages including rare ones, see Omnilingual.
Get Started
Go to everyscribe.com/dashboard/offline-transcriber, select Parakeet TDT 0.6B from the ASR model dropdown, download it once, and start transcribing with top-tier European language accuracy — all without leaving your browser.
Parakeet is developed by NVIDIA's NeMo team. The source code is available on GitHub and the model weights on Hugging Face. We're grateful to NVIDIA for releasing it under a license that allows open, privacy-preserving deployments like EveryScribe's.
