
Get started

calibrate stt
The interactive UI guides you through the full evaluation process:
  1. Language selection — pick from 10+ supported Indic languages
  2. Provider selection — choose providers (only those supporting your language are shown)
  3. Input directory — path to the directory containing your audio files and reference transcripts
The input directory should have this structure:
/path/to/data/
├── stt.csv
└── audios/
    ├── audio_1.wav
    └── audio_2.wav
The stt.csv file contains the reference transcriptions:
id       text
audio_1  Hi
audio_2  Madam, my name is Geeta Shankar
All audio files should be in WAV format. The evaluation script expects files at audios/<id>.wav where <id> matches the id column in your CSV.
Refer to the sample dataset for a template.
  4. Output directory — where results will be saved (defaults to ./out)
  5. API keys — enter the API keys for the selected providers
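Before starting a run, it can help to confirm that the input directory matches the layout described in step 3. The following is a minimal sketch, not part of calibrate itself: the function name find_dataset_problems is hypothetical, and it assumes stt.csv is a comma-separated file with a header row containing id and text columns.

```python
import csv
from pathlib import Path


def find_dataset_problems(data_dir):
    """Return a list of human-readable problems found in an STT input directory.

    Hypothetical helper: assumes <data_dir>/stt.csv has "id" and "text"
    columns and that each row's audio lives at <data_dir>/audios/<id>.wav.
    """
    root = Path(data_dir)
    csv_path = root / "stt.csv"
    if not csv_path.is_file():
        return [f"missing {csv_path}"]

    problems = []
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = {"id", "text"} - set(reader.fieldnames or [])
        if missing:
            return [f"stt.csv is missing column(s): {sorted(missing)}"]
        for row in reader:
            # Each id must have a matching WAV file under audios/.
            wav_path = root / "audios" / f"{row['id']}.wav"
            if not wav_path.is_file():
                problems.append(f"no audio file for id {row['id']!r}: expected {wav_path}")
            # A row without reference text cannot be scored.
            if not (row["text"] or "").strip():
                problems.append(f"empty reference text for id {row['id']!r}")
    return problems
```

Calling find_dataset_problems("/path/to/data") returns an empty list when every row has a matching WAV file. Note that in a comma-separated file, a reference text that itself contains a comma (such as the second sample row) must be wrapped in double quotes.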
The evaluation runs providers in parallel (max 2 at a time), showing the transcriptions as they are generated.
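The "max 2 at a time" behaviour is an instance of bounded concurrency. The sketch below illustrates that general pattern with Python's standard library; it is not calibrate's actual implementation, and run_providers, transcribe, and max_parallel are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_providers(providers, transcribe, max_parallel=2):
    """Call transcribe(provider) for every provider, at most max_parallel at once.

    Illustrative only: `transcribe` stands in for whatever function evaluates
    a single provider and returns its result.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Submit everything up front; the pool itself enforces the limit.
        futures = {pool.submit(transcribe, name): name for name in providers}
        # Report each provider as soon as it finishes, not in submission order.
        for future in as_completed(futures):
            name = futures[future]
            results[name] = future.result()
            print(f"{name}: done")
    return results
```

With three or more providers selected, the third only starts once one of the first two has finished, which keeps API usage and local resource use predictable.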

Output

Once all providers have finished, the UI displays a leaderboard of key metrics, along with bar charts that visualize them.
STT leaderboard
You can also view the generated transcript and metrics for each row of your dataset, including the LLM judge score and reasoning.
STT provider outputs

Learn more about metrics

Detailed explanation of all metrics and why using an LLM Judge is necessary

Resources

Integrations

See the full list of supported providers and their configuration options