Russian Open Speech To Text (STT/ASR) Dataset
Anna Slizhikova and Alexander Veysov and Dmitry Voronin and Yuri Baburov

ru_open_stt_mp3 (25 files)
asr_public_phone_calls_1.csv 35.51MB
asr_public_phone_calls_1_mp3.tar.gz 2.74GB
asr_public_phone_calls_2.csv 83.89MB
asr_public_phone_calls_2_mp3.tar.gz 8.06GB
asr_public_stories_1.csv 6.64MB
asr_public_stories_1_mp3.tar.gz 563.44MB
asr_public_stories_2.csv 10.24MB
asr_public_stories_2_mp3.tar.gz 1.14GB
private_buriy_audiobooks_2.csv 164.23MB
private_buriy_audiobooks_2_mp3.tar.gz 22.07GB
public_lecture_1.csv 925.52kB
public_lecture_1_mp3.tar.gz 93.62MB
public_meta_data_v03.csv 1.47GB
public_series_1.csv 2.71MB
public_series_1_mp3.tar.gz 258.03MB
public_youtube700.csv 104.25MB
public_youtube700_mp3.tar.gz 10.31GB
ru_RU.csv 667.64kB
ru_ru_mp3.tar.gz 230.55MB
russian_single.csv 443.27kB
russian_single_mp3.tar.gz 125.26MB
tts_russian_addresses_rhvoice_4voices.csv 288.19MB
tts_russian_addresses_rhvoice_4voices_mp3.tar.gz 10.65GB
voxforge_ru.csv 957.10kB
voxforge_ru_mp3.tar.gz 251.40MB
Type: Dataset
Tags: Dataset, russian, asr, stt

title= {Russian Open Speech To Text (STT/ASR) Dataset},
journal= {},
author= {Anna Slizhikova and Alexander Veysov and Dmitry Voronin and Yuri Baburov},
year= {},
url= {},
abstract= {v0.4-alpha, added the forgotten txt files

Arguably the largest public Russian STT dataset to date:
- 4.6 million utterances;
- 4000 hours;
- 431 GB (as .wav files in int16 format).

For more information please go here},
keywords= {Dataset, russian, asr, stt},
terms= {},
license= {},
superseded= {}
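
The three figures quoted in the abstract (4.6m utterances, 4000 hours, 431 GB of int16 .wav) can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the dataset tooling: it assumes mono audio and ignores .wav header overhead, neither of which is stated in the listing, and it tries both the decimal (10^9) and binary (2^30) reading of "GB" because the listing does not say which is meant. The function names are illustrative.

```python
# Figures quoted in the abstract of the listing.
UTTERANCES = 4.6e6   # "4.6m utterances"
HOURS = 4000         # "4000 hours"
SIZE_GB = 431        # "431 GB (in .wav format in int16)"

BYTES_PER_SAMPLE = 2   # int16, as stated in the abstract
ASSUMED_CHANNELS = 1   # assumption: mono audio (not stated in the listing)


def average_utterance_seconds(hours, utterances):
    """Mean utterance duration implied by total hours and utterance count."""
    return hours * 3600 / utterances


def implied_sample_rate(size_bytes, hours,
                        bytes_per_sample=BYTES_PER_SAMPLE,
                        channels=ASSUMED_CHANNELS):
    """Samples per second implied by raw size and duration (headers ignored)."""
    return size_bytes / (hours * 3600 * bytes_per_sample * channels)


avg_seconds = average_utterance_seconds(HOURS, UTTERANCES)
rate_if_decimal = implied_sample_rate(SIZE_GB * 10**9, HOURS)  # GB = 10^9 bytes
rate_if_binary = implied_sample_rate(SIZE_GB * 2**30, HOURS)   # GB = 2^30 bytes

print(f"average utterance length: {avg_seconds:.2f} s")
print(f"implied sample rate (decimal GB): {rate_if_decimal:.0f} Hz")
print(f"implied sample rate (binary GB):  {rate_if_binary:.0f} Hz")
```

Under these assumptions the average utterance is a little over 3 seconds, and the implied sample rate comes out at roughly 15.0 kHz or 16.1 kHz depending on the unit convention. Both are in the neighbourhood of 16 kHz mono, so the quoted figures appear mutually consistent, but the actual sample rate should be confirmed from the audio files themselves.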

