Name | DL | Torrents | Total Size |
ru_open_stt_opus (38 files)
manifests/tts_russian_addresses_rhvoice_4voices.csv | 220.26MB |
manifests/radio_v4_manifest.csv | 515.81MB |
manifests/radio_v4_add_manifest.csv | 7.03MB |
manifests/radio_pspeech_sample_manifest.csv | 32.76MB |
manifests/radio_2.csv | 43.04MB |
manifests/public_youtube700_val.csv | 679.15kB |
manifests/public_youtube700.csv | 74.60MB |
manifests/public_youtube1120_hq.csv | 39.34MB |
manifests/public_youtube1120.csv | 141.83MB |
manifests/public_speech_manifest.csv | 132.35MB |
manifests/public_series_1.csv | 1.92MB |
manifests/public_lecture_1.csv | 660.11kB |
manifests/private_buriy_audiobooks_2.csv | 119.40MB |
manifests/buriy_audiobooks_2_val.csv | 744.95kB |
manifests/asr_public_stories_2.csv | 7.19MB |
manifests/asr_public_stories_1.csv | 4.84MB |
manifests/asr_public_phone_calls_2.csv | 60.34MB |
manifests/asr_public_phone_calls_1.csv | 26.39MB |
manifests/asr_calls_2_val.csv | 1.05MB |
archives/tts_russian_addresses_rhvoice_4voices.tar.gz | 13.86GB |
archives/radio_v4_manifest.tar.gz | 189.01GB |
archives/radio_v4_add_manifest.tar.gz | 3.04GB |
archives/radio_pspeech_sample_manifest.tar.gz | 12.27GB |
archives/radio_2.tar.gz | 26.45GB |
archives/public_youtube700_val.tar.gz | 469.33MB |
archives/public_youtube700.tar.gz | 13.09GB |
archives/public_youtube1120_hq.tar.gz | 5.31GB |
archives/public_youtube1120.tar.gz | 20.43GB |
archives/public_speech_manifest.tar.gz | 50.94GB |
archives/public_series_1.tar.gz | 319.23MB |
archives/public_lecture_1.tar.gz | 122.51MB |
archives/private_buriy_audiobooks_2.tar.gz | 27.74GB |
archives/buriy_audiobooks_2_val.tar.gz | 496.48MB |
archives/asr_public_stories_2.tar.gz | 1.50GB |
archives/asr_public_stories_1.tar.gz | 719.09MB |
archives/asr_public_phone_calls_2.tar.gz | 10.12GB |
archives/asr_public_phone_calls_1.tar.gz | 3.41GB |
archives/asr_calls_2_val.tar.gz | 805.25MB |
Type: Dataset
Tags: Dataset, russian, asr, stt, TTS
Bibtex:
@article{,
  title= {OPUS Russian Open Speech To Text Dataset v1.01},
  journal= {},
  author= {Anna Slizhikova and Alexander Veysov and Dilyara Nurtdinova and Dmitry Voronin},
  year= {},
  url= {https://github.com/snakers4/open_stt/},
  abstract= {v1.0-beta Arguably the largest public Russian STT dataset up to date: 15m utterances; 20 000 hours; 2.3 TB (in mono .wav format in int16); For more information please visit https://github.com/snakers4/open_stt/},
  keywords= {Dataset, russian, asr, stt, TTS},
  terms= {https://github.com/snakers4/open_stt/#license},
  license= {CC-NC-BY},
  superseded= {}
}