DATASET

Collection and transcription of audio recordings in Russian. The dataset is intended for training and evaluating systems that recognize and synthesize Russian toponyms (names of cities and streets).

Russian Language Speech Recognition

Data Collection: Gathering data for subsequent annotation
ASR (automatic speech recognition): The technology of converting human speech into text
NLP (natural language processing): The ability of a system to understand, analyse and interpret human language
Transcription: The process of converting human speech into text
168,200 audio recordings
8,411 toponyms
1,200 people
Technical specifications
Each text is voiced by 20 different people from the Commonwealth of Independent States
The average speaker age is 38 ± 11 years (mean ± standard deviation)
Gender distribution among speakers: 30% men, 70% women
Audio format: WAV, 16 kHz
Metadata is provided in an XLSX file
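The audio format stated above can be checked before training with a short script. This is a minimal sketch using only the Python standard library; the function name `check_wav` and the returned keys are illustrative assumptions, not part of the dataset package.

```python
import wave

# Sample rate stated in the technical specifications (16 kHz).
EXPECTED_RATE_HZ = 16_000


def check_wav(path, expected_rate=EXPECTED_RATE_HZ):
    """Read a WAV header and report whether its sample rate matches.

    The helper name and the dictionary keys are hypothetical; adapt them
    to your own pipeline.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate if rate else 0.0,
            "rate_ok": rate == expected_rate,
        }
```

Running `check_wav` over every file in the package and collecting the entries where `rate_ok` is false gives a quick list of recordings that would need resampling.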
Each audio recording in the dataset has the following attributes:
Gender: Gender of the speaker (male or female), with 30% of recordings by males and 70% by females
Age: Age of the speaker in years, with an average age of 38 and a standard deviation of 11 years
Speaker Identifier: Unique identifier for the person who pronounced the toponym
Transcription: Text representation of the city or street name in Russian
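The attributes listed above map naturally onto one record per recording, which makes it easy to verify the stated distributions on the delivered metadata. The sketch below assumes the XLSX rows have already been loaded into Python objects; the class `Recording`, its field names, and the function `summarize` are hypothetical, since the page does not give the actual column names.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Recording:
    # Field names are assumptions; the real XLSX headers may differ.
    gender: str          # "male" or "female"
    age: int             # speaker age in years
    speaker_id: str      # unique speaker identifier
    transcription: str   # city or street name in Russian


def summarize(records):
    """Compute per-recording statistics to compare against the datasheet."""
    if not records:
        raise ValueError("no records to summarize")
    genders = Counter(r.gender for r in records)
    ages = [r.age for r in records]
    # Distinct speakers per text (the datasheet states 20 per text).
    speakers = defaultdict(set)
    for r in records:
        speakers[r.transcription].add(r.speaker_id)
    return {
        "female_share": genders["female"] / len(records),
        "mean_age": mean(ages),
        # Population standard deviation; use statistics.stdev for the sample one.
        "age_std": pstdev(ages),
        "speakers_per_text": {t: len(s) for t, s in speakers.items()},
    }
```

Comparing `female_share`, `mean_age` and `age_std` with the stated 70%, 38 and 11, and checking that every value in `speakers_per_text` equals 20, is a simple sanity test of a delivered copy.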