Children's Speech Dataset
Case
Dataset for training a neural network to recognize children's speech for voice assistants and children's versions of applications

Case: Children's Speech

NLP
The system's ability to understand, analyze, and interpret human languages
Machine Learning
The system's ability to automatically interpret data and predict outcomes
ASR
Technology for converting human speech into text format
Data Collection
Gathering suitable data for subsequent labeling
1,000
audio recordings
8 weeks
project duration
Case Description
The dataset consists of 5,000 audio recordings collected through crowdsourcing platforms and by an internal team of AI trainers.

Audio recordings of children's voices for training a voice assistant. Each child records 1 video, 6 audio clips of prepared sentences, and 3 improvised audio clips.
Data format:
MP3 audio, plus an XML file containing the transcript
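As an illustration of how such a delivery might be consumed, here is a minimal Python sketch that pairs each MP3 file with its XML transcript. The page does not describe the file naming or the XML schema, so both are assumptions: the sketch supposes that an audio file and its transcript share the same file stem (for example `child_001.mp3` and `child_001.xml`, a hypothetical naming), and it reads every text node in the XML rather than a specific tag.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def read_transcript(xml_path: Path) -> str:
    """Return all text found in the XML file, with whitespace normalized.

    Assumption: the real schema is unknown, so every text node is joined
    instead of looking up a specific element.
    """
    root = ET.parse(xml_path).getroot()
    return " ".join(" ".join(root.itertext()).split())


def pair_recordings(data_dir: Path) -> list[tuple[Path, str]]:
    """Pair each .mp3 with the .xml that has the same stem (assumed naming)."""
    pairs = []
    for mp3_path in sorted(data_dir.glob("*.mp3")):
        xml_path = mp3_path.with_suffix(".xml")
        if xml_path.exists():
            # Recordings without a transcript are skipped, not treated as errors.
            pairs.append((mp3_path, read_transcript(xml_path)))
    return pairs
```

Calling `pair_recordings(Path("dataset"))` on a local copy would return a list of `(audio_path, transcript)` tuples ready to feed into an ASR training pipeline. If the actual XML uses a dedicated transcript element, `read_transcript` should be narrowed to that element.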
APPLICATION AREAS OF THE DATASET
to develop a system for automatic recognition and transcription of children's speech recordings
ASR
to build systems that automatically determine a user's age or age category
NLP and data classification
to populate the internal databases of LLM services that serve child audiences
Data collection

Didn't find the information you need?

Request a consultation with our account manager to get a price for the dataset.
THE FINAL COST OF THE PROJECT IS INFLUENCED BY:
Scope of work
Labeling complexity
Timeline
Labeling quality
We guarantee 95% data quality. For labeling orders that require quality above 95%, we offer enterprise solutions.