DATASET
This dataset is designed to assist with text recognition tasks in different languages.

Text from the covers and labels of goods

Computer Vision:
  • the ability of a machine to interpret, analyze, and understand visual data

Data labelling:
  • the process of identifying objects in photos so that a system can be trained to recognize and interpret them

Bounding box:
  • a type of annotation used in computer vision: a box drawn around an object in an image or video

OCR:
  • optical character recognition, a process that converts images of printed text into machine-readable text
150+ languages
200+ countries
<10 M types of covers and goods
Technical specifications:
Two types of images with text:
Advertising:
  • names of organizations, posters, billboards, stickers and banners most often filmed on the street

Products:
  • food, cosmetics, personal hygiene items, book covers and video games filmed indoors

Two types of lighting:
Daylight:
  • filmed indoors or outdoors in daylight

Night:
  • filmed in the dark outdoors or indoors

Types of data labelling:
Bounding Box:
  • labelling for each sequence of letters or numbers

OCR:
  • labelling for the selected sequence, including punctuation
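
To make the specifications above concrete, here is a minimal Python sketch of how one labelled image could be represented. The dataset's actual file format and field names are not stated on this page, so everything in it (the class names `TextRegion` and `LabelledImage`, the pixel-coordinate box layout, the sample file name and values) is an assumption for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextRegion:
    """One labelled sequence of letters or numbers (hypothetical schema)."""
    # Bounding box in pixels: top-left corner plus width and height (assumed layout).
    x: int
    y: int
    width: int
    height: int
    # OCR transcription, including punctuation; None if only the box was labelled.
    text: Optional[str] = None


@dataclass
class LabelledImage:
    """One image with its category, lighting, and text regions (hypothetical schema)."""
    filename: str
    category: str  # "advertising" or "products"
    lighting: str  # "daylight" or "night"
    regions: List[TextRegion]


def transcriptions(image: LabelledImage) -> List[str]:
    """Return the OCR text of every region that has a transcription."""
    return [r.text for r in image.regions if r.text is not None]


# Invented sample record; it does not come from the dataset.
sample = LabelledImage(
    filename="example_0001.jpg",
    category="products",
    lighting="daylight",
    regions=[
        TextRegion(x=40, y=32, width=180, height=48, text="ORGANIC"),
        TextRegion(x=40, y=90, width=96, height=30, text="250"),
        TextRegion(x=150, y=90, width=40, height=30),  # box only, no OCR label
    ],
)

print(transcriptions(sample))
```

Keeping the box and the transcription in one record mirrors the two labelling types listed above: every sequence gets a bounding box, and the selected sequences also carry OCR text.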