Danimoz/igbo-ocr

 
 

igbo-ocr

OCR for Igbo Language

This is a minimal API skeleton that uses the Tesseract OCR engine, trained for the Igbo language, to extract text from an image.
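As a rough illustration of the text-extraction step, the sketch below calls the Tesseract command-line tool from Python. It is not the repository's own API code, which is not shown here. The helper names `build_tesseract_command` and `extract_text`, the `tessdata_dir` parameter, and the assumption that an `ibo.traineddata` file is available to Tesseract are all hypothetical.

```python
import subprocess
from typing import List, Optional


def build_tesseract_command(image_path: str, lang: str = "ibo",
                            tessdata_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for a Tesseract CLI call that prints to stdout."""
    # Passing "stdout" as the output base makes the tesseract CLI print the
    # recognized text instead of writing it to a file.
    command = ["tesseract", image_path, "stdout", "-l", lang]
    if tessdata_dir is not None:
        # Directory containing <lang>.traineddata (location is an assumption).
        command += ["--tessdata-dir", tessdata_dir]
    return command


def extract_text(image_path: str, lang: str = "ibo",
                 tessdata_dir: Optional[str] = None) -> str:
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang, tessdata_dir),
        capture_output=True,
        encoding="utf-8",  # Igbo text contains non-ASCII characters
        check=True,        # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

With Tesseract installed and an Igbo model in place, `extract_text("page.png")` would return the recognized text as a string; `page.png` is a placeholder file name.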

Installing Dependencies

You'll need to install the following

Language Data

Tesseract expects some configuration data. To fetch them:

make tesseract-langdata

(This step is only needed once and is already included implicitly in the training target, but you might want to run it explicitly in advance.)

Training

Following the ISO 639 codes (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes), our model name is ibo, the three-letter code for Igbo. Per the Tesseract training documentation, we train by running:

make training MODEL_NAME=ibo
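If the two make targets need to be driven from a script, a minimal sketch could look like the following. It only wraps the commands documented in this README. The helper names `build_make_commands` and `run_training` and the `repo_dir` parameter are hypothetical, and the sketch assumes `make` is on the PATH and that it runs from the repository root.

```python
import subprocess
from typing import List

MODEL_NAME = "ibo"  # model name used in this README


def build_make_commands(model_name: str = MODEL_NAME) -> List[List[str]]:
    """Return the make invocations in the order they should run."""
    return [
        # Fetch the Tesseract configuration data (only needed once).
        ["make", "tesseract-langdata"],
        # Train the model under the given name.
        ["make", "training", f"MODEL_NAME={model_name}"],
    ]


def run_training(model_name: str = MODEL_NAME, repo_dir: str = ".") -> None:
    """Run each make target in turn, stopping at the first failure."""
    for command in build_make_commands(model_name):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Running the language-data target first is optional, since the training target already includes it, but doing so surfaces download problems before the longer training step begins.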
