FastAPI OCR Application

This project is a FastAPI application that provides OCR (Optical Character Recognition) capabilities for both images and PDFs. It uses Tesseract OCR to extract text and can handle tables in images and PDFs.

Usage

This application features a user interface that allows you to upload images or PDF files and receive the extracted text in JSON format.

1. main.py

This file contains the FastAPI application and defines the routes for handling image and PDF uploads.

Routes

/ocr-image/
- Method: POST
- Description: Accepts an image file and returns the extracted text and tables.
- Request Body: Image file (form-data).
- Response: JSON containing the extracted text and tables.
/ocr-pdf/
- Method: POST
- Description: Accepts a PDF file and returns the extracted text and tables.
- Request Body: PDF file (form-data).
- Response: JSON containing the extracted text and tables.
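
As a rough illustration of calling these routes from a script, the sketch below builds a multipart/form-data upload using only the Python standard library. The form field name `file`, the address `http://localhost:8000`, and the helper names `build_multipart` and `post_file` are assumptions made for this example; check main.py for the parameter name the routes actually expect.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Encode one file as a multipart/form-data body.

    Returns a tuple of (body bytes, value for the Content-Type header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_file(url: str, path: str, content_type: str, field: str = "file") -> dict:
    """POST a local file to an OCR route and decode the JSON response."""
    with open(path, "rb") as fh:
        body, header = build_multipart(
            field, os.path.basename(path), fh.read(), content_type
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example calls (assume the server is running locally on port 8000):
#   post_file("http://localhost:8000/ocr-image/", "assets/img.png", "image/png")
#   post_file("http://localhost:8000/ocr-pdf/", "assets/lorem.pdf", "application/pdf")
```

The key names inside the returned JSON are not documented here, so the sketch returns the decoded dictionary as-is rather than reading specific fields.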

2. image_processing.py

This module provides functions for OCR and table extraction from images.

Functions

ocr_image(image_data: bytes) -> str
- Takes image data as input and returns the extracted text as a string.
extract_table_from_img(image_data: bytes) -> list
- Takes image data as input and returns a list of extracted table data.
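
The following is a minimal sketch of how functions with these signatures could be written, not the project's actual implementation. It assumes pytesseract and Pillow are installed, and it uses a deliberately naive table strategy: every OCR text line becomes a row and every word becomes a cell. The helper `rows_from_ocr_data` is a hypothetical name introduced for this example.

```python
import io
from collections import defaultdict


def ocr_image(image_data: bytes) -> str:
    """Run Tesseract on raw image bytes and return the recognized text."""
    # Imported lazily so the pure helper below works without these packages.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(io.BytesIO(image_data)))


def rows_from_ocr_data(data: dict) -> list:
    """Group word-level OCR output into rows of cells.

    `data` is shaped like the result of
    pytesseract.image_to_data(..., output_type=Output.DICT): parallel lists
    under the keys 'text', 'block_num', 'par_num', 'line_num' and 'left'.
    """
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        if word.strip():  # skip the empty entries emitted for layout levels
            key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
            lines[key].append((data["left"][i], word))
    # Rows are ordered by (block, paragraph, line); cells by x position.
    return [[w for _, w in sorted(cells)] for _, cells in sorted(lines.items())]


def extract_table_from_img(image_data: bytes) -> list:
    """Return the image's text as a list of rows (naive line-based approach)."""
    import pytesseract
    from PIL import Image
    from pytesseract import Output

    image = Image.open(io.BytesIO(image_data))
    data = pytesseract.image_to_data(image, output_type=Output.DICT)
    return rows_from_ocr_data(data)
```

Splitting cells on every word is a simplification; multi-word cells would need column detection, for example by clustering the `left` coordinates or by finding ruling lines with OpenCV.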

3. pdf_processing.py

This module provides functions for OCR and table extraction from PDF files.

Functions

ocr_pdf(pdf_data: bytes) -> str
- Takes PDF data as input and returns the extracted text as a string.
extract_table_from_pdf(pdf_data: bytes) -> list
- Takes PDF data as input and returns a list of extracted table data.
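
For orientation, here is one way these two functions might be sketched with pdfplumber and pdf2image, both listed under Requirements. This is an assumption about the approach, not the code in pdf_processing.py: it reads the PDF's text layer first and falls back to rasterizing and OCR only when no text is found. The helper `join_pages` is a hypothetical name, and pdf2image additionally needs Poppler installed on the system.

```python
import io


def join_pages(page_texts) -> str:
    """Join per-page text, skipping pages where nothing was extracted."""
    return "\n\n".join(t.strip() for t in page_texts if t and t.strip())


def ocr_pdf(pdf_data: bytes) -> str:
    """Return the text of a PDF, using OCR only when there is no text layer."""
    # Imported lazily so join_pages can be used without these packages.
    import pdfplumber
    import pytesseract
    from pdf2image import convert_from_bytes

    with pdfplumber.open(io.BytesIO(pdf_data)) as pdf:
        text = join_pages(page.extract_text() for page in pdf.pages)
    if text:
        return text
    # No text layer (for example a scanned PDF): rasterize each page and OCR it.
    images = convert_from_bytes(pdf_data)
    return join_pages(pytesseract.image_to_string(img) for img in images)


def extract_table_from_pdf(pdf_data: bytes) -> list:
    """Collect every table pdfplumber detects, as lists of rows of cell strings."""
    import pdfplumber

    tables = []
    with pdfplumber.open(io.BytesIO(pdf_data)) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables())
    return tables
```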

Features

OCR for Images: Extract text from standard images and images containing tables.
OCR for PDFs: Extract text and tables from PDF files.
Modular Design: Image and PDF processing are handled in separate files.

Directory Structure

OCR/
├── main.py
├── image_processing.py
├── pdf_processing.py
├── index.html
├── assets/
│   ├── img.png
│   ├── lorem.pdf
│   ├── table_img.jpg
│   └── table_pdf.pdf
├── static/
│   └── style.css
├── requirements.txt
└── README.md

Installation Guide

Follow these steps to set up the FastAPI OCR application:

Prerequisites

Ensure you have the following installed on your system:

Python 3.7 or higher
Tesseract OCR (Make sure to note the installation path)

Steps to Install

Clone the Repository: Open your terminal and run the following command to clone the repository
```
git clone https://github.com/dagimgetaw/OCR.git
cd OCR
```
Set Up a Virtual Environment (Optional but Recommended): Create a virtual environment to manage dependencies

python -m venv venv

Activate the virtual environment:

For Windows: venv\Scripts\activate
For macOS/Linux: source venv/bin/activate

Install Requirements: Install the necessary dependencies using pip
```
pip install -r requirements.txt
```
Tesseract Installation: Make sure Tesseract OCR is installed on your system. You can download it from the Tesseract OCR project page.

After installation, set the tesseract_cmd path in your Python code:

For Windows

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

For macOS

pytesseract.pytesseract.tesseract_cmd = '/usr/local/bin/tesseract'

For Linux

pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'
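
Rather than hard-coding one of the paths above, the lookup can be made portable. The sketch below is one possible approach, with `resolve_tesseract_cmd` and `DEFAULT_TESSERACT_PATHS` as hypothetical names: it prefers whatever `tesseract` binary is on PATH and falls back to the per-platform defaults listed above.

```python
import platform
import shutil

# Fallback locations copied from the per-platform settings above.
DEFAULT_TESSERACT_PATHS = {
    "Windows": r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "Darwin": "/usr/local/bin/tesseract",  # platform.system() reports macOS as "Darwin"
    "Linux": "/usr/bin/tesseract",
}


def resolve_tesseract_cmd(system=None, which=shutil.which):
    """Return a path to the tesseract binary, or None if no default is known.

    `system` and `which` are injectable so the lookup can be exercised
    without depending on the machine it runs on.
    """
    found = which("tesseract")
    if found:
        return found
    return DEFAULT_TESSERACT_PATHS.get(system or platform.system())


# Usage (requires pytesseract):
#   import pytesseract
#   cmd = resolve_tesseract_cmd()
#   if cmd:
#       pytesseract.pytesseract.tesseract_cmd = cmd
```

The fallback paths are only defaults; if Tesseract was installed elsewhere, use the installation path noted under Prerequisites.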

Run the Application: From inside the project directory, start the FastAPI application with the following command

uvicorn main:app --host 0.0.0.0 --port 8000

Access the Application: Once the server is running, open http://localhost:8000 in your browser.

Requirements

Make sure you have the following installed:

Docker
Tesseract OCR (included in the Dockerfile)
FastAPI
OpenCV
pdfplumber
pdf2image

License

This project is licensed under the MIT License. See the LICENSE file for details.
