Ollama OCR 🔍

A powerful OCR (Optical Character Recognition) package that uses state-of-the-art vision language models through Ollama to extract text from images. Available both as a Python package and a Streamlit web application.

🌟 Features

Multiple Vision Models Support
- LLaVA 7B: Efficient vision-language model for real-time processing (note: LLaVA can occasionally produce incorrect output)
- Llama 3.2 Vision: Advanced model with high accuracy for complex documents
Multiple Output Formats
- Markdown: Preserves text formatting with headers and lists
- Plain Text: Clean, simple text extraction
- JSON: Structured data format
- Structured: Tables and organized data
- Key-Value Pairs: Extracts labeled information
Batch Processing
- Process multiple images in parallel
- Progress tracking for each image
- Image preprocessing (resize, normalize, etc.)

📦 Package Installation

pip install ollama-ocr

🚀 Quick Start

Prerequisites

1. Install Ollama
2. Pull the required model:

ollama pull llama3.2-vision:11b
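Before running the examples, it can help to confirm that the model was actually pulled. The sketch below is not part of ollama-ocr: it assumes Ollama is serving on its default local address (`http://localhost:11434`) and queries the `/api/tags` endpoint, which lists locally available models. The helper names `fetch_local_models` and `model_is_listed` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def fetch_local_models(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def model_is_listed(tags_payload: dict, model_name: str) -> bool:
    """Return True if model_name appears in a /api/tags response payload."""
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return model_name in names
```

With the server running, `model_is_listed(fetch_local_models(), "llama3.2-vision:11b")` should return `True` once the pull has finished. If the server is not reachable, `fetch_local_models` raises an `OSError` subclass.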

Using the Package

Single Image Processing

from ollama_ocr import OCRProcessor

# Initialize OCR processor
ocr = OCRProcessor(model_name='llama3.2-vision:11b')  # You can use any vision model available on Ollama

# Process an image
result = ocr.process_image(
    image_path="path/to/your/image.png",
    format_type="markdown"  # Options: markdown, text, json, structured, key_value
)
print(result)
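A common next step is to write the extracted text to disk. The following is a minimal sketch, not part of the package: `save_ocr_result` and the `EXTENSIONS` mapping are hypothetical, and the code only assumes that the processor exposes `process_image(image_path=..., format_type=...)` as shown above.

```python
from pathlib import Path

# Hypothetical mapping from format_type to a file extension (not defined by ollama-ocr).
EXTENSIONS = {
    "markdown": ".md",
    "text": ".txt",
    "json": ".json",
    "structured": ".txt",
    "key_value": ".txt",
}


def save_ocr_result(ocr, image_path, output_dir, format_type="markdown"):
    """Run OCR on one image and write the result into output_dir.

    `ocr` is any object with a process_image(image_path=..., format_type=...) method.
    Returns the path of the file that was written.
    """
    result = ocr.process_image(image_path=image_path, format_type=format_type)
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Name the output after the image, e.g. invoice.png -> invoice.md
    suffix = EXTENSIONS.get(format_type, ".txt")
    out_file = out_dir / (Path(image_path).stem + suffix)
    out_file.write_text(str(result), encoding="utf-8")
    return out_file
```

For example, `save_ocr_result(ocr, "path/to/your/image.png", "output")` would write `output/image.md`.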

Batch Processing (New! 🆕)

from ollama_ocr import OCRProcessor

# Initialize OCR processor
ocr = OCRProcessor(model_name='llama3.2-vision:11b', max_workers=4)  # max workers for parallel processing

# Process multiple images with progress tracking
batch_results = ocr.process_batch(
    input_path="path/to/images/folder",  # Directory or list of image paths
    format_type="markdown",
    recursive=True,  # Search subdirectories
    preprocess=True  # Enable image preprocessing
)

# Access results
for file_path, text in batch_results['results'].items():
    print(f"\nFile: {file_path}")
    print(f"Extracted Text: {text}")

# View statistics
print("\nProcessing Statistics:")
print(f"Total images: {batch_results['statistics']['total']}")
print(f"Successfully processed: {batch_results['statistics']['successful']}")
print(f"Failed: {batch_results['statistics']['failed']}")
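The returned dictionary can also be post-processed. The sketch below assumes only the shape used in the example above (a `results` mapping of file path to text, and a `statistics` mapping with `total`, `successful`, and `failed`). The helper `summarize_batch` is hypothetical and not part of the package.

```python
def summarize_batch(batch_results):
    """Return (success_rate, empty_files) for a process_batch-style result dict."""
    stats = batch_results["statistics"]
    total = stats["total"]
    # Guard against an empty batch to avoid dividing by zero.
    success_rate = stats["successful"] / total if total else 0.0
    # Files whose extracted text is empty or whitespace-only may need a second look.
    empty_files = sorted(
        path
        for path, text in batch_results["results"].items()
        if not str(text).strip()
    )
    return success_rate, empty_files
```

Calling `summarize_batch(batch_results)` after the batch run gives a quick signal of whether a different model or preprocessing setting is worth trying.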

📋 Output Format Details

Markdown Format: A markdown string containing the extracted text, with formatting such as headers and lists preserved.
Text Format: A plain text string containing the extracted text, without formatting.
JSON Format: A JSON object containing the extracted text as structured data.
Structured Format: A structured object containing the extracted text, organized into tables and other structured data.
Key-Value Format: A dictionary containing the labeled information extracted from the image.
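When working with the JSON format, vision models sometimes wrap their JSON in a markdown code fence. The sketch below is a hedged illustration of handling that case; it assumes the result may arrive either as a string or as already-parsed data, which this README does not specify. `parse_json_output` is a hypothetical helper, not part of the package.

```python
import json
import re

_TICKS = "`" * 3  # a markdown code fence marker
# Matches a whole string that is wrapped in a (possibly json-labelled) code fence.
_FENCE = re.compile(
    r"^" + _TICKS + r"(?:json)?\s*\n(.*?)\n?" + _TICKS + r"\s*$",
    re.DOTALL,
)


def parse_json_output(raw):
    """Turn OCR output requested with format_type="json" into Python data."""
    if isinstance(raw, (dict, list)):
        return raw  # already parsed, nothing to do
    text = raw.strip()
    match = _FENCE.match(text)
    if match:
        text = match.group(1)  # keep only the content inside the fence
    return json.loads(text)  # raises json.JSONDecodeError on invalid JSON
```

Because models do not always return valid JSON, callers may want to catch `json.JSONDecodeError` and fall back to the raw text.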

🌐 Streamlit Web Application (supports batch processing)

User-Friendly Interface
- Drag-and-drop image upload
- Real-time processing
- Download extracted text
- Image preview with details
- Responsive design

Clone the repository:

git clone https://github.com/imanoop7/Ollama-OCR.git
cd Ollama-OCR

Install dependencies:

pip install -r requirements.txt

Go to the directory where app.py is located:

cd src/ollama_ocr

Run the Streamlit app:

streamlit run app.py

Example Output

Input Image

Sample Output

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

- Built with Ollama
- Powered by LLaMA Vision Models
