This project automates the extraction and organization of data from receipt images. Aimed at businesses and individuals, it replaces manual data entry so that users can focus on analysis and decision-making. The application extracts text from receipt images with Tesseract OCR (Optical Character Recognition) and uses OpenAI's GPT-4 to turn that text into structured JSON.
Inspired by @IAmTomShaw's Receipt Vision.
- Receipt Image Upload: Users can upload images of receipts for automatic processing.
- OCR Technology: Text is extracted from the images using Tesseract OCR.
- AI-Powered Parsing: OpenAI's GPT-4 processes the extracted text into structured JSON data.
- Database Integration: Parsed data is stored in a MySQL database for persistence.
- Web Interface: A responsive interface to upload, view, and interact with receipts.
- Receipt Details: View breakdowns of individual receipts, including product details, prices, and categories.
- Backend: Flask (Python)
- Frontend: HTML, CSS, Bootstrap, JavaScript
- Database: MySQL
- AI Integration: OpenAI GPT-4 API
- OCR: Tesseract
- Python 3.10 or higher
- MySQL Server (8.0+)
- OpenAI API Key
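Before installing, the local prerequisites listed above can be checked with a short script. This is only a sketch: `check_prerequisites` is a hypothetical helper that is not part of the project, and it assumes Tesseract is used through its command-line binary (as wrappers such as pytesseract do). It does not check the MySQL server or the OpenAI API key.

```python
import shutil
import sys


def check_prerequisites(python_version=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # The prerequisites ask for Python 3.10 or higher.
    if tuple(python_version[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {python_version[0]}.{python_version[1]}"
        )
    # Assumption: Tesseract is called as an external program, so it must be on PATH.
    if which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

The version and the `which` lookup are passed in as parameters so the function can be exercised without changing the local environment.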
- Clone the repository to your local machine:
    git clone https://github.com/JustCabaret/receipt-parser.git
    cd receipt-parser
- Set up the virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Create the MySQL database by running the following SQL script on your MySQL server:
    CREATE DATABASE receiptparserdb;
    USE receiptparserdb;

    CREATE TABLE receipts (
        id INT AUTO_INCREMENT PRIMARY KEY,
        total INT NOT NULL,
        source VARCHAR(255) NOT NULL,
        parsed_at DATETIME DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TABLE receipt_items (
        id INT AUTO_INCREMENT PRIMARY KEY,
        receipt_id INT NOT NULL,
        product VARCHAR(255) NOT NULL,
        quantity INT NOT NULL,
        price INT NOT NULL,
        category VARCHAR(50) DEFAULT NULL,
        FOREIGN KEY (receipt_id) REFERENCES receipts(id)
    );
- Update the database credentials: edit database.py with your MySQL connection details:
    mysql.connector.connect(
        host="127.0.0.1",
        user="your_username",
        password="your_password",
        database="receiptparserdb"
    )
- Run the application:
    python app.py
  The server will run at http://127.0.0.1:5000.
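As an alternative to writing credentials directly into database.py, the connection settings could be read from environment variables. The following is a minimal sketch, not the project's actual configuration: `db_config_from_env` and the `RECEIPT_DB_*` variable names are hypothetical, and the defaults simply mirror the values shown in the steps above.

```python
import os


def db_config_from_env(env=os.environ):
    """Build keyword arguments for mysql.connector.connect() from environment variables."""
    return {
        # Hypothetical variable names; defaults mirror the setup steps.
        "host": env.get("RECEIPT_DB_HOST", "127.0.0.1"),
        "user": env["RECEIPT_DB_USER"],          # required; KeyError if unset
        "password": env["RECEIPT_DB_PASSWORD"],  # required; KeyError if unset
        "database": env.get("RECEIPT_DB_NAME", "receiptparserdb"),
    }
```

With this in place, database.py could call `mysql.connector.connect(**db_config_from_env())`, which keeps the password out of version control.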
- Access the application: Open http://127.0.0.1:5000 in your browser.
- Upload Receipts: Use the upload interface to submit receipt images.
- View Receipts: Processed receipts will appear in the main table with their total and store name.
- View Details: Click on any receipt to view its products, quantities, prices, and categories.
- POST /process_receipt: Uploads a receipt image and processes it.
  - Input: Image file and OpenAI API key
  - Output: Processed receipt data in JSON format
- GET /receipts: Retrieves all processed receipts.
- GET /receipts/{receipt_id}: Retrieves detailed information for a specific receipt.
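The processing behind POST /process_receipt (OCR first, then GPT-4 parsing into JSON) can be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: `process_receipt`, `ocr`, and `ask_model` are hypothetical names, the prompt wording is invented, and the two callables stand in for whatever wrappers the application uses around Tesseract and the OpenAI API.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"total", "store", "items"}


def process_receipt(
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
    ask_model: Callable[[str], str],
) -> dict:
    """Run OCR on the image, ask the model for JSON, and validate the reply."""
    text = ocr(image_bytes)
    # Hypothetical prompt; the key names follow the README's example JSON.
    prompt = (
        "Return the receipt below as JSON with keys 'total', 'store' and "
        "'items' (each item has 'product', 'quantity', 'price', 'type').\n\n"
        + text
    )
    reply = ask_model(prompt)
    receipt = json.loads(reply)  # raises json.JSONDecodeError on malformed replies
    if not isinstance(receipt, dict):
        raise ValueError("model reply is not a JSON object")
    missing = REQUIRED_KEYS - receipt.keys()
    if missing:
        raise ValueError(f"model reply is missing keys: {sorted(missing)}")
    return receipt
```

Passing the OCR and model steps in as callables keeps the sketch independent of any particular client library and lets the validation be tested with simple stand-ins.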
{
  "total": 1250,
  "store": "SuperMart",
  "items": [
    {"product": "Milk", "quantity": 2, "price": 250, "type": "Groceries"},
    {"product": "Bread", "quantity": 1, "price": 150, "type": "Groceries"}
  ]
}
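A parsed receipt like the one above could be written to the `receipts` and `receipt_items` tables along the following lines. This is a sketch, not the project's implementation: `save_receipt` is a hypothetical helper, and it assumes that the JSON `store` field maps to the `source` column and `type` maps to `category`, which the README does not state explicitly. It works with any DB-API connection; the `placeholder` parameter exists because mysql.connector uses `%s` while some other drivers use `?`.

```python
def save_receipt(conn, receipt: dict, placeholder: str = "%s") -> int:
    """Insert one parsed receipt and its items; return the new receipt id."""
    p = placeholder
    cur = conn.cursor()
    # Assumption: JSON "store" is stored in the "source" column.
    cur.execute(
        f"INSERT INTO receipts (total, source) VALUES ({p}, {p})",
        (receipt["total"], receipt["store"]),
    )
    receipt_id = cur.lastrowid
    # Assumption: JSON "type" is stored in the "category" column (NULL if absent).
    rows = [
        (receipt_id, item["product"], item["quantity"], item["price"], item.get("type"))
        for item in receipt["items"]
    ]
    cur.executemany(
        "INSERT INTO receipt_items (receipt_id, product, quantity, price, category) "
        f"VALUES ({p}, {p}, {p}, {p}, {p})",
        rows,
    )
    conn.commit()
    return receipt_id
```

Values are passed as query parameters rather than formatted into the SQL string, so product names taken from OCR output cannot alter the statement.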
Feel free to fork the project, submit pull requests, report bugs, or suggest new features.
- JustCabaret (https://github.com/JustCabaret)
This project is licensed under the MIT License - see the LICENSE file for details.