An automated document analyzer for Paperless-ngx that uses the OpenAI API or Ollama (Mistral, Llama, Phi-3, Gemma 2) to analyze and tag your documents automatically.
It features an automatic mode, a manual mode, Ollama and OpenAI support, a chat function to query your documents with AI, and a modern, intuitive web interface.
paperless-ai makes changes to the documents in your production Paperless-ngx instance that cannot easily be undone. Configure it carefully. Test the results in a separate development environment first, and back up your documents and metadata before you start.
Thank you for all your support, bug reports, and feature requests.
- Automatic document scanning in Paperless-ngx
- AI-powered document analysis using the OpenAI API and Ollama (Mistral, Llama, Phi-3, Gemma 2)
- Automatic title, tag and correspondent assignment
- Predefine which documents will be processed based on existing tags (optional).
- Choose to use only the tags you want to be assigned.
  - THIS WILL DISABLE THE PROMPT DIALOG!
- Choose whether to assign a special tag (you name it) to documents that were processed by AI.
- Manual mode to analyze documents by hand with the help of AI.
- Easy setup through the web interface
- Document processing dashboard
- Automatic restart and health monitoring
- Error handling and graceful shutdown
- Docker support with health checks
- Docker and Docker Compose
- Access to a Paperless-ngx installation
- OpenAI API key or your own Ollama instance with your chosen model running and reachable.
- Basic understanding of cron syntax (for scan interval configuration)
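As a quick refresher on the cron syntax mentioned in the last item, the sketch below labels the five fields of a standard cron expression. It assumes the scan interval uses the common five-field form; the exact format the application accepts is not stated here, and `labelCronFields` is a hypothetical helper written for illustration only.

```javascript
// Names of the five fields in a standard cron expression, in order.
const CRON_FIELDS = ["minute", "hour", "dayOfMonth", "month", "dayOfWeek"];

// Split a five-field cron expression into a labeled object.
// Throws if the expression does not have exactly five fields.
function labelCronFields(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`expected 5 fields, got ${parts.length}`);
  }
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]));
}

console.log(labelCronFields("*/30 * * * *")); // every 30 minutes
console.log(labelCronFields("0 2 * * 1"));    // 02:00 every Monday
```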
See the project Wiki for installation instructions.
- Document Discovery
  - Periodically scans Paperless-ngx for new documents
  - Tracks processed documents in a local SQLite database
- AI Analysis
  - Sends document content to OpenAI API or Ollama for analysis
  - Extracts relevant tags and correspondent information
  - Uses GPT-4o-mini or your custom Ollama model for accurate document understanding
- Automatic Organization
  - Creates new tags if they don't exist
  - Creates new correspondents if they don't exist
  - Updates documents with analyzed information
  - Marks documents as processed to avoid duplicate analysis
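The flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Set` and `Map` are in-memory stand-ins for the SQLite database and for the tags stored in Paperless-ngx, and `runScan`, `getOrCreateTag`, and the `analyze` callback are hypothetical names.

```javascript
// Hypothetical in-memory stand-ins; the real application uses a SQLite
// database and the Paperless-ngx API instead.
const processedIds = new Set(); // documents that were already analyzed
const tagIdsByName = new Map(); // existing tags, keyed by name
let nextTagId = 1;

// Return the id of a tag, creating the tag first if it does not exist yet.
function getOrCreateTag(name) {
  if (!tagIdsByName.has(name)) {
    tagIdsByName.set(name, nextTagId++);
  }
  return tagIdsByName.get(name);
}

// One scan pass: skip known documents, analyze the rest, then record them.
// `analyze` stands in for the call to OpenAI or Ollama and is assumed to
// return an object of the form { title, tags }.
function runScan(documents, analyze) {
  const updates = [];
  for (const doc of documents) {
    if (processedIds.has(doc.id)) continue; // avoid duplicate analysis
    const result = analyze(doc.content);
    updates.push({
      id: doc.id,
      title: result.title,
      tags: result.tags.map((name) => getOrCreateTag(name)),
    });
    processedIds.add(doc.id); // mark as processed
  }
  return updates;
}
```

Running `runScan` twice with the same documents returns updates only on the first pass, which is the point of tracking processed documents.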
You can analyze documents by hand with the help of AI in the web interface, reachable at the /manual endpoint.
The application is configured through the web interface on the /setup route.
You don't need to, and cannot, set the environment variables through Docker.
The application comes with full Docker support:
- Automatic container restart on failure
- Health monitoring
- Volume persistence for database
- Resource management
- Graceful shutdown handling
# Start the container
docker-compose up -d
# View logs
docker-compose logs -f
# Restart container
docker-compose restart
# Stop container
docker-compose down
# Rebuild and start
docker-compose up -d --build
The application provides a health check endpoint at /health that returns:
# Healthy system
{
"status": "healthy"
}
# System not configured
{
"status": "not_configured",
"message": "Application setup not completed"
}
# Database error
{
"status": "database_error",
"message": "Database check failed"
}
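A monitoring script could interpret these responses along the following lines. This is only a sketch: `describeHealth` and `checkHealth` are hypothetical helpers, the global `fetch` requires Node.js 18 or later, and the HTTP status code is deliberately ignored because it is not documented here.

```javascript
// Map a /health response body to a short, human-readable summary.
function describeHealth(body) {
  switch (body.status) {
    case "healthy":
      return "ok";
    case "not_configured":
      return `setup required: ${body.message}`;
    case "database_error":
      return `database problem: ${body.message}`;
    default:
      return `unknown status: ${String(body.status)}`;
  }
}

// Fetch /health from a running instance and summarize the JSON body.
// Assumes the global fetch available in Node.js 18+.
async function checkHealth(baseUrl) {
  const response = await fetch(new URL("/health", baseUrl));
  return describeHealth(await response.json());
}
```

For example, `checkHealth("http://your-instance:3000").then(console.log)` would print one of the summaries for a reachable instance.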
The application includes a debug interface, accessible via /debug, that helps administrators monitor and troubleshoot the system's data:
- View all system tags
- Inspect processed documents
- Review correspondent information
- Navigate to: http://your-instance:3000/debug
- The interface provides:
  - Interactive dropdown to select data category
  - Tree view visualization of JSON responses
  - Color-coded data representation
  - Collapsible/expandable data nodes
| Endpoint | Description |
|---|---|
| /debug/tags | Lists all tags in the system |
| /debug/documents | Shows processed document information |
| /debug/correspondents | Displays correspondent data |
The debug interface also integrates with the health check system, showing a configuration warning if the system is not properly set up.
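The debug endpoints can also be queried from a script instead of the browser. The sketch below assumes that each endpoint returns JSON, as the tree view of JSON responses suggests; the response shape is not specified here, and `debugUrl` and `fetchDebugData` are hypothetical helpers (the global `fetch` requires Node.js 18 or later).

```javascript
// The three data categories exposed under /debug.
const DEBUG_CATEGORIES = ["tags", "documents", "correspondents"];

// Build the URL of a debug endpoint for a given instance.
function debugUrl(baseUrl, category) {
  if (!DEBUG_CATEGORIES.includes(category)) {
    throw new Error(`unknown debug category: ${category}`);
  }
  return new URL(`/debug/${category}`, baseUrl).href;
}

// Fetch one category and return the parsed JSON as-is.
async function fetchDebugData(baseUrl, category) {
  const response = await fetch(debugUrl(baseUrl, category));
  return response.json();
}
```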
To run the application locally without Docker:
- Install dependencies:
npm install
- Start the development server:
npm run test
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Store API keys securely
- Restrict container access
- Monitor API usage
- Regularly update dependencies
- Back up your database
This project is licensed under the MIT License - see the LICENSE file for details.
- Paperless-ngx for the amazing document management system
- OpenAI API
- The Express.js and Node.js communities for their excellent tools
If you encounter any issues or have questions:
- Check the Issues section
- Create a new issue if yours isn't already listed
- Provide detailed information about your setup and the problem
- Support for custom AI models
- Support for multiple language analysis
- Advanced tag matching algorithms
- Custom rules for document processing
- Enhanced web interface with statistics