- We now have a logo, complete with a favicon :-)
- Removed some problematic tests.
- Fix the docker-compose example config to include a shared consume volume so that using the push API will work for users of the Docker install. Thanks to Colin Frei for fixing this in #466.
- khrise submitted a pull request to include the `added` property to the REST API #471.
- Allow an infinite number of logs to be deleted. Thanks to Ulli for noting the problem in #433.
- Fix the `RecentCorrespondentsFilter` correspondents filter that was added in 2.4 to play nice with the defaults. Thanks to tsia and Sblop who pointed this out. #423.
- Updated dependencies to include (among other things) a security patch to requests.
- Fix text in sample data for tests so that the language guesser stops thinking that everything is in Catalan because we had Lorem ipsum in there.
- Tweaked the gunicorn sample command to use filesystem paths instead of Python paths. #441
- Added pretty colour boxes next to the hex values in the Tags section, thanks to a pull request from Joshua Taillon #442.
- Added a `.editorconfig` file to better specify coding style.
- Joshua Taillon also added some logic to tie Paperless' date guessing logic into how it parses file names on import. #440
- New dependency: Paperless now optimises thumbnail generation with optipng, so you'll need to install that somewhere in your PATH or declare its location in `PAPERLESS_OPTIPNG_BINARY`. The Docker image has already been updated on the Docker Hub, so you just need to pull the latest one from there if you're a Docker user.
- "Login free" instances of Paperless were breaking whenever you tried to edit objects in the admin: adding/deleting tags or correspondents, or even fixing spelling. This was due to the "user hack" we were applying to sessions that weren't using a login, as that hack user didn't have a valid id. The fix was to attribute the first user id in the system to this hack user. #394
- A problem in how we handle slug values on Tags and Correspondents required a few changes to how we handle this field #393:
  - Slugs are no longer editable. They're derived from the name of the tag or correspondent at save time, so if you wanna change the slug, you have to change the name, and even then you're restricted to the rules of the `slugify()` function. The slug value is still visible in the admin though.
  - I've added a migration to go over all existing tags & correspondents and rewrite the `.slug` values to ones conforming to the `slugify()` rules.
  - The consumption process now uses the same rules as `.save()` in determining a slug and using that to check for an existing tag/correspondent.
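To illustrate the slug rules described above, here is a minimal sketch using only the standard library. It approximates the behaviour of a `slugify()`-style function and is not the function Paperless actually calls; the names `slugify_sketch` and `find_existing` are hypothetical.

```python
import re
import unicodedata


def slugify_sketch(name: str) -> str:
    """Approximate an ASCII slug: lower-case, strip punctuation, hyphenate."""
    # Decompose accented characters and drop anything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove characters that are not word characters, whitespace, or hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_name.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", cleaned).strip("-_")


def find_existing(name: str, known_slugs: set) -> bool:
    """Hypothetical consumer-side check: compare by derived slug, not raw name."""
    return slugify_sketch(name) in known_slugs
```

Because the slug is always derived from the name, two names that differ only in case, spacing, or punctuation resolve to the same tag or correspondent.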
- An annoying bug in the date capture code was causing some bogus dates to be attached to documents, which in turn busted the UI. Thanks to Andrew Peng for reporting this. #414.
- A bug in the Dockerfile meant that Tesseract language files weren't being installed correctly. euri10 was quick to provide a fix: #406, #413.
- Document consumption is now wrapped in a transaction as per an old ticket #262.
- The `get_date()` functionality of the parsers has been consolidated onto the `DocumentParser` class since much of that code was redundant anyway.
- A new set of actions is now available thanks to jonaswinkler's very first pull request! You can now do nifty things like tag documents in bulk, or set correspondents in bulk. #405
- The import/export system is now a little smarter. By default, documents are tagged as `unencrypted`, since exports are by their nature unencrypted. It's now in the import step that we decide the storage type. This allows you to export from an encrypted system and import into an unencrypted one, or vice-versa.
- The migration history has been slightly modified to accommodate PostgreSQL users. Additionally, you can now tell paperless to use PostgreSQL simply by declaring `PAPERLESS_DBUSER` in your environment. This will attempt to connect to your Postgres database without a password unless you also set `PAPERLESS_DBPASS`.
- A bug was found in the REST API filter system that was the result of an update of django-filter some time ago. This has now been patched in #412. Thanks to thepill for spotting it!
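The PostgreSQL switch described above (declare `PAPERLESS_DBUSER`, optionally `PAPERLESS_DBPASS`) might look roughly like this in a settings module. This is a sketch under assumptions: the function name, the SQLite fallback file name, and the database name are all hypothetical, and the real settings code may differ.

```python
import os


def database_settings(environ=os.environ):
    """Return a Django-style DATABASES dict based on environment variables."""
    db_user = environ.get("PAPERLESS_DBUSER")
    if not db_user:
        # No database user declared: assume a local SQLite file.
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": "db.sqlite3",  # hypothetical file name
            }
        }
    config = {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "paperless",  # hypothetical database name
        "USER": db_user,
    }
    password = environ.get("PAPERLESS_DBPASS")
    if password:
        # Only include a password when one was explicitly declared.
        config["PASSWORD"] = password
    return {"default": config}
```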
- Support for consuming plain text & markdown documents was added by Joshua Taillon! This was a long-requested feature, and its addition is likely to be greatly appreciated by the community: #395. Thanks also to David Martin for his assistance on the issue.
- dubit0 found & fixed a bug that prevented management commands from running before we had an operational database: #396
- Joshua also added a simple update to the thumbnail generation process to improve performance: #399
- As his last bit of effort on this release, Joshua also added some code to allow you to view the documents inline rather than download them as an attachment. #400
- Finally, ahyear found a slip in the Docker documentation and patched it. #401
- Kyle Lucy reported a bug quickly after the release of 2.2.0 where we broke the `DISABLE_LOGIN` feature: #392.
- Thanks to dadosch, Wolfgang Mader, and Tim Brooks, this is the first version of Paperless that supports Django 2.0! As a result of their hard work, you can now also run Paperless on Python 3.7: #386 & #390.
- Stéphane Brunner added a few lines of code that made the tagging interface a lot easier on those of us with lots of different tags: #391.
- Kilian Koeltzsch noticed a bug in how we capture & automatically create tags, so that's fixed now too: #384.
- erikarvstedt tweaked the behaviour of the test suite to be better behaved for packaging environments: #383.
- Lukasz Soluch added CORS support to make building a new Javascript-based front-end cleaner & easier: #387.
- Enno Lohmeier added three simple features that make Paperless a lot more user (and developer) friendly:
- You now also have the ability to customise the interface to your heart's content by creating a file called `overrides.css` and/or `overrides.js` in the root of your media directory. Thanks to Mark McFate for this idea: #371
This is a big release as we've changed a core-functionality of Paperless: we no longer encrypt files with GPG by default.
The reasons for this are many, but it boils down to that the encryption wasn't
really all that useful, as files on-disk were still accessible so long as you
had the key, and the key was most typically stored in the config file. In
other words, your files are only as safe as the `paperless` user is. In
addition to that, the contents of the documents were never encrypted, so
important numbers etc. were always accessible simply by querying the database.
Still, it was better than nothing, but the consensus from users appears to be
that it was more an annoyance than anything else, so this feature is now turned
off unless you explicitly set a passphrase in your config file.
Encryption isn't gone, it's just off for new users. So long as you have `PAPERLESS_PASSPHRASE` set in your config or your environment, Paperless should continue to operate as it always has. If however, you want to drop encryption too, you only need to do two things:
- Run `./manage.py migrate && ./manage.py change_storage_type gpg unencrypted`. This will go through your entire database and Decrypt All The Things.
- Remove `PAPERLESS_PASSPHRASE` from your `paperless.conf` file, or simply stop declaring it in your environment.
Special thanks to erikarvstedt, matthewmoto, and mcronce who did the bulk of the work on this big change.
- Quentin Dawans has refactored the document consumer to allow for some command-line options. Notably, you can now direct it to consume from a particular `--directory`, limit the `--loop-time`, set the time between mail server checks with `--mail-delta` or just run it as a one-off with `--one-shot`. See #305 & #313 for more information.
- Refactor the use of travis/tox/pytest/coverage into two files: `.travis.yml` and `setup.cfg`.
- Start generating requirements.txt from a Pipfile. I'll probably switch over to just using pipenv in the future.
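For a feel of the consumer options mentioned above (`--directory`, `--loop-time`, `--mail-delta`, `--one-shot`), here is a small `argparse` sketch. Only the four flag names come from this changelog; the program name, defaults, units, and help text are assumptions made for illustration, and the real command may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the four consumer flags named in the changelog."""
    parser = argparse.ArgumentParser(prog="document_consumer")  # assumed name
    parser.add_argument("--directory", default="consume",
                        help="directory to consume documents from")
    parser.add_argument("--loop-time", type=int, default=10,
                        help="time to wait between directory scans (assumed seconds)")
    parser.add_argument("--mail-delta", type=int, default=10,
                        help="time between mail server checks (assumed minutes)")
    parser.add_argument("--one-shot", action="store_true",
                        help="process what is already there once, then exit")
    return parser
```

With this sketch, `build_parser().parse_args(["--directory", "/tmp/in", "--one-shot"])` yields a namespace whose `one_shot` attribute is `True` and whose `loop_time` keeps its default.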
- Allow for an alternative FreeBSD-friendly location for `paperless.conf`. Thanks to Martin Arendtsen who provided this (#322).
- Document consumption events are now logged in the Django admin events log. Thanks to CkuT for doing the legwork on this one and to Quentin Dawans & David Martin for helping to coordinate & work out how the feature would be developed.
- erikarvstedt contributed a pull request (#328) to add `--noreload` to the default server start process. This helps reduce the load imposed by the running webservice.
- Through some discussion on #253 and #323, we've removed a few of the hardcoded URL values to make it easier for people to host Paperless on a subdirectory. Thanks to Quentin Dawans and Kyle Lucy for helping to work this out.
- The clickable area for documents on the listing page has been increased to a more predictable space thanks to a glorious hack from erikarvstedt in #344.
- Strubbl noticed an annoying bug in the bash script wrapping the Docker entrypoint and fixed it with some very creative Bash skills: #352.
- You can now use the search field to find documents by tag thanks to thinkjk's first ever issue: #354.
- Inotify is now being used to detect additions to the consume directory thanks to some excellent work from erikarvstedt on #351
- You can now run Paperless without a login, though you'll still have to create at least one user. This is thanks to a pull-request from matthewmoto: #295. Note that logins are still required by default, and that you need to disable them by setting `PAPERLESS_DISABLE_LOGIN="true"` in your environment or in `/etc/paperless.conf`.
- Fix for #303 where sketchily-formatted documents could cause the consumer to break and insert half-records into the database breaking all sorts of things. We now capture the return codes of both `convert` and `unpaper` and fail-out nicely.
- Fix for additional date types thanks to input from Isaac and code from BastianPoe (#301).
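The #303 fix above boils down to checking the exit status of external tools before carrying on. A minimal sketch of that idea follows; `ConsumerError` and `run_tool` are hypothetical names, and the actual consumer code may handle failures differently.

```python
import subprocess


class ConsumerError(Exception):
    """Hypothetical exception raised when an external tool fails."""


def run_tool(command):
    """Run an external command and raise if it exits with a non-zero code."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Stop here so that no half-finished record gets written later on.
        raise ConsumerError(
            f"{command[0]} exited with code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout
```

A caller would wrap each `convert` or `unpaper` invocation in `run_tool(...)` and skip the document when `ConsumerError` is raised.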
- Fix for running migrations in the Docker container (#299). Thanks to Georgi Todorov for the fix (#300) and to Pit for the review.
- Fix for Docker cases where the issuing user is not UID 1000. This was a collaborative fix between Jeffrey Portman and Pit in #311 and #312 to fix #306.
- Patch the historical migrations to support MySQL's um, interesting way of handling indexes (#308). Thanks to Simon Taddiken for reporting the problem and helping me find where to fix it.
- New Docker image, now based on Alpine, thanks to the efforts of addadi and Pit. This new image is dramatically smaller than the Debian-based one, and it also has a new home on Docker Hub. A proper thank-you to Pit for hosting the image on his Docker account all this time, but after some discussion, we decided the image needed a more official-looking home.
- BastianPoe has added the long-awaited feature to automatically skip the OCR step when the PDF already contains text. This can be overridden by setting `PAPERLESS_OCR_ALWAYS=YES` either in your `paperless.conf` or in the environment. Note that this also means that Paperless now requires `libpoppler-cpp-dev` to be installed. Important: You'll need to run `pip install -r requirements.txt` after the usual `git pull` to properly update.
- BastianPoe has also contributed a monumental amount of work (#291) to solving #158: setting the document creation date based on finding a date in the document text.
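The OCR-skipping rule described above (skip OCR when the PDF already contains text, unless `PAPERLESS_OCR_ALWAYS` is set) can be sketched as a single decision function. The function name and the set of accepted "truthy" spellings are assumptions; only the variable name and the `YES` value come from this changelog.

```python
import os


def needs_ocr(embedded_text: str, environ=os.environ) -> bool:
    """Decide whether a PDF should go through OCR."""
    # Assumed set of spellings that switch the override on.
    override = environ.get("PAPERLESS_OCR_ALWAYS", "").lower()
    if override in ("yes", "y", "1", "true"):
        return True
    # Otherwise only OCR when the PDF yielded no usable text.
    return not embedded_text.strip()
```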
- Fix for #283, a redirect bug which broke interactions with paperless-desktop. Thanks to chris-aeviator for reporting it.
- Addition of an optional new financial year filter, courtesy of David Martin #256
- Fixed a typo in how thumbnails were named in exports #285, courtesy of Dan Panzarella
- Upgrade to Django 1.11. You'll need to run ``pip install -r requirements.txt`` after the usual ``git pull`` to properly update.
- Replace the templatetag-based hack we had for document listing with a slightly less ugly solution in the form of another template tag with less copypasta.
- Support for multi-word-matches for auto-tagging thanks to an excellent patch from ishirav #277.
- Fixed a CSS bug reported by Stefan Hagen that caused an overlapping of the text and checkboxes under some resolutions #272.
- Patched the Docker config to force the serving of static files. Credit for this one goes to dev-rke via #248.
- Fix file permissions during Docker start up thanks to Pit on #268.
- Date fields in the admin are now expressed as HTML5 date fields thanks to Lukas Winkler's issue #278
- Paperless can now run in a subdirectory on a host (`/paperless`), rather than always running in the root (`/`) thanks to maphy-psd's work on #255.
- Potentially breaking change: As per #235, Paperless will no longer automatically delete documents attached to correspondents when those correspondents are themselves deleted. This was Django's default behaviour, but didn't make much sense in Paperless' case. Thanks to Thomas Brueggemann and David Martin for their input on this one.
- Fix for #232 wherein Paperless wasn't recognising `.tif` files properly. Thanks to ayounggun for reporting this one and to Kusti Skytén for posting the correct solution in the Github issue.
- Abandon the shared-secret trick we were using for the POST API in favour of BasicAuth or Django session.
- Fix the POST API so it actually works. #236
- Breaking change: We've dropped the use of `PAPERLESS_SHARED_SECRET` as it was being used both for the API (now replaced with a normal auth) and for email polling. Now that we're only using it for email, this variable has been renamed to `PAPERLESS_EMAIL_SECRET`. The old value will still work for a while, but you should change your config if you've been using the email polling feature. Thanks to Joshua Gilman for all the help with this feature.
- Support for fuzzy matching in the auto-tagger & auto-correspondent systems thanks to Jake Gysland's patch #220.
- Modified the Dockerfile to prepare an export directory (#212). Thanks to combined efforts from Pit and Strubbl in working out the kinks on this one.
- Updated the import/export scripts to include support for thumbnails. Big thanks to CkuT for finding this shortcoming and doing the work to get it fixed in #224.
- All of the following changes are thanks to David Martin:
  - Bumped the dependency on pyocr to 0.4.7 so new users can make use of Tesseract 4 if they so prefer (#226).
  - Fixed a number of issues with the automated mail handler (#227, #228)
  - Amended the documentation for better handling of systemd service files (#229)
  - Amended the Django Admin configuration to have nice headers (#230)
- Fix for #206 wherein the pluggable parser didn't recognise files with all-caps suffixes like `.PDF`.
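One common way to avoid the all-caps suffix problem from #206 is to normalise the suffix before looking up a parser. The sketch below shows that approach; the `PARSERS` registry and `parser_for` are hypothetical and are not taken from the Paperless code base.

```python
import os

# Hypothetical registry mapping lower-case suffixes to parser names.
PARSERS = {
    ".pdf": "pdf",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".tif": "image",
    ".tiff": "image",
}


def parser_for(path: str):
    """Return the parser name for a file, ignoring the case of its suffix."""
    suffix = os.path.splitext(path)[1].lower()
    return PARSERS.get(suffix)
```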
- Introducing reminders. See #199 for more information, but the short explanation is that you can now attach simple notes & times to documents which are made available via the API. Currently, the default API (basically just the Django admin) doesn't really make use of this, but Thomas Brueggemann over at Paperless Desktop has said that he would like to make use of this feature in his project.
- Fix for #200 (!!) where the API wasn't configured to allow updating the correspondent or the tags for a document.
- The `content` field is now optional, to allow for the edge case of a purely graphical document.
- You can no longer add documents via the admin. This never worked in the first place, so all I've done here is remove the link to the broken form.
- The consumer code has been heavily refactored to support a pluggable interface. Install a paperless consumer via pip and tell paperless about it with an environment variable, and you're good to go. Proper documentation is on its way.
- A serious facelift for the documents listing page wherein we drop the tabular layout in favour of a tiled interface.
- Users can now configure the number of items per page.
- Fix for #171: Allow users to specify their own `SECRET_KEY` value.
- Moved the dotenv loading to the top of settings.py
- Fix for #112: Added checks for binaries required for document consumption.
- Removal of django-suit due to a licensing conflict I bumped into in 0.3.3. Note that you can use Django Suit with Paperless, but only in a non-profit situation as their free license prohibits for-profit use. As a result, I can't bundle Suit with Paperless without conflicting with the GPL. Further development will be done against the stock Django admin.
- I shrunk the thumbnails a little 'cause they were too big for me, even on my high-DPI monitor.
- BasicAuth support for document and thumbnail downloads, as well as the Push API thanks to @thomasbrueggemann. See #179.
- Thumbnails in the UI and a Django-suit-based face-lift courtesy of @ekw!
- Timezone, items per page, and default language are now all configurable, also thanks to @ekw.
- Fix for #172: defaulting ALLOWED_HOSTS to `["*"]` and allowing the user to set her own value via `PAPERLESS_ALLOWED_HOSTS` should the need arise.
- Added a default value for `CONVERT_BINARY`
- Updated to using django-filter 1.x
- Added some system checks so new users aren't confused by misconfigurations.
- Consumer loop time is now configurable for systems with slow writes. Just set `PAPERLESS_CONSUMER_LOOP_TIME` to a number of seconds. The default is 10.
- As per #44, we've removed support for `PAPERLESS_CONVERT`, `PAPERLESS_CONSUME`, and `PAPERLESS_SECRET`. Please use `PAPERLESS_CONVERT_BINARY`, `PAPERLESS_CONSUMPTION_DIR`, and `PAPERLESS_SHARED_SECRET` respectively instead.
- #150: The media root is now a variable you can set in `paperless.conf`.
- #148: The database location (sqlite) is now a variable you can set in `paperless.conf`.
- #146: Fixed a bug that allowed unauthorised access to the `/fetch` URL.
- #131: Document files are now automatically removed from disk when they're deleted in Paperless.
- #121: Fixed a bug where Paperless wasn't setting document creation time based on the file naming scheme.
- #81: Added a hook to run an arbitrary script after every document is consumed.
- #98: Added optional environment variables for ImageMagick so that it doesn't explode when handling Very Large Documents or when it's just running on a low-memory system. Thanks to Florian Harr for his help on this one.
- #89 Ported the auto-tagging code to correspondents as well. Thanks to Justin Snyman for the pointers in the issue queue.
- Added support for guessing the date from the file name along with the correspondent, title, and tags. Thanks to Tikitu de Jager for his pull request that I took forever to merge and to Pit for his efforts on the regex front.
- #94: Restored support for changing the created date in the UI. Thanks to Martin Honermeyer and Tim White for working with me on this.
- Potentially Breaking Change: All references to "sender" in the code have been renamed to "correspondent" to better reflect the nature of the property (one could quite reasonably scan a document before sending it to someone.)
- #67: Rewrote the document exporter and added a new importer that allows for full metadata retention without depending on the file name and modification time. A big thanks to Tikitu de Jager, Pit, Florian Jung, and Christopher Luu for their code snippets and contributing conversation that led to this change.
- #20: Added unpaper support to help in cleaning up the scanned image before it's OCR'd. Thanks to Pit for this one.
- #71 Added (encrypted) thumbnails in anticipation of a proper UI.
- #68: Added support for using a proper config file at `/etc/paperless.conf` and modified the systemd unit files to use it.
- Refactored the Vagrant installation process to use environment variables rather than asking the user to modify `settings.py`.
- #44: Harmonise environment variable names with constant names.
- #60: Setup logging to actually use the Python native logging framework.
- #53: Fixed an annoying bug that caused `.jpeg` and `.JPG` images to be imported but made unavailable.
- Docker support! Big thanks to Wayne Werner, Brian Conn, and Tikitu de Jager for this one, and especially to Pit who spearheaded this effort.
- A simple REST API is in place, but it should be considered unstable.
- Cleaned up the consumer to use temporary directories instead of a single scratch space. (Thanks Pit)
- Improved the efficiency of the consumer by parsing pages more intelligently and introducing a threaded OCR process (thanks again Pit).
- #45: Cleaned up the logic for tag matching. Reported by darkmatter.
- #47: Auto-rotate landscape documents. Reported by Paul and fixed by Pit.
- #48: Matching algorithms should do so on a word boundary (darkmatter)
- #54: Documented the re-tagger (zedster)
- #57: Make sure file is preserved on import failure (darkmatter)
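The word-boundary matching mentioned in #48 above can be illustrated with a regular expression anchored by `\b`. This is only a sketch: the function name is hypothetical, and case-insensitive matching is an assumption rather than something this changelog states.

```python
import re


def matches_on_word_boundary(keyword: str, text: str) -> bool:
    """Return True when keyword appears in text as a whole word."""
    # re.escape keeps any punctuation in the keyword from acting as regex syntax.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

With this rule a tag matching on "tax" fires for "Your tax return" but not for "taxonomy" or "syntax".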
- Added tox with pep8 checking
- Added support for parallel OCR (significant work from Pit)
- Sped up the language detection (significant work from Pit)
- Added simple logging
- Added support for image files as documents (png, jpg, gif, tiff)
- Added a crude means of HTTP POST for document imports
- Added IMAP mail support
- Added a re-tagging utility
- Documentation for the above as well as data migration
- Added automated tagging based on keyword matching
- Cleaned up the document listing page
- Removed `User` and `Group` from the admin
- Added `pytz` to the list of requirements
- Added basic tagging
- Added language detection
- Added datestamps to `document_exporter`.
- Changed `settings.TESSERACT_LANGUAGE` to `settings.OCR_LANGUAGE`.
- Initial release