Recent Activity

20190810 - Created pdfcrack.py

Even though the capability exists to attack large keyspace, I don't recommend that with this tool.
- This is after all, only a python script. :)
Without commandline switches, script will default to a maximum password of 4 digits. This is the recommended attack for inline analysis.
- You don't want mail delivery waiting on this to finish cracking huge keyspace for every PDF it sees.

History

20190723 - Moved crypto functions into their own class

I wanted the crypto functions separated because I have plans to use the library for my own pdf cracker. :)

20180801 - A couple things:

Added support for LZWDecode filter
- Borrowed code from pdfminer. It's efficient, and until I can write my own (because I like doing that sort of thing) this will be used.
Started implemeting the malware index checks.
- Checking for unecessary whitespace within object definitions
- Checking for unecessary encoding of chars in obect definitions

20180722 - Fixed Crypt filter logic

Discovered that the encryption key is calculated differently when a Crypt filter is used.
- https://forums.adobe.com/thread/1938926

20180601 - Added decryption capabilities (Set version to 2.2.0)

Standard Security Handler Version 5; Revision 6
- Tested code on AESV3 with no issues using user passwords, owner passwords and blank passwords
Code will exit upon discovering encryption if pycrypto modules have not been loaded
- So please do 'pip install pycrypto'

20180526

20180515

Decodes some page text
- Introduced new stream type: pdf_mcid
- Works on most (hopefully), but not all.
Decrypt documents that require a password
- Added command line switch to accomodate this: -p
Added new freebase function that can convert a number of any base to any other base
- Used this to rework the ascii85 decoder which was flawed.
Various other cosmetic / logic issues

20180312 - New pydf.py argument

20180228 - Page tracking

Added info to the summary that notes what page a hyperlink occurred on
- Not a page numbers; a custom identifier to aide in deciding if a page has been modified
- This will be needed for the LaikaBOSS module for that smarter detection I mentioned in the wiki.

20180131 - Additional processing capabilities added.

20180103 - Added decryption capabilities

Standard Security Handler Version 1, 2 & 4; Revision 2, 3, & 4
- Tested code on CFM methods V2 and AESV2 (RC4 and AES respectively)
Code will exit upon discovering encryption if pycrypto modules have not been loaded
- So please do 'pip install pycrypto'

20171128 - Major rewrite of some core functionality

pydf2json now returns a tuple
- The JSON struct of the PDF
- A summary of data that can be used to create meaningful output to end user.
- An object map for easy access to various objects within the JSON struct.