Convert Mozilla Archive Format files to compressed base64-encoded HTML framesets

ArneBachmann/MAFF2HTML

MAFF2HTML

This tool converts websites archived as MAFF (Mozilla Archive Format) files into single HTML files with base64-encoded contents, optionally compressed with LZMA or BZip2.

The HTML frames referenced by the MAFF's frameset, together with all resources nested within them (images, stylesheets, fonts and scripts), are inlined as base64-encoded data: URLs (!).
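
To illustrate the inlining idea, here is a minimal sketch of how a single resource could be turned into a base64-encoded data: URL. The helper name `to_data_url` and the MIME type guessing via the file name are assumptions made for this example and are not taken from maff2html.py.

```python
import base64
import mimetypes


def to_data_url(payload: bytes, filename: str) -> str:
    """Return a base64-encoded data: URL for the given resource bytes.

    Hypothetical helper for illustration; the real script may work differently.
    """
    # Guess the MIME type from the file name, falling back to a generic binary type.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"
    # Base64 output is plain ASCII, so it can be embedded directly in HTML attributes.
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


print(to_data_url(b"body { color: red }", "style.css"))
```

A reference such as `<img src="logo.png">` would then become `<img src="data:image/png;base64,...">`, which is why the resulting HTML file needs no companion files.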

This unique approach lets you keep all information of the MAFF in a single HTML file. I don't think this solution exists anywhere else.

Sadly, however, even modern (as of 2020) browsers struggle to load larger HTML documents with frames and inline data: they may render base64-encoded contents slowly, or crash with out-of-memory errors.

Still, I think this is a uniquely useful long-term solution for keeping your data without relying on browser plugins to display your archived web pages.

Enjoy!

Usage

The basic usage is:

python[3] maff2html.py [--compress [--lzma]] [--keep-timestamp]

This finds all *.maff files in the current folder and all of its sub-folders and converts each one into a *.maff.html, *.html.bz2 or *.html.xz file, depending on the options given, leaving the original *.maff file untouched.
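
The discovery step can be sketched roughly as follows. This is not the script's actual code: the function names `find_maff_files` and `read_members` are hypothetical, and the sketch relies only on the fact that a MAFF file is a ZIP container, so the standard `zipfile` module can read it in memory without unpacking to disk.

```python
import zipfile
from pathlib import Path


def find_maff_files(root: Path) -> list[Path]:
    """Return all *.maff files below root, including sub-folders, sorted by path."""
    return sorted(p for p in root.rglob("*.maff") if p.is_file())


def read_members(archive: Path) -> dict[str, bytes]:
    """Read every file stored in a MAFF (ZIP) archive into memory.

    Directory entries (names ending in "/") are skipped.
    The archive itself is only opened for reading and is left untouched.
    """
    with zipfile.ZipFile(archive) as zf:
        return {
            name: zf.read(name)
            for name in zf.namelist()
            if not name.endswith("/")
        }
```

Calling `find_maff_files(Path("."))` mirrors the "current folder and all sub-folders" behaviour described above, and `read_members` returns a mapping from member name to raw bytes that a converter could then inline.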

| Command-line options | Compression | File extension |
|----------------------|-------------|----------------|
| none                 | none        | .maff.html     |
| --compress           | BZIP2       | .html.bz2      |
| --compress --lzma    | LZMA        | .html.xz       |
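
The mapping in the table can be expressed with Python's standard `bz2` and `lzma` modules. The sketch below is an assumption about how such a step could look, not the script's implementation: `output_name` and `encode` are hypothetical names, and the exact way the output file name is derived (for example `page.maff` becoming `page.html.bz2`) is inferred from the extensions listed above.

```python
import bz2
import lzma
from pathlib import Path


def output_name(maff: Path, compress: bool = False, use_lzma: bool = False) -> Path:
    """Derive the output path for a *.maff file (naming scheme is an assumption)."""
    if not compress:
        # page.maff -> page.maff.html
        return maff.with_name(maff.name + ".html")
    # page.maff -> page.html.xz (LZMA) or page.html.bz2 (BZip2)
    suffix = ".html.xz" if use_lzma else ".html.bz2"
    return maff.with_name(maff.stem + suffix)


def encode(html: bytes, compress: bool = False, use_lzma: bool = False) -> bytes:
    """Compress the generated HTML according to the chosen options."""
    if not compress:
        return html
    if use_lzma:
        # FORMAT_XZ produces the .xz container that standard xz tools can read.
        return lzma.compress(html, format=lzma.FORMAT_XZ)
    return bz2.compress(html)
```

As in the usage line, `use_lzma` only has an effect together with `compress`. The compressed outputs can be unpacked again with standard tools such as `bunzip2` or `unxz` before opening them in a browser.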

Contributors

  • Pepe Pardo (pppardo): in-memory unzipping, encoding fix

TODO

  • add pyproject.toml with console_script and dependencies
  • optional libraries: filetype, python-magic
  • use pathlib instead of os.path
