Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
biff		biff
example		example
.gitignore		.gitignore
COPYING		COPYING
LICENSE		LICENSE
README.md		README.md
pdf-odt.png		pdf-odt.png
requirements.txt		requirements.txt

Repository files navigation

BIFF

Extract text and images from highlighted pdf generated with reMarkable tablet.

Version

Version 2

Add an option for two columns pdf.
Add an option to increase quality of cropped images.
Improvements on some artifacts (before the whole line was extracted when only a part of if was highlighted)
New Windows executable (v2)

Installation and usage

biff requires the following modules :

opencv-python
pymupdf
numpy
odfpy

biff needs Python 3

$ git clone https://github.com/soulisalmed/biff.git			
$ cd biff		
$ pip3 install -r requirements.txt					
$ python3 -m biff my_highlighted.pdf

Usage:

usage: __main__.py [-h] [-c] [-q QUALITY] [pdf [pdf ...]]

Extract highlighted text and framed images from PDF(s) generated with
reMarkable tablet to Openoffice text document. Highlighted text will be
exported as text. Framed areas will be cropped as images.

positional arguments:
  pdf                   PDF files

optional arguments:
  -h, --help            show this help message and exit
  -c, --two-columns     For two-columns pdf, parse colums from left to right
  -q QUALITY, --quality QUALITY
                        Quality of extracted images, default=2 higher values
                        for higher quality

For Windows users, you can download executables in releases. On the command line (cmd.exe):

biff_v2.0.exe my_highlighted.pdf

The output odt file will be placed in the same directory as the pdf.

Recommandations for pdf highlighting on the reMarkable tablet

On reMarkable, use the Highlighter. All the other tools will not (and should not) be detected by biff.
Make sure to cover all the text you want to extract. Partly covered text will not be extracted.
For figures, just draw a rectangle shape around it. The interior will be cropped and added as an image to the output odt.
Formulas will not be extracted as such, but you can export them as images (see example below).
The quality of the result will depend on the pdf quality. Automatically generated pdf from scans will give poor results.
Export the PDF using the reMarkable USB web interface (for example) and ./biff.py your_pdf.pdf

Enjoy and please send some feedback.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

BIFF

Version

Version 2

Installation and usage

Recommandations for pdf highlighting on the reMarkable tablet

About

Licenses found

Releases 3

Packages

Contributors 3

Languages

License

Licenses found

soulisalmed/biff

Folders and files

Latest commit

History

Repository files navigation

BIFF

Version

Version 2

Installation and usage

Recommandations for pdf highlighting on the reMarkable tablet

About

Resources

License

Licenses found

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 3

Languages

Packages