FFBMS

Firefox Bookmark Scraper

Get all the links from a Firefox bookmarks.html file and put them in a .txt file.

Why?

I needed all 296 links in my Firefox bookmarks so that another script could interact with them, and I wanted them in an organized, line-by-line document. So I made this to do exactly that.

How to

  • Rip all URLs from a specific file
  • ~ $ ./ffbms.py [BOOKMARKS_FILE]

    Execute the script with the path to your bookmarks file as an argument.
    (Here, bookmarks.html is in the same directory as the script.)

  • Rip URLs for specific sites from a specific file
  • ~ $ ./ffbms.py [BOOKMARKS_FILE] [SITE] [SITE1] [SITE2] [SO_ON_AND_SO_FORTH...]

    Example

    ~ $ ./ffbms.py bookmarks.html reddit.com wallhaven.cc wikipedia.com

    Just like above, but add your site(s) as arguments after the bookmarks path.

  • Running the script without passing a file or site arguments
  • ~ $ ./ffbms.py

    If no arguments are passed, the script assumes the bookmarks file is named 'bookmarks.html' and sits in the same directory as the script, and then runs. This is so that you can run the executable on Windows by double-clicking the .exe file. Note that you can't pass site arguments this way; run the script from CMD or PowerShell to do that. A rough sketch of this argument handling is shown after this list.

    The script then generates a 'scraped.txt' with the links found in the bookmarks file.
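
    For illustration, here is a minimal sketch of how this argument handling could be implemented. It is an assumption based on the behaviour described above, not the actual ffbms.py source; the function name parse_args is made up.

    #!/usr/bin/env python3
    # Minimal sketch of the argument handling described above.
    # NOTE: illustration only, not the actual ffbms.py source.
    import os
    import sys

    def parse_args(argv):
        """Return (bookmarks_path, site_filters) from the command line."""
        if len(argv) < 2:
            # No arguments: assume 'bookmarks.html' next to the script,
            # so the Windows .exe can be run by double-clicking it.
            script_dir = os.path.dirname(os.path.abspath(argv[0]))
            return os.path.join(script_dir, "bookmarks.html"), []
        # First argument is the bookmarks file, the rest are site filters.
        return argv[1], argv[2:]

    if __name__ == "__main__":
        path, sites = parse_args(sys.argv)
        print("Bookmarks file:", path)
        print("Site filters:", sites or "none (scrape everything)")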


    DISCLAIMER

    The script only works on URLs (in the bookmarks file) that have one of the following schemes/prefixes:

    "https://", "http://", "https://www.", "http://www."

    This is because of how the seeker works. It uses those prefixes to check whether the line it is currently reading contains a URL. When you pass one or more sites as arguments, the script prepends those prefixes to each site to match every scheme the site might use. Some sites use "http://" instead of "https://", and some don't use the "www" subdomain, so it is necessary to check all possible combinations. A sketch of this matching logic is shown below.

    Be aware that some bookmarks might not be detected because of this, for example "localhost", or your router IP ("192.168.0.1" or similar).
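
    The matching itself could look roughly like the sketch below. This is a hedged illustration assuming the seeker scans each line of the exported bookmarks.html for HREF attributes and keeps URLs that start with one of the prefixes (or prefix + site combinations); the real ffbms.py may differ, and the function name scrape is made up.

    # Illustrative sketch of the prefix-based matching described above.
    # NOT the actual ffbms.py implementation; names are made up.
    import re

    PREFIXES = ("https://", "http://", "https://www.", "http://www.")

    def scrape(bookmarks_path, site_filters, out_path="scraped.txt"):
        """Write every matching bookmark URL to out_path, one per line."""
        # With site filters, build every prefix + site combination to match;
        # without them, accept any URL starting with a known prefix.
        if site_filters:
            needles = [p + s for p in PREFIXES for s in site_filters]
        else:
            needles = list(PREFIXES)

        found = []
        with open(bookmarks_path, encoding="utf-8") as f:
            for line in f:
                # Firefox's exported bookmarks.html stores links as HREF="..."
                for url in re.findall(r'HREF="([^"]+)"', line, flags=re.IGNORECASE):
                    if any(url.startswith(n) for n in needles):
                        found.append(url)

        with open(out_path, "w", encoding="utf-8") as out:
            out.write("\n".join(found) + "\n")
        return found

    For example, scrape("bookmarks.html", ["reddit.com"]) would write only Reddit bookmarks to scraped.txt, while scrape("bookmarks.html", []) would write every http(s) bookmark it finds.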
