Releases: tmcw/notfoundbot
Update to Node 20
Node 16 actions are deprecated, so this release updates to Node 20. Thanks @parkr for the PR!
Additional markdown extensions
This release updates notfoundbot to include all of the additional extensions for Markdown files that Jekyll supports: md, markdown, mkdown, mkdn, and mkd.
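As a rough illustration of what that extension list means in practice, here is a minimal Python sketch of a file filter. It is not notfoundbot's own code; the names `MARKDOWN_EXTENSIONS` and `is_markdown_file` are hypothetical, and the case-insensitive match is an assumption.

```python
from pathlib import PurePosixPath

# Extensions named in the release note above.
MARKDOWN_EXTENSIONS = {".md", ".markdown", ".mkdown", ".mkdn", ".mkd"}


def is_markdown_file(path: str) -> bool:
    """Return True if the path ends in one of the listed Markdown extensions.

    Matching is case-insensitive here, which is an assumption of this sketch.
    """
    return PurePosixPath(path).suffix.lower() in MARKDOWN_EXTENSIONS
```

With this filter, a post saved as `_posts/hello.mkdn` would be picked up, while `index.html` would be ignored.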
Configurable content directory
This release makes the content directory notfoundbot looks in configurable. Thanks @mattdsteele for this improvement!
Added whitelist
This release includes a contribution from @steveoh: you can now list domains that you don't want notfoundbot to check, and it will skip them. Set them in the EXCEPTIONS environment variable, separated by spaces:
name: notfoundbot
on:
  schedule:
    - cron: "0 5 * * *"
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Fix links
        uses: tmcw/[email protected]
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          EXCEPTIONS: www.host.com thisisok.org
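To make the skip behaviour concrete, the following Python sketch shows one way a space-separated EXCEPTIONS value could be applied to a link. It is an illustration only, not notfoundbot's implementation: `parse_exceptions` and `should_skip` are hypothetical names, and exact hostname matching (no subdomain wildcards) is an assumption.

```python
from urllib.parse import urlsplit


def parse_exceptions(raw: str) -> set:
    """Split a whitespace-separated EXCEPTIONS value into a set of hostnames."""
    return {host.lower() for host in raw.split()}


def should_skip(url: str, exceptions: set) -> bool:
    """Return True if the URL's hostname is listed exactly (assumed rule)."""
    host = urlsplit(url).hostname  # already lowercased by urlsplit
    return host is not None and host in exceptions
```

Given the value from the workflow above, `https://www.host.com/page` would be skipped and `https://example.com/` would still be checked.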
Relaxed SSL verification
This release fixes a mismatch between the SSL certificates that Node.js thinks are valid and the SSL certificates that Chrome and Firefox accept. This makes notfoundbot more liberal in what it considers a functional HTTPS website, in order to cut down on sites falsely reported as offline.
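The release note does not say exactly how verification was relaxed. As one possible reading, here is a Python sketch of a link checker's TLS context with certificate verification switched off entirely, which is the bluntest option and an assumption of this example. `relaxed_ssl_context` is a hypothetical name.

```python
import ssl


def relaxed_ssl_context() -> ssl.SSLContext:
    """Build a TLS context that accepts any certificate (assumed behaviour)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

A checker could pass this to `urllib.request.urlopen(url, context=relaxed_ssl_context(), timeout=10)`. That trade-off suits a tool that only asks "does this site still respond?", but not code that sends sensitive data.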
Two more fixes
This fixes two more issues with notfoundbot:
- Previously, the recommended changes would add a spurious newline after replaced links. They don't anymore, which should lead to smoother commits.
- This also automatically links to the HTTPS version of archive.org URLs rather than the HTTP version.
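The second fix can be sketched as a small URL rewrite. This Python example is illustrative rather than notfoundbot's actual code; `prefer_https_archive` is a hypothetical name, and treating every `archive.org` subdomain the same way is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def prefer_https_archive(url: str) -> str:
    """Rewrite http:// archive.org links to https://, leaving others alone."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    is_archive = host == "archive.org" or host.endswith(".archive.org")
    if parts.scheme == "http" and is_archive:
        # Only the scheme changes; path, query and fragment are preserved.
        return urlunsplit(parts._replace(scheme="https"))
    return url
```

Only the outer scheme changes, so an archived `http://` address embedded in the path is left untouched.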
A fix and some refactoring
The main impetus for this release is a fix for how notfoundbot parses post frontmatter. Jekyll supports YAML frontmatter, and allows you to specify the same key (or property) in that YAML file multiple times. So a post could have frontmatter like this:
title: Hi
title: Hey
The YAML parser that notfoundbot uses did not allow this and would crash if it encountered duplicate keys. This new release configures that parser to allow duplicate keys, thus fixing compatibility with Jekyll sites.
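To show what tolerating duplicate keys looks like, here is a deliberately tiny Python sketch that reads flat `key: value` frontmatter and lets a later key overwrite an earlier one. It is not a YAML parser and not what notfoundbot uses; `parse_flat_frontmatter` is a hypothetical name, and the last-value-wins rule is an assumption of this example.

```python
def parse_flat_frontmatter(text: str) -> dict:
    """Parse flat `key: value` lines; a repeated key overwrites the earlier one."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the --- delimiters, and comment lines.
        if not line or line == "---" or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        result[key.strip()] = value.strip()  # duplicates: last one wins
    return result
```

For the frontmatter above, this yields a single `title` entry with the value `Hey` instead of raising an error.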
Notfoundbot's first release!
I've been testing notfoundbot, a tool for finding and automatically updating dead links in old blog content. I referred to it before as linkrot, but it deserved a rename, and needed one because there's already a user on here named linkrot.
This is an action that checks for broken links and links to HTTP URLs, and upgrades them to HTTPS or relinks them to archives in the Wayback Machine.
I have it configured on my site a bit like this:
name: notfoundbot
on:
  schedule:
    - cron: "0 5 * * *"
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Fix links
        uses: tmcw/[email protected]
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
This'll run once a day at 05:00 UTC (GitHub Actions schedules use UTC), and create a PR with fixed links that you can merge if you want to!
First pre-release
This is the first cut of linkrot, reborn as a GitHub Action and simplified for clarity.