BanBuilder is a PHP package for profanity filtering. The PHP script uses regex to intelligently look for "leetspeak"-style numeric or symbol replacements.
To install BanBuilder, simply include it in your project's composer.json:
"snipe/banbuilder": "dev-master",
There are no additional dependencies required for this package to work.
use Snipe\BanBuilder\CensorWords;
$censor = new CensorWords;
$badwords = $censor->setDictionary();
$string = $censor->censorString($yourstring,$badwords, '*');
This returns $string as an array, where you can access $string['clean'] for the cleaned version of $yourstring, or $string['orig'], which will give you the original $yourstring.
In a nutshell, this code takes an array of bad words and compares it to an array of common filter-evasion tactics. It then does a string replacement to insert regex parameters into your badwords array, and then evaluates your input string against that expanded banned-word list.
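The expansion step described above can be sketched roughly as follows. This is a minimal illustration, not BanBuilder's actual source: the contents of `$leetMap` and the `expandWord()` helper are hypothetical names introduced here, and the substitution sets are far smaller than a real map would be.

```php
<?php
// Hypothetical, reduced substitution map: each letter maps to a regex group
// listing the letter itself plus a few common stand-ins.
$leetMap = [
    'a' => '(a|a\.|a\-|4|@)',
    'i' => '(i|i\.|i\-|1|!)',
    's' => '(s|s\.|s\-|5|\$)',
];

/**
 * Expand one plain word into a case-insensitive regex pattern.
 * Letters that are missing from the map are matched literally.
 */
function expandWord(string $word, array $leetMap): string
{
    $pattern = '';
    foreach (str_split(strtolower($word)) as $char) {
        $pattern .= $leetMap[$char] ?? preg_quote($char, '/');
    }
    return '/' . $pattern . '/i';
}

// Expand every entry of the bad-word list into its own pattern.
$badwords = ['ass'];
$patterns = [];
foreach ($badwords as $word) {
    $patterns[] = expandWord($word, $leetMap);
}

print_r($patterns);
```

Each resulting pattern can then be handed to `preg_match()` or `preg_replace()` to test or clean an input string. Building one group per letter keeps the dictionary itself short, since the variants never have to be listed word by word.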
So in your bad words array, you might have:
[0] => 'ass'
The preg_replace functions replace all of the possible shenanigan letters with regex patterns (in lieu of adding the variants onto the end of the array), so the 'ass' in your array gets turned into this, right before the preg_replace checks for matches:
[0] => /(a|a\.|a\-|4|@|Á|á|À|Â|à|Â|â|Ä|ä|Ã|ã|Å|å|α)(s|s\.|s\-|5|\$|§)(s|s\.|s\-|5|\$|§)/i
This means that a word can contain no leet replacements, one, or any combination of them, and it will still trip the filter. The leet filter also accounts for letters followed by a dash or a dot.
This means that the following all evaluate to "bitch":
- B1tch
- bi7tch
- b.i.t.c.h.
- b-i-t-c-h
- b.1.t.c.h.
- ßitch
- and so on....
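To make the matching behaviour concrete, here is a small sketch that masks a few of the variants listed above. It is an illustration under stated assumptions rather than the package's own code: the pattern is written by hand in the same style as the expanded pattern shown earlier, it is ASCII-only, its substitution sets are hypothetical, and `maskMatches()` is a helper name invented for this example. Only three of the listed forms are exercised.

```php
<?php
// Hand-written, ASCII-only pattern in the style of the expanded pattern above.
// The substitution sets are hypothetical and much smaller than a real map.
$pattern = '/(b|b\.|b\-|8)(i|i\.|i\-|1|!)(t|t\.|t\-|7)(c|c\.|c\-)(h|h\.|h\-)/i';

// Replace every match with mask characters of the same length as the match.
function maskMatches(string $input, string $pattern, string $mask = '*'): string
{
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($mask) {
            return str_repeat($mask, strlen($m[0]));
        },
        $input
    );
}

// Three of the variants from the list above.
$variants = ['B1tch', 'b.i.t.c.h.', 'b-i-t-c-h'];
$results = [];
foreach ($variants as $variant) {
    $results[$variant] = maskMatches($variant, $pattern);
}

print_r($results);
```

In this sketch a trailing dot after the last letter is left in place, because the final group is already satisfied by the bare letter. Text that contains no match is returned unchanged.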
Copyright (C) 2013 Alison Gianotto
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.