forked from snipe/banbuilder

Composer package for censoring profanity in web applications, forums, etc.

hibrid/banbuilder

BanBuilder Composer Package

BanBuilder is a PHP package for profanity filtering. It uses regular expressions to detect "leetspeak"-style numeric or symbol substitutions in banned words.

Installing

To install BanBuilder, add it to the require section of your project's composer.json:

"snipe/banbuilder": "dev-master",

There are no additional dependencies required for this package to work.

Usage

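require 'vendor/autoload.php'; // Composer autoloader; adjust the path to your project
use Snipe\BanBuilder\CensorWords;

// Load the bad-word dictionary, then censor $yourstring, masking matches with '*'.
$censor = new CensorWords;
$badwords = $censor->setDictionary();
$string = $censor->censorString($yourstring, $badwords, '*');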

censorString() returns an array: $string['clean'] holds the censored version of $yourstring, and $string['orig'] holds the original $yourstring.

Summary

In a nutshell, this code takes an array of bad words and compares it to an array of common filter-evasion tactics. It then uses string replacement to insert regex patterns into your badwords array, and evaluates your input string against that expanded banned-word list.
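The flow described above can be sketched in a few lines. This is a minimal illustration, not the package's actual source: the function name sketchCensor, the $leetMap table, and its two entries are assumptions made for the example, and the real substitution table is much larger.

```php
<?php
// Hypothetical, heavily abbreviated leet table; the real package covers far more.
$leetMap = [
    'a' => '(a|a\.|a\-|4|@)',
    's' => '(s|s\.|s\-|5|\$|§)',
];

/**
 * Sketch of the described flow: expand each bad word into a regex,
 * then mask every match in the input with the replacement character.
 */
function sketchCensor(string $input, array $badwords, array $leetMap, string $replacer = '*'): array
{
    $clean = $input;
    foreach ($badwords as $word) {
        // strtr() swaps each mapped letter for its alternation group in one pass,
        // without rescanning text it has already replaced.
        $pattern = '/' . strtr(preg_quote(strtolower($word), '/'), $leetMap) . '/iu';
        // Mask the match with one replacement character per letter of the bad word.
        $clean = preg_replace($pattern, str_repeat($replacer, strlen($word)), $clean);
    }
    // Mirror the 'orig' / 'clean' keys described in the Usage section.
    return ['orig' => $input, 'clean' => $clean];
}

$result = sketchCensor('What an @$$!', ['ass'], $leetMap);
echo $result['clean'], PHP_EOL; // prints "What an ***!"
```

Because the sketch matches anywhere in the input, a harmless word that contains a banned word (for example "class") would also be masked; adding word boundaries is one way to tighten that, at the cost of missing some evasions.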

So in your bad words array, you might have:

 [0] => 'ass'

The preg_replace functions replace all of the possible shenanigan letters with regex patterns (instead of appending every variant to the end of the array), so the 'ass' in your array gets turned into this right before preg_replace checks for matches:

 [0] => /(a|a\.|a\-|4|@|Á|á|À|Â|à|Â|â|Ä|ä|Ã|ã|Å|å|α)(s|s\.|s\-|5|\$|§)(s|s\.|s\-|5|\$|§)/i

Because every letter becomes a group of alternatives, a word can contain none, one, or any mix of leet replacements and still trip the trigger. The leet patterns also cover letters followed by a dot or a dash.
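To see the expansion in action, the following sketch rebuilds a shortened form of the pattern above and tests a few inputs against it. The $leetMap table is a hypothetical, trimmed-down stand-in (accented forms are left out), and strtr() is used here for the substitution as an assumption, not as a claim about how the package does it.

```php
<?php
// Hypothetical, abbreviated leet table (the full pattern also lists accented forms).
$leetMap = [
    'a' => '(a|a\.|a\-|4|@)',
    's' => '(s|s\.|s\-|5|\$|§)',
];

// Build the expanded pattern for one bad word.
$pattern = '/' . strtr('ass', $leetMap) . '/iu';
echo $pattern, PHP_EOL;

// No replacements, one replacement, several replacements, dots, dashes,
// and finally a string that should not match.
foreach (['ass', '4ss', '@$5', 'a.s.s', 'A-S-S', 'axs'] as $candidate) {
    $hit = preg_match($pattern, $candidate) === 1 ? 'match' : 'no match';
    echo str_pad($candidate, 8), $hit, PHP_EOL;
}
```

Every candidate except the last one matches, since each position accepts the plain letter, the letter plus a dot or dash, or one of the listed symbols.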

For example, the following all match the word "bitch":

  • B1tch
  • bi7ch
  • b.i.t.c.h.
  • b-i-t-c-h
  • b.1.t.c.h.
  • ßitch
  • and so on....
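A quick way to check variants like these is to run them through an expanded pattern. The sketch below does that for four of the listed spellings. Its $leetMap table is a hypothetical, abbreviated one that covers only the letters of the example word, so it is smaller than the package's table and is not used to test every variant in the list.

```php
<?php
// Hypothetical, abbreviated leet table covering only the letters of the example word.
$leetMap = [
    'b' => '(b|b\.|b\-|8|ß)',
    'i' => '(i|i\.|i\-|1|!)',
    't' => '(t|t\.|t\-|7)',
    'c' => '(c|c\.|c\-)',
    'h' => '(h|h\.|h\-)',
];

// Expand the bad word into one alternation group per letter.
$pattern = '/' . strtr('bitch', $leetMap) . '/iu';

// Four of the spellings from the list above.
$variants = ['B1tch', 'b.i.t.c.h.', 'b-i-t-c-h', 'ßitch'];
$matched = array_filter($variants, fn ($v) => preg_match($pattern, $v) === 1);

echo count($matched), ' of ', count($variants), ' variants matched', PHP_EOL;
```

The u modifier is included so that multibyte characters such as ß are treated as single characters; it assumes the source file and the input are valid UTF-8.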


License

Copyright (C) 2013 Alison Gianotto - [email protected]

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.
