spaced out spam words

Jason A. Smith jason-bf at jazbo.dyndns.org
Fri Jun 9 13:38:09 CEST 2006


On Fri, 2006-06-09 at 07:05, David Relson wrote:
> Correct!  You are showing the result of processing "For example here
> are ...".  I was showing some _examples_ of double-word tokens.
> 
> Now all I need is time to find my old patches, apply, and test them...

So this patch will handle the often-requested multi-word feature and
deal with these spaced-out words better.  Will it collapse/replace
runs of whitespace (spaces, tabs and maybe newlines) with '+' before
adding to/checking the database?  What about HTML spam that often places
tags, spaces & newlines between single letters, so that the message
still clearly shows the spammer's text when displayed by an HTML
viewer?  Will the bogofilter parser collapse whitespace, tags and
newlines, allowing it to combine the spaced-out words in HTML spam?
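
To make the question concrete, here is a rough sketch of the kind of
normalization I have in mind.  It is Python rather than bogofilter's C,
it is not taken from bogofilter or from the patch, and the function
names (join_multiword, collapse_spaced_letters) are made up for
illustration.  The tag matcher is deliberately naive and the
"three or more single letters" rule is only an assumed heuristic.

```python
import re

# Illustration only: none of this is bogofilter code.
TAG_RE = re.compile(r"<[^>]*>")   # naive HTML tag matcher (ignores comments, scripts, etc.)
WS_RE = re.compile(r"\s+")        # any run of spaces, tabs or newlines
# Assumed heuristic: three or more single letters, each separated by one space.
SPACED_RE = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")


def join_multiword(text):
    """Replace each run of whitespace with a single '+'."""
    return WS_RE.sub("+", text.strip())


def collapse_spaced_letters(html):
    """Drop tags, squeeze whitespace, then merge runs of single letters."""
    text = TAG_RE.sub(" ", html)          # a tag separates text like a space would
    text = WS_RE.sub(" ", text).strip()   # tabs/newlines/multiple spaces -> one space
    return SPACED_RE.sub(lambda m: m.group(0).replace(" ", ""), text)


if __name__ == "__main__":
    print(join_multiword("For  example\there"))
    print(collapse_spaced_letters("low <b>m</b> <i>o</i>\nr t<br>g a g e rates"))
```

A heuristic like this would also merge legitimate sequences of single
letters, so whether it belongs in the lexer at all is part of the
question.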

~Jason



