html processing [was: Wanting a pre-db4 bogofilter]

Tom Anderson tanderso at oac-design.com
Tue Mar 8 02:21:37 CET 2005


On Mon, 2005-03-07 at 19:27, David Relson wrote:
> Bogofilter removes html comments, so "th<!--comment-->is" becomes
> "this" and "bef<font>aft" becomes "befaft".  The inclusion of spaces,
> i.e. "th <!--comment--> is", would result in two 2-character fragments.
> 
> I'd have to see a sample of the table/array to determine why bogofilter
> isn't doing what's wanted.

It really doesn't matter.  Just train on it and the fragments will
become spammy soon enough.
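
For anyone who wants to see the effect David describes, here is a rough
Python sketch. It is not bogofilter's actual lexer; the regexes and the
names strip_markup and tokens are made up for illustration, and it assumes
that markup is simply deleted with no whitespace put in its place.

```python
import re

# Illustrative patterns only (assumptions, not bogofilter's real rules).
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # HTML comments
TAG_RE = re.compile(r"<[^>]*>")                    # any remaining tag
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")             # crude word tokens


def strip_markup(text):
    """Delete comments, then tags, without inserting any whitespace."""
    return TAG_RE.sub("", COMMENT_RE.sub("", text))


def tokens(text):
    """Split the stripped text into alphanumeric tokens."""
    return TOKEN_RE.findall(strip_markup(text))


print(tokens("th<!--comment-->is"))    # ['this']
print(tokens("bef<font>aft"))          # ['befaft']
print(tokens("th <!--comment--> is"))  # ['th', 'is']
```

The third case is the one with the spaces: the comment goes away, but the
spaces stay, so the word comes out as two short fragments that can be
trained on like any other token.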

Tom


_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter


