Keeping the cruft out (was Re: no To: header in emails)

Eric Wood eric at
Wed Mar 3 16:29:12 CET 2004

Bob George wrote:
> I'm going to some lengths to avoid cruft in bayes as well:

I know there are mailing list purists here who would like to move this
discussion to the procmail mailing list (which I'm also a member of), but I
believe there should be a little more maildrop/procmail help on the
bogofilter website to get people started in the right direction.
Bogofilter is great, but it's time to polish the package a little bit.  For
example:

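# Expand useless TABs and strip HTML comments, both used to split up words.
# f = run the pipe as a filter, so the cleaned message continues on to later
#     recipes instead of being delivered to (and swallowed by) the pipe
# w = wait for the filter to finish and check its exit code
:0 HBfw
* < 20000
* text/html
  | expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'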

By keeping words from being split up, I feel this will help bogofilter down
the road by avoiding unnecessary tokens.  But someone may find that this kind
of "cruft-cleaning" doesn't help bogofilter at all.
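
To see what the filter does to a word that has been split by a comment, the
same pipeline can be run by hand outside procmail.  This is only a rough
sketch: the function name strip_html_comments and the sample input are made up
for illustration, and the pipeline is the one from the recipe above, unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper around the pipeline used in the recipe:
# expand turns TABs into spaces, sed deletes <!-- ... --> comments,
# pulling in following lines while a "<" is still present so that
# comments spanning several lines are also matched.
strip_html_comments() {
    expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}

# A word broken up by an HTML comment (made-up sample input).
printf 'Via<!-- x9 -->gra\n' | strip_html_comments
# prints: Viagra
```

Note that because of the [^-]* part, a comment whose text itself contains a
hyphen is not matched by this expression and is left in place.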

-eric wood
