Keeping the cruft out (was Re: no To: header in emails)
Eric Wood
eric at interplas.com
Wed Mar 3 16:29:12 CET 2004
Bob George wrote:
> I'm going to some lengths to avoid cruft in bayes as well:
I know there are mailing list purists here who would like to move this
discussion to the procmail mailing list (which I'm also a member of), but I
believe there should be a little more maildrop/procmail help on the
bogofilter website to get people started in the right direction.
Bogofilter is great but it's time to polish the package a little bit. For
example,
# Expand tabs to spaces and strip HTML comments that are used to split up words
:0 HB
* < 20000
* text/html
{
  :0fwb
  | expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}
By not splitting up words, I feel this will help bogofilter down the
road by not creating unnecessary tokens. But someone may find that this kind of
"cruft-cleaning" doesn't help bogofilter at all.
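
To try the filter stage outside procmail, the same pipeline can be wrapped
in a small shell function and fed sample text. This is only a sketch: the
name strip_html_comments is made up for illustration, and it assumes GNU sed
(other seds may treat N on the last input line differently). Two things
worth knowing: expand turns tabs into spaces rather than deleting them, and
because of the [^-]* part, a comment whose body contains a hyphen is left in
place.

```shell
#!/bin/sh
# Hypothetical helper: the filter stage of the recipe, reading stdin.
# Assumes GNU sed; the name is illustrative only.
strip_html_comments() {
    # expand: convert tabs to spaces.
    # sed: delete <!-- ... --> comments (bodies without '-'), pulling in
    # the next line with N while a '<' remains so that comments spanning
    # several lines can still be matched.
    expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}

# A word split by a comment on a single line is rejoined.
printf 'Vi<!-- x -->agra\n' | strip_html_comments    # prints "Viagra"

# A comment spanning two lines is removed as well.
printf 'fr<!-- a\nb -->ee\n' | strip_html_comments   # prints "free"

# A comment with a hyphen in its body is NOT removed.
printf 'a<!-- b-c -->d\n' | strip_html_comments
```

If comments containing hyphens turn out to be common in spam, the pattern
would need to be loosened; that is left open here.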
-eric wood