Filters that Fight Back

michael at optusnet.com.au
Tue Aug 12 00:33:01 CEST 2003


Matthias Andree <matthias.andree at gmx.de> writes:
> David Relson <relson at osagesoftware.com> writes:
> 
> > Hi y'all,
> >
> > A new article by Paul Graham "Filters that Fight Back",
> > http://www.paulgraham.com/ffb.html
> >
> > 'Tis good reading.
> 
> So should we drop the "minimum token size" limit to deal with " B R O K
> E N   U P " tokens?
> 
> One other thing that we should try is tagging all header tokens (rather
> than just subject) with a h: so we can distinguish header from body in
> the data base. I'm wondering if we should also tag the body for
> consistency, or just introduce an additional first character. (We

No, just leave the body tokens untagged. You just want the header
to be different from the body, and paying an extra 2 bytes on every
body token is probably going to be a bit much... :)
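
To make that concrete, here is a rough sketch of what I mean: header
tokens get the "h:" prefix, body tokens go in as they are. Only the
"h:" prefix comes from this thread. The token pattern, the function
name and the split at the first blank line are assumptions made for
illustration, not bogofilter's actual lexer.

```python
import re

HEADER_TAG = "h:"  # prefix proposed in the thread

# Hypothetical token pattern, chosen only for this sketch.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tag_tokens(message):
    """Return the tokens of a raw message, tagging header tokens only."""
    # Headers and body are separated by the first blank line.
    header, _, body = message.partition("\n\n")
    header_tokens = [HEADER_TAG + t.lower() for t in TOKEN_RE.findall(header)]
    # Body tokens stay untagged, so they cost no extra bytes in the word list.
    body_tokens = [t.lower() for t in TOKEN_RE.findall(body)]
    return header_tokens + body_tokens
```

With this, "cheap" in a Subject line is stored as "h:cheap" and counted
separately from "cheap" in the body.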

> clearly need a .FORMAT token in the data base so we can tell
> bogoutil/bogoupgrade what to do, we'd use 1 for all past formats, and
> bump to 2 as we tag header/body.)

Sounds like a good idea. Q: What should bogofilter do when it sees
a format different from the one it expects?
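
One possible answer, purely as a sketch: refuse to touch the word list
and tell the user to upgrade it. The ".FORMAT" key and the values 1
and 2 come from the proposal quoted above. The dict standing in for
the data base, the function name and the exception are hypothetical,
and this is not what bogofilter currently does.

```python
FORMAT_KEY = ".FORMAT"
LEGACY_FORMAT = 1  # "1 for all past formats", per the proposal
TAGGED_FORMAT = 2  # bumped once header tokens are tagged


class FormatMismatch(Exception):
    """Raised when the word list format is not the expected one."""


def check_format(db, expected=TAGGED_FORMAT):
    """Return the format stored in db, or raise if it is not `expected`."""
    # A word list without the key predates the format token: treat it as 1.
    found = int(db.get(FORMAT_KEY, LEGACY_FORMAT))
    if found != expected:
        raise FormatMismatch(
            "word list is format %d, expected %d; upgrade it before use"
            % (found, expected)
        )
    return found
```

Failing loudly seems safer than guessing: scoring against a word list
in the wrong format would mix tagged and untagged counts without
anyone noticing.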



