filter evasion

David Relson relson at osagesoftware.com
Fri Nov 7 18:26:02 CET 2003


On Fri, 7 Nov 2003 09:20:57 -0600
John McCain <jmccain at layer3al.com> wrote:

> On Friday 07 November 2003 07:21 am, Eric Wood wrote:
> > This is something I put in just before the call to bogofilter:
> >
> > Now, I guess it would be better if you did this:
> > * ? lynx -dump -stdin | bogofilter -u
> > instead of:
> > * ? lynx -dump -stdin | grep -i -f /etc/vmail/spam_words
> >
> 
> That works pretty well, although I think the lexer patch David just
> posted may duplicate the effect.
> 
> Now what do we do about the font=white words?

John,

I don't have a good answer for that, at present.  Since bogofilter is
scoring the innards of <font> tags, it has _some_ info on the ruse.

Using font=white seems like another form of having text/plain and
text/html parts with totally different information.  A couple of ways
have been suggested for dealing with this situation.  One is to ignore
text/plain when text/html is available (on the theory that the "real"
message is in the html).  Another way is to score each part and pick
one.  It's been suggested that the right one to use is the one with
score furthest from 0.5.  A different approach would be to pick the
spammiest one.
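
To make the difference between the last two selection rules concrete,
here is a rough sketch in Python.  It is not bogofilter code: the
function names and the per-part scores are made up for illustration,
and it assumes each MIME part has already been scored separately on a
0.0 to 1.0 scale where 0.5 is neutral.

```python
from typing import Dict, Tuple

# Assumption: 0.5 is the neutral score (no evidence either way).
NEUTRAL = 0.5


def pick_furthest_from_neutral(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (part, score) pair whose score is furthest from 0.5."""
    return max(scores.items(), key=lambda item: abs(item[1] - NEUTRAL))


def pick_spammiest(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (part, score) pair with the highest score."""
    return max(scores.items(), key=lambda item: item[1])


# Hypothetical scores for a message whose text/plain part looks innocent
# while its text/html part carries the real payload.
part_scores = {"text/plain": 0.10, "text/html": 0.85}

furthest = pick_furthest_from_neutral(part_scores)
spammiest = pick_spammiest(part_scores)
```

With these made-up numbers the two rules disagree: the text/plain part
is 0.40 away from neutral and the text/html part only 0.35, so
"furthest from 0.5" selects the innocent-looking part, while
"spammiest" selects the html part.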

Anyhow, that's all at some point in the indefinite future --- unless
someone with more time than I wants to do it :-)

David



