Bogofilter accuracy plummets starting around March 10, 2010

Dmitry vdb at mail.ru
Tue Apr 13 02:23:47 CEST 2010


David Relson wrote:
 > "Invisible headers" is not a term I recognize.

Headers like "To, Cc, From, Subject, Date" are visible in almost all
MUAs. Everything else is usually invisible. When you allow bogofilter to
process invisible headers, you pollute the database with random data
and make spam detection inaccurate. A real spammer has two ways to
defeat a spam filter: 1) forge headers the same way known mail user
agents generate them; or 2) put random headers in each message. Either
way, processing those headers does more harm than good for spam
detection.

1) You allow random data

> pipe your message through an appropriate 'egrep -v "^(this|or|that):"'
> command.

Send millions of messages through a pipe with egrep? No, thanks. It is
an unnecessary load on the server.
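
For reference, the pipe approach under discussion could be sketched roughly as below. This is only an illustration under stated assumptions: the function name strip_headers and the header names in the pattern are hypothetical placeholders, not anything bogofilter provides. Unlike a plain egrep -v over the whole message, the sketch touches only the header block, so body lines that happen to look like headers are left alone. It does not address the per-message process cost raised above.

```shell
# Hypothetical helper: drop selected headers (and their folded
# continuation lines) from a message on stdin, leaving the body as-is.
# The header names in the pattern are placeholders; substitute your own.
strip_headers() {
    awk '
        BEGIN { in_headers = 1; skip = 0 }
        # The first empty line ends the header block.
        in_headers && /^$/ { in_headers = 0 }
        in_headers {
            if ($0 ~ /^[ \t]/) {
                # A folded continuation line shares the fate of its header.
                if (skip) next
            } else {
                skip = (tolower($0) ~ /^(x-mailer|received|message-id):/)
                if (skip) next
            }
        }
        { print }
    '
}

# Possible usage (assumes the message is in a file named message.eml):
#   strip_headers < message.eml | bogofilter
```

The filtered message is written to stdout, so it can be fed to bogofilter on stdin in place of the original.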

-- 
Dmitry


