Dealing with wordlist mails

Peter Bishop pgb at adelard.com
Wed Jan 28 20:41:18 CET 2004


> Anyone who uses -u should already be aware that not doing corrections
> will tend to screw up their database.  This is true for any min_dev less
> than like 0.9.  But in that case, what's the point of using -u?  Using
> bogofilter is not an automatic process.  No matter what, you have to do
> corrections, but especially if using -u.  

I agree.

In fact, I don't think that the default setting in the procmailrc 
example should have -u.

I think the recommendation should be:
- initial bulk training with a corpus
- followed by **manual** "train on error"

If the user forgets a manual update, it doesn't make the situation 
worse, unlike the -u case, where forgetting to correct a 
misclassification pollutes the database.

"Train on error" could be made easier with support scripts that let 
users submit -s and -n corrections through the mail server.

For example, a spam message is forwarded as an attachment to the 
user's account at a special sub-address:
 
"username+spam@company.com"

and the user's procmail script tests for:

"username+spam"

extracts the attachment and invokes "bogofilter -s"

Similarly:
"username+ham" invokes "bogofilter -n"

And maybe something for fixing any errors in this process:

"username+ham2spam" invokes "bogofilter -Ns"
"username+spam2ham" invokes "bogofilter -Sn"
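
The scheme above could be sketched as a small helper that procmail pipes 
the incoming mail into. This is only an illustration, not an existing 
bogofilter script: the names flag_for, attached_messages and train are 
hypothetical, and it assumes the forwarded mail arrives as a 
message/rfc822 part directly under the top-level message. The bogofilter 
flags are the ones listed above.

```python
import email
import subprocess
from email import policy

# Sub-address suffix -> bogofilter registration flags (from the scheme above).
ACTIONS = {
    "spam": "-s",
    "ham": "-n",
    "ham2spam": "-Ns",
    "spam2ham": "-Sn",
}


def flag_for(recipient):
    """Return the bogofilter flag for 'user+suffix@domain', or None."""
    local = recipient.split("@", 1)[0]
    if "+" not in local:
        return None
    suffix = local.split("+", 1)[1]
    return ACTIONS.get(suffix)


def attached_messages(raw_bytes):
    """Yield each top-level message/rfc822 attachment as bytes."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.iter_parts():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the forwarded message.
            yield part.get_payload(0).as_bytes()


def train(recipient, raw_bytes, run=subprocess.run):
    """Feed every attached message to bogofilter; return how many were sent.

    'run' is injectable (an assumption of this sketch) so the logic can be
    exercised without bogofilter installed.
    """
    flag = flag_for(recipient)
    if flag is None:
        return 0
    count = 0
    for original in attached_messages(raw_bytes):
        run(["bogofilter", flag], input=original, check=True)
        count += 1
    return count
```

A procmail recipe matching "username+spam" would then pipe the mail to a 
wrapper that calls train() with the recipient address and the raw message 
read from standard input. Unknown suffixes are ignored rather than 
guessed at, so a mistyped sub-address never touches the database.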



-- 
Peter Bishop 
pgb at adelard.com
pgb at csr.city.ac.uk





