Dealing with wordlist mails
Peter Bishop
pgb at adelard.com
Wed Jan 28 20:41:18 CET 2004
> Anyone who uses -u should already be aware that not doing corrections
> will tend to screw up their database. This is true for any min_dev less
> than like 0.9. But in that case, what's the point of using -u? Using
> bogofilter is not an automatic process. No matter what, you have to do
> corrections, but especially if using -u.
I agree.
In fact, I don't think the procmailrc example should use -u by
default.
I think the recommendation should be:
- initial bulk training with a corpus
- followed by **manual** "train on error"
If the user forgets a manual update, the situation gets no worse.
With -u, by contrast, forgetting to correct a misclassified message
leaves the wrong registration in place and pollutes the database.
"Train on error" could be made easier with support scripts that let
users send -s and -n submissions to the mail server by mail.
For example, a spam message is forwarded as an attachment to the
user's own account at a special sub-address:
"username+spam at company.com"
and the user's procmail script tests for:
"username+spam"
extracts the attachment and invokes "bogofilter -s"
Similarly:
"username+ham" invokes "bogofilter -n"
And maybe something for fixing any mistakes made in this process:
"username+ham2spam" invokes "bogofilter -Ns"
"username+spam2ham" invokes "bogofilter -Sn"
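
A rough sketch of what such a support script might look like, in Python
rather than as a procmail recipe. Everything named here (ACTIONS,
action_for, attached_messages, register) is hypothetical and not part of
bogofilter. It assumes the forwarded mail arrives as a message/rfc822
attachment, that bogofilter is on the PATH, and that bogofilter reads the
message to register from standard input.

```python
import email
import subprocess
from email import policy

# Hypothetical mapping from sub-address suffix to the bogofilter flags
# proposed above.
ACTIONS = {
    "spam": "-s",       # register as spam
    "ham": "-n",        # register as ham
    "ham2spam": "-Ns",  # undo a ham registration, register as spam
    "spam2ham": "-Sn",  # undo a spam registration, register as ham
}


def action_for(recipient):
    """Return the bogofilter flag for 'user+suffix@domain', or None."""
    local = recipient.split("@", 1)[0]
    if "+" not in local:
        return None
    suffix = local.split("+", 1)[1].lower()
    return ACTIONS.get(suffix)


def attached_messages(raw_bytes):
    """Yield each forwarded message (message/rfc822 part) as bytes.

    Parts inside an attached message are not searched, so only the
    messages the user actually forwarded are returned.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    stack = [msg]
    while stack:
        part = stack.pop()
        if part.get_content_type() == "message/rfc822":
            yield part.get_payload(0).as_bytes()
        elif part.is_multipart():
            stack.extend(reversed(part.get_payload()))


def register(raw_bytes, recipient, run=subprocess.run):
    """Feed every attached message to bogofilter; return how many were sent."""
    flag = action_for(recipient)
    if flag is None:
        return 0
    count = 0
    for inner in attached_messages(raw_bytes):
        # Assumption: bogofilter takes the message on stdin.
        run(["bogofilter", flag], input=inner, check=True)
        count += 1
    return count
```

A procmail recipe matching the "username+..." addresses could pipe the
incoming mail to a small wrapper that calls register() with the raw message
and the recipient address. How the recipient address reaches the wrapper
depends on the local MTA setup and is left open here.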
--
Peter Bishop
pgb at adelard.com
pgb at csr.city.ac.uk