The Risk of Spam Complaints
David Relson
relson at osagesoftware.com
Mon Oct 21 13:58:55 CEST 2002
At 04:18 AM 10/21/02, Boris 'pi' Piwinger wrote:
>Hi!
>
>I just got a false positive. It was a spam complaint I wrote, of
>course, including the original spam (quoted). I bcc'ed the address the
>spam was delivered to.
>
>Now clearly that mail of mine contained all the bad words. So I had to
>-N it. But then this makes the bad word better again. I don't have a
>solution to this, though.
>
>pi
Hi pi,
I've had two such occurrences. One was a reply that quoted an email I had
sent showing spam calculations (words and their spamicity). The other was
a message in which someone sent me a tarball of their wordlists. Knowing
the context of the messages, as a human I'd call them non-spam. Given that
they contained lots of spammish words, bogofilter was justified in calling
them spam. If I were doing manual filtering, I would update neither word
list.
One solution is a white list. Mail from the bogofilter mailing lists is
accepted, without updating any wordlists. Perhaps I'll implement this as a
procmail recipe...
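
The whitelist idea can be sketched in a few lines. This is only an
illustration in Python, not the procmail recipe mentioned above: the
whitelist address, the choice of headers to check, and the function names
are all assumptions made for the example.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical whitelist; replace with the real mailing list addresses.
WHITELIST = {"bogofilter@example.org"}


def is_whitelisted(raw_message, whitelist=WHITELIST):
    """Return True if any To, Cc, or Sender address is on the whitelist."""
    msg = message_from_string(raw_message)
    fields = []
    for name in ("To", "Cc", "Sender"):
        # get_all returns every occurrence of the header, or the default.
        fields.extend(msg.get_all(name, []))
    addresses = {addr.lower() for _, addr in getaddresses(fields)}
    return not addresses.isdisjoint(whitelist)


def handle(raw_message):
    """Return 'accept' to deliver without filtering or training, else 'filter'."""
    return "accept" if is_whitelisted(raw_message) else "filter"
```

A message that returns "accept" would be delivered directly, so neither
wordlist is touched. Anything else goes through the normal filter.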
The bigger question is: "What's the harm in (mis)classifying a few
messages about spam?" I assert that there is no harm. Currently my spam
list has 107,000 words and 6,000 messages and my non-spam list has 285,000
words and 29,000 messages. Adding a few messages and their few hundred
words to the wrong list is going to have very little effect.
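
A rough calculation illustrates the scale. The sketch below uses a
simplified frequency ratio as a stand-in for a word's spamicity. It is an
assumption made for illustration and not bogofilter's actual computation.
The word counts (300 and 10) are hypothetical. Only the message totals come
from the figures above.

```python
def spamicity(bad_count, good_count, n_bad, n_good):
    """Simplified ratio: the spam share of a word's relative frequency."""
    bad_freq = bad_count / n_bad
    good_freq = good_count / n_good
    return bad_freq / (bad_freq + good_freq)


N_SPAM, N_GOOD = 6000, 29000  # message totals quoted in the post

# Hypothetical word seen in 300 spam messages and 10 non-spam messages.
before = spamicity(300, 10, N_SPAM, N_GOOD)

# Three spam-laden complaints are then registered as non-spam, each
# containing the word, so the non-spam counts grow by three.
after = spamicity(300, 13, N_SPAM, N_GOOD + 3)

shift = before - after
print(f"before={before:.4f} after={after:.4f} shift={shift:.4f}")
```

Under these assumptions the word's score moves by well under one
percentage point, so it stays firmly on the spam side.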
David
For summary digest subscription: bogofilter-digest-subscribe at aotto.com