Do we need an exclusion list or something?
Eric Seppanen
eds at reric.net
Fri Sep 13 21:26:25 CEST 2002
On Fri, Sep 13, 2002 at 03:18:36PM -0400, Paul Tomblin wrote:
> I was looking at a message that had been miscategorized as spam, and I see
> that most of the words returned by "bogofilter -v" with high numbers are
> ones that are on every single email message I receive, spam or not:
<snip>
>
> "edt", "for", "esmtp", "with", "postfix", "allhats.xcski.com",
> "localhost", "from", "received", "allhats", "delivered-to", "return-path",
> "sep" and "xcski.com" are going to be in the headers of every single
> message I receive, spam or not. How can I stop it from classifying these
> messages as spam? Is it because the account this one is on hasn't
> received enough non-spam to train bogofilter properly?
In my opinion this will always be a problem. I spotted this when I fed it
a bunch of spam messages from the month of May and then found that the
word "may" was being treated as a very strong indicator of spamicity.
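To illustrate the "may" effect, here is a minimal sketch of a per-word score in the style of Paul Graham's "A Plan for Spam". It is not bogofilter's actual code: the function name, the counts, and the corpus sizes are made up for illustration, and bogofilter's real calculation may differ in its details.

```python
def token_spamicity(spam_count, ham_count, n_spam, n_ham):
    """Graham-style estimate of how spammy a single word looks.

    spam_count / ham_count: messages in each corpus containing the word.
    n_spam / n_ham: total messages in each corpus (assumed to be > 0).
    All names here are hypothetical, not taken from bogofilter.
    """
    if spam_count + ham_count == 0:
        return 0.4  # never-seen word: Graham's mildly "innocent" default
    spam_freq = min(1.0, spam_count / n_spam)
    # Non-spam counts are doubled to bias against false positives.
    ham_freq = min(1.0, 2 * ham_count / n_ham)
    p = spam_freq / (spam_freq + ham_freq)
    # Clamp so that no single word is ever treated as absolute proof.
    return max(0.01, min(0.99, p))


# Hypothetical training set: the spam corpus was all collected in May,
# so "may" shows up in almost every spam Date: header, while the
# non-spam corpus comes from other months.
may_score = token_spamicity(spam_count=180, ham_count=5, n_spam=200, n_ham=200)
print(f"may: {may_score:.3f}")  # roughly 0.95
```

With these invented numbers, "may" scores close to 0.95 purely because of when the training mail was collected, not because of anything in the message bodies. A word seen equally often in both corpora would come out at 0.5 under the same formula.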
I have written an "ignore-list" patch, but it depends on my "multi-list"
patch, and I haven't received much feedback on that yet.
If you want to test my "ignore-list" patch let me know.
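For readers who want the general idea, an ignore list can be sketched as a filter applied to the tokens before they are counted or scored. This is only an illustration of the concept, not the patch itself: the names below are hypothetical, and the example entries are taken from the header words quoted above.

```python
# Hypothetical ignore list, seeded with header words from the quoted message.
IGNORE_LIST = frozenset({
    "received", "from", "with", "for", "esmtp",
    "localhost", "delivered-to", "return-path",
})


def drop_ignored(tokens, ignore_list=IGNORE_LIST):
    """Return the tokens that are not on the ignore list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in ignore_list]


tokens = ["Received", "from", "localhost", "free", "offer", "Return-Path"]
print(drop_ignored(tokens))  # ['free', 'offer']
```

Filtering before both training and classification keeps the ignored words out of the word lists entirely, so they cannot pick up a misleading score from a lopsided training set.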