rkimber at ntlworld.com
Thu Apr 8 08:19:53 EDT 2004
Sorry if you get two of these. Sylpheed replied to the List-Id which is
still the old address, so I've re-sent to the new one because I'm not
sure whether they are both working.
On Thu, 8 Apr 2004 07:26:14 -0400
David Relson <relson at osagesoftware.com> wrote:
> > > From time to time I get spam that is still scored as ham after I've
> > > told bogofilter to relearn it as spam.
> > This seems to be an artifact of full training. However,
> > besides the solutions already offered, those messages might
> > be an indication that you did some false training in the
> > past.
> A large wordlist with lots of messages has a measure of "inertia".
> With such a list, each message is a very small percentage of the
> total. This can make it harder to change the status of a token from
> "hammish" to "spammish". You may be encountering that effect.
Thanks. I'll have to explore further, as you suggest.
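The "inertia" effect described above can be illustrated with a rough sketch. It uses a simplified per-token estimate based only on relative frequencies; it is not bogofilter's exact calculation, and the function names and all counts are made up for illustration. The same token starts with the same hammish estimate in a small and a large wordlist, and one more spam containing it is then learned.

```python
def token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs):
    """Simplified per-token spam estimate from relative frequencies.

    Assumption: a plain frequency ratio, without the smoothing terms
    a real filter would normally apply.
    """
    spam_freq = spam_count / spam_msgs
    ham_freq = ham_count / ham_msgs
    return spam_freq / (spam_freq + ham_freq)


def shift_after_one_spam(spam_count, ham_count, spam_msgs, ham_msgs):
    """Return (before, after) estimates when one more spam with the token is learned."""
    before = token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs)
    after = token_spamicity(spam_count + 1, ham_count, spam_msgs + 1, ham_msgs)
    return before, after


# Hypothetical counts for one token, with the same ham/spam ratio in both lists.
small = shift_after_one_spam(spam_count=2, ham_count=40, spam_msgs=50, ham_msgs=50)
large = shift_after_one_spam(spam_count=200, ham_count=4000, spam_msgs=5000, ham_msgs=5000)

print("small list: before=%.4f after=%.4f" % small)
print("large list: before=%.4f after=%.4f" % large)
```

In the small list a single extra spam moves the token noticeably; in the large list the same training step barely registers, so the token stays hammish.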
The particular message that triggered my question had 100 random words,
virtually all of which could have occurred in ham messages, and about 30
words in the spam message. It was almost as if the random words had
been chosen for my benefit. I run a political science website, and it
had words like House Executive welfare Parliament ... and so on.
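To show why padding of that kind can work, here is a minimal sketch of a Fisher-style chi-square combination of per-token estimates, the general approach associated with Robinson-Fisher filters. It is not taken from bogofilter's source, and the per-token values (0.9 for the spam words, 0.1 for the padding words) are assumptions chosen only to mirror the 30/100 split in the message.

```python
import math


def chi2q(x2, dof):
    """Upper-tail probability of a chi-square value for an even number of degrees of freedom."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_score(probs):
    """Combine per-token spam estimates into one score in [0, 1] (1 = spam)."""
    n = len(probs)
    spam_stat = -2.0 * sum(math.log(1.0 - p) for p in probs)
    ham_stat = -2.0 * sum(math.log(p) for p in probs)
    spam_ind = 1.0 - chi2q(spam_stat, 2 * n)  # high when tokens look spammy
    ham_ind = 1.0 - chi2q(ham_stat, 2 * n)    # high when tokens look hammy
    return (1.0 + spam_ind - ham_ind) / 2.0


# Assumed values: 30 spammy tokens, then the same plus 100 hammy padding words.
spam_only = [0.9] * 30
padded = [0.9] * 30 + [0.1] * 100

score_plain = fisher_score(spam_only)
score_padded = fisher_score(padded)
print("without padding: %.4f" % score_plain)
print("with padding:    %.4f" % score_padded)
```

Under these assumed values the padding words outweigh the spam words and drag the combined score toward ham, which is consistent with a message of this shape being misclassified.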