New software uploaded [was: Problems with Asian Spam]

Nigel Henry cave.dnb at tiscali.fr
Wed Nov 22 19:08:52 CET 2006


On Wednesday 22 November 2006 17:11, Tom Anderson wrote:
> David Relson wrote:
> > I suspect part of bogofilter's slowness in learning these are spam
> > is that my wordlist has approx 500,000 messages in it and this
> > causes learning to be slow.
> >
> > I'm thinking of adding a "--scale" option to bogoutil that would allow
> > counts to be scaled.  For example, scaling to 10,000 would scale counts
> > from 1..N to 1..10000.
> >
> > Whether this helps can be tested by registering a bunch of false
> > negatives with old wordlist and again with scaled wordlist and seeing
> > if message scores are more appropriate.
>
> Training to exhaustion works wonders for me.  If you get a false
> negative, train it as spam again and again until it correctly classifies
> as spam.  This wipes out any scaling issues.  Usually only one training
> run does it, but some spams take 5 or more.  Bfproxy does this
> automatically for me, and I rarely see the same spam twice.
>
> http://www.orderamidchaos.com/bogofilter/bfproxy
>
> Tom
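
For reference, the count scaling David describes could look roughly like the sketch below. This is only an illustration of the arithmetic: the "--scale" option is a proposal and not an existing bogoutil feature, and the function name, the plain dict of token counts and the floor of 1 are all assumptions made for the example.

```python
def scale_counts(counts, msg_count, target=10_000):
    """Rescale token counts as if the wordlist held `target` messages.

    `counts` maps token -> count and `msg_count` is the number of
    messages those counts came from. Both are hypothetical inputs;
    this does not read a real bogofilter wordlist.
    """
    if msg_count <= target:
        # Nothing to shrink: return a copy unchanged.
        return dict(counts), msg_count
    scaled = {
        # Assumption: keep every token at a count of at least 1,
        # following the "1..10000" range in the proposal.
        token: max(1, round(n * target / msg_count))
        for token, n in counts.items()
    }
    return scaled, target


if __name__ == "__main__":
    demo = {"viagra": 250_000, "hello": 40, "rare": 1}
    print(scale_counts(demo, 500_000))
```

With smaller counts, each newly registered message shifts a token's spam/ham ratio by a larger amount, which is the effect the proposal hopes to test.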

Hi Tom. Bearing in mind that I'm processing mail downloaded directly to Kmail, 
how do I reprocess an unsure through bogofilter, after training bogofilter 
with it, to see whether it's now detected as spam?
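
To make the question concrete, here is a minimal sketch of the "train, then re-check" loop Tom describes. It assumes the unsure has been saved from Kmail to a file (the path is hypothetical) and that bogofilter is on the PATH. It relies only on "bogofilter -s" to register a message as spam and on bogofilter's exit codes (0 spam, 1 ham, 2 unsure, 3 error). It is not how bfproxy does it; the function names and the `run` parameter are made up for illustration.

```python
import subprocess

SPAM, HAM, UNSURE, ERROR = 0, 1, 2, 3  # bogofilter exit codes


def classify(path, run=subprocess.run):
    """Score the message in `path` and return bogofilter's exit code."""
    with open(path, "rb") as msg:
        code = run(["bogofilter"], stdin=msg).returncode
    if code == ERROR:
        raise RuntimeError("bogofilter reported an error")
    return code


def train_spam(path, run=subprocess.run):
    """Register the message in `path` as spam (bogofilter -s)."""
    with open(path, "rb") as msg:
        run(["bogofilter", "-s"], stdin=msg)


def train_to_exhaustion(path, max_rounds=10, run=subprocess.run):
    """Train as spam until the message classifies as spam.

    Returns the number of training runs needed, or None if the
    message is still not spam after `max_rounds` runs.
    """
    for rounds in range(max_rounds):
        if classify(path, run) == SPAM:
            return rounds
        train_spam(path, run)
    return max_rounds if classify(path, run) == SPAM else None


if __name__ == "__main__":
    # Hypothetical file holding one message saved from Kmail.
    print(train_to_exhaustion("unsure.eml"))
```

The cap on rounds is a design choice for the sketch, so that a message which never reaches the spam cutoff does not loop forever.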

I'll have a look at bfproxy. Would it work in my situation, where mail is 
downloaded directly to Kmail and then processed by bogofilter?

Nigel. 


