New software uploaded [was: Problems with Asian Spam]
Nigel Henry
cave.dnb at tiscali.fr
Wed Nov 22 19:08:52 CET 2006
On Wednesday 22 November 2006 17:11, Tom Anderson wrote:
> David Relson wrote:
> > I suspect part of bogofilter's slowness in learning these are spam
> > is that my wordlist has approx 500,000 messages in it and this
> > causes learning to be slow.
> >
> > I'm thinking of adding a "--scale" option to bogoutil that would allow
> > counts to be scaled. For example, scaling to 10,000 would scale counts
> > from 1..N to 1..10000.
> >
> > Whether this helps can be tested by registering a bunch of false
> > negatives with the old wordlist and again with the scaled wordlist, and
> > seeing whether the message scores are more appropriate.
>
> Training to exhaustion works wonders for me. If you get a false
> negative, train it as spam again and again until it correctly classifies
> as spam. This wipes out any scaling issues. Usually only one training
> run does it, but some spams take 5 or more. Bfproxy does this
> automatically for me, and I rarely see the same spam twice.
>
> http://www.orderamidchaos.com/bogofilter/bfproxy
>
> Tom
Hi Tom. Bearing in mind that I'm processing mail downloaded directly into KMail,
how do I rerun an unsure message through bogofilter, after training bogofilter
on it, to see whether it is now detected as spam?
I'll have a look at bfproxy. Would it work in my situation, where bogofilter
processes mail that is downloaded directly into KMail?
Nigel.
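
The train-and-recheck loop discussed above could be sketched roughly as follows. This is only an illustration, not how bfproxy is implemented: it assumes the unsure message has been saved from KMail to a file, that bogofilter is on the PATH, and that bogofilter's exit code when classifying is 0 for spam, 1 for non-spam and 2 for unsure. The function names and the max_rounds limit are made up for this example.

```python
import subprocess

# Exit codes bogofilter returns when classifying a message on stdin.
SPAM, HAM, UNSURE = 0, 1, 2


def bogofilter_classify(message: bytes) -> int:
    """Pipe the message to bogofilter and return its exit code."""
    return subprocess.run(["bogofilter"], input=message).returncode


def bogofilter_train_spam(message: bytes) -> None:
    """Register the message as spam (bogofilter -s)."""
    subprocess.run(["bogofilter", "-s"], input=message, check=True)


def train_to_exhaustion(message, classify=bogofilter_classify,
                        train=bogofilter_train_spam, max_rounds=10):
    """Train on the message until it classifies as spam.

    Returns the number of training runs that were needed, or None if
    the message still is not spam after max_rounds runs (hypothetical
    safety limit, not a bogofilter setting).
    """
    rounds = 0
    while True:
        code = classify(message)
        if code == SPAM:
            return rounds
        if code not in (HAM, UNSURE):
            raise RuntimeError(f"bogofilter failed with exit code {code}")
        if rounds == max_rounds:
            return None
        train(message)
        rounds += 1
```

With the message saved as, say, unsure.eml, calling train_to_exhaustion(open("unsure.eml", "rb").read()) would report how many training runs it took. Checking a single message by hand amounts to the same thing: feed the saved file to bogofilter on stdin and look at the exit code.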