New software uploaded [was: Problems with Asian Spam]

Nigel Henry cave.dnb at tiscali.fr
Wed Nov 22 17:40:18 CET 2006


On Wednesday 22 November 2006 13:54, David Relson wrote:
> On Wed, 22 Nov 2006 07:41:30 -0500
>
> dhottinger at harrisonburg.k12.va.us wrote:
> > Arrrrgh, I refuse to live with spam.  There must be an answer.  The
> > ones that have me concerned are the spams with inline images and
> > random text behind the images.  Bogofilter doesn't seem to catch
> > those.  Furthermore, I'm not so sure that the random text (usually
> > snippets from some book) isn't skewing my wordlist.  I'm getting
> > a few more messages caught as spam that shouldn't be, although all of
> > these are ticket reminders, etc., which shouldn't be sent to my mail
> > server by users anyway.  If anyone has a good way to kill these spam
> > messages, that would be great.
>
> My main source of "unsures" is Asian spam sent to a mailing list.
>
> My main group of false negatives is messages titled "New software
> uploaded by ... on ...date...time..."
>
> The software messages have a hunk of software-related text.  I've
> already received 70 of them this month.  Is anybody else seeing these?

Hi David. I've been getting these turning up in the unsure box, and have been 
training with them. I'm not sure if any are now being correctly identified as 
spam, as I've had bogofilter sending the spam directly to the wastebin for a 
while, but I've just recreated a spamcheck folder to check it before trashing 
it. I'll see what the results are in the next few days. My wordlist only has 
1824 spam messages, by the way.
>
> I suspect part of the reason bogofilter is slow to learn that these
> are spam is that my wordlist has approx 500,000 messages in it.
>
> I'm thinking of adding a "--scale" option to bogoutil that would allow
> counts to be scaled.  For example, scaling to 10,000 would scale counts
> from 1..N to 1..10000.
>
> Whether this helps can be tested by registering a bunch of false
> negatives with the old wordlist and again with the scaled wordlist,
> and seeing if the message scores are more appropriate.
>
> David
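
To make the scaling idea quoted above concrete, here is a rough sketch in
Python. It is not bogoutil code, and the "--scale" option is only a proposal
at this point. The function names, the dict-of-tuples wordlist layout, and the
simplified spam_ratio() are all assumptions made up for illustration;
bogofilter's real scoring is more involved than this ratio.

```python
def scale_counts(counts, n_spam_msgs, n_good_msgs, target=10_000):
    """Scale token and message counts so the larger message count is `target`.

    `counts` maps token -> (spam_count, good_count).  This layout is a
    hypothetical stand-in for a wordlist, not bogofilter's on-disk format.
    """
    factor = target / max(n_spam_msgs, n_good_msgs)
    if factor >= 1.0:
        # The wordlist is already at or below the target; leave it alone.
        return dict(counts), n_spam_msgs, n_good_msgs

    def scale(n):
        # A token seen at least once keeps a count of at least 1.
        return max(1, round(n * factor)) if n > 0 else 0

    scaled = {tok: (scale(s), scale(g)) for tok, (s, g) in counts.items()}
    return scaled, scale(n_spam_msgs), scale(n_good_msgs)


def spam_ratio(spam, good, n_spam_msgs, n_good_msgs):
    """Share of a token's per-message frequency that comes from spam.

    A simplified stand-in for a token score, used only to compare wordlists.
    """
    ps = spam / n_spam_msgs
    pg = good / n_good_msgs
    return ps / (ps + pg) if ps + pg else 0.5


# Hypothetical token that so far has appeared mostly in good mail.
counts = {"uploaded": (10, 990)}
scaled, ns, ng = scale_counts(counts, 250_000, 250_000)

# Register 5 new spam messages containing the token against each wordlist.
old_ratio = spam_ratio(10 + 5, 990, 250_000 + 5, 250_000)
s, g = scaled["uploaded"]
new_ratio = spam_ratio(s + 5, g, ns + 5, ng)
print(old_ratio, new_ratio)
```

With these made-up numbers the same five registrations move the ratio much
further on the scaled wordlist than on the full one, which is the kind of
before-and-after comparison David describes. Whether real message scores
improve would still have to be checked against an actual wordlist.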

The worst spams I've been getting recently are those sent to mailing lists 
that I'm on, the worst being the video4linux list. Most end up in the unsure 
box now, but I'm still getting the odd one in the inbox.

I'm using bogofilter-1.0.2 on FC2, directly processing mail downloaded to 
KMail.

Nigel.
>
>
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter


