auto update incrementing incorrectly?

David Relson relson at osagesoftware.com
Tue Sep 2 04:42:50 CEST 2008


On Mon, 01 Sep 2008 11:40:16 -0400
Thomas Anderson wrote:

> I've been having problems lately with Viagra spams coming through as
> unsure in unusually large numbers.  This is odd because I've never
> ever considered these to be ham, but this was my result when
> classifying one of them manually:
> 
>                                 n    pgood     pbad      fw     U
>  "subj:Viagra"              15667  0.011043  0.004035  0.267599 -
> 
> Then, after a little while and a bunch more Viagra spams passing
> through (classified correctly as spam) and auto-updating, this was the
> result:
> 
>  "subj:Viagra"              15684  0.011056  0.004039  0.267572 -
> 
> Why would "subj:Viagra" be getting hammier?  None of them were
> classified as ham.  Pgood is almost 3 times bigger than pbad when
> pgood should really be zero.  Here is my configuration:
> 
> spam_cutoff = 0.50
> ham_cutoff = 0.10
> min_dev = 0.25
> robx = 0.40
> robs = 0.22
> thresh_update = 0.05
> 
> Any ideas?  Has anyone else seen behavior like this?
> 
> Tom

H'lo Tom,

FWIW, I've not seen such behavior.  Using "grep" and "bogoutil -p" to
check (case insensitively) ham/spam values for subj:viagra, all matching
tokens have 0 for their ham counts.

"bogoutil -d path_to_wordlist subj:Viagra" will also show when the
ham/spam counts were last updated.  That might be helpful.
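
For reference, the fw column in the quoted output looks consistent with
Robinson's smoothing formula, fw = (robs * robx + n * p) / (robs + n), where
p = pbad / (pgood + pbad). The sketch below is only a check under that
assumption, using the robx and robs values from the quoted configuration;
the helper name robinson_fw is made up for illustration and is not part of
bogofilter.

```python
# Sketch: reproduce the fw column from the quoted n, pgood and pbad values.
# Assumption: fw = (robs * robx + n * p) / (robs + n),
#             with p = pbad / (pgood + pbad).
# robinson_fw is a hypothetical helper name, not a bogofilter function.

def robinson_fw(n, pgood, pbad, robx=0.40, robs=0.22):
    """Return the smoothed spam probability for one token."""
    # Raw spam probability from the per-corpus frequencies.
    p = pbad / (pgood + pbad)
    # Pull p toward robx; the pull weakens as n grows relative to robs.
    return (robs * robx + n * p) / (robs + n)

# Values copied from the two quoted "subj:Viagra" lines.
before = robinson_fw(15667, 0.011043, 0.004035)
after = robinson_fw(15684, 0.011056, 0.004039)

print(f"before: {before:.6f}")
print(f"after:  {after:.6f}")
```

Both results should agree with the reported fw values to about four decimal
places, the remainder being rounding in the printed pgood and pbad. With n
this large the robx/robs terms barely matter, so fw is essentially
pbad / (pgood + pbad), and a value below 0.5 only says that pgood exceeds
pbad for this token in the wordlist.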

Regards,

David


