Results skewed by headers
Thomas Anderson
tanderson at orderamidchaos.com
Fri May 15 23:12:43 CEST 2009
It sounds like you need to review your "robs" setting so that a rarely
seen token cannot become so influential so quickly. My robs is 0.22;
I'm guessing yours is much lower.
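
To make the effect of robs concrete, here is a minimal sketch of
Robinson's smoothing formula, f(w) = (s*x + n*p) / (s + n), where s is
robs, x is robx, n is the token count and p is the raw spam probability
of the token. It assumes the last three numbers in the quoted
N_P_Q_S_s_x_md summary are s, x and md (robs = 0.0178, robx = 0.52,
min_dev = 0.375), and that a token is scored only when its f(w) lies at
least min_dev away from 0.5. The function names are illustrative and
are not part of bogofilter.

```python
def robinson_fw(n, p, robs, robx):
    """Smoothed token score f(w) = (s*x + n*p) / (s + n)."""
    return (robs * robx + n * p) / (robs + n)


def is_scored(fw, min_dev):
    """Assumed rule: a token counts only if |f(w) - 0.5| >= min_dev."""
    return abs(fw - 0.5) >= min_dev


# A token seen once, only in ham, such as "from:butlerbuilder.com":
# n = 1 and raw spam probability p = 0.
ROBX = 0.52
MIN_DEV = 0.375

for robs in (0.0178, 0.22):
    fw = robinson_fw(1, 0.0, robs, ROBX)
    print(f"robs={robs}: f(w)={fw:.6f}, scored={is_scored(fw, MIN_DEV)}")
```

With robs = 0.0178 this gives 0.009094, the value shown for the four
single-count tokens in the quoted output. With robs = 0.22 the same
token scores about 0.0938: under the assumed min_dev rule it is still
counted, but it is roughly ten times less extreme.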
Tom
Stephen Davies wrote:
> I am seeing a number of missed spams due to header counts - particularly from,
> rcvd and url - being one.
>
> That is, I see results such as:
> X-Bogosity: Ham, tests=bogofilter, spamicity=0.535987, version=1.2.0
>
> "from:butlerbuilder.com"       1  0.000079  0.000000  0.009094 +
> "from:welterweight00"          1  0.000079  0.000000  0.009094 +
> "url:190.159.9"                1  0.000079  0.000000  0.009094 +
> "url:190.159.9.229"            1  0.000079  0.000000  0.009094 +
> "easier"                    3151  0.058157  0.007897  0.119560 +
>
> "Acai"                     16862  0.002535  0.054990  0.955924 +
> "subj:Acai"                56342  0.002535  0.183987  0.986407 +
> "to:sdc"                   22925  0.000158  0.074898  0.997888 +
> "rcvd:sdc.com.au"          89855  0.000079  0.293588  0.999730 +
> "rcvd:for"                 93890  0.000079  0.306772  0.999742 +
> "rcvd:sdc"                  9465  0.000000  0.030926  0.999999 +
> "rcvd:forged"              18247  0.000000  0.059620  1.000000 +
> "rcvd:may"                 18248  0.000000  0.059623  1.000000 +
> "head:X-UIDL"              40005  0.000000  0.130712  1.000000 +
> "rcvd:with"               120472  0.000000  0.393629  1.000000 +
> N_P_Q_S_s_x_md                15  0.000000  0.071974  0.535987  0.017800  0.520000  0.375000
>
> The first five apparently outweigh the negative results.
> As soon as I run this through bogofilter -Ns, it becomes recognised as spam.
> This seems to confirm that the from and url headers are the most significant.
>
> Why is this and what can I do to stop it happening in future?
>
> Cheers and thanks.
> Stephen Davies
>