Results skewed by headers
Stephen Davies
scldad at sdc.com.au
Fri May 15 03:20:00 CEST 2009
I am seeing a number of missed spams that seem to be caused by header tokens
(particularly from, rcvd and url) having a count of one.
That is, I see results such as:
X-Bogosity: Ham, tests=bogofilter, spamicity=0.535987, version=1.2.0
"from:butlerbuilder.com"       1  0.000079  0.000000  0.009094 +
"from:welterweight00"          1  0.000079  0.000000  0.009094 +
"url:190.159.9"                1  0.000079  0.000000  0.009094 +
"url:190.159.9.229"            1  0.000079  0.000000  0.009094 +
"easier"                    3151  0.058157  0.007897  0.119560 +
"Acai"                     16862  0.002535  0.054990  0.955924 +
"subj:Acai"                56342  0.002535  0.183987  0.986407 +
"to:sdc"                   22925  0.000158  0.074898  0.997888 +
"rcvd:sdc.com.au"          89855  0.000079  0.293588  0.999730 +
"rcvd:for"                 93890  0.000079  0.306772  0.999742 +
"rcvd:sdc"                  9465  0.000000  0.030926  0.999999 +
"rcvd:forged"              18247  0.000000  0.059620  1.000000 +
"rcvd:may"                 18248  0.000000  0.059623  1.000000 +
"head:X-UIDL"              40005  0.000000  0.130712  1.000000 +
"rcvd:with"               120472  0.000000  0.393629  1.000000 +
N_P_Q_S_s_x_md                15  0.000000  0.071974  0.535987
                                  0.017800  0.520000  0.375000
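For reference, the last number on each token line looks consistent with Robinson's smoothed token score. The sketch below is only an illustration under assumptions: it reads the columns as count, good ratio, bad ratio and score, and it reads the first two values of the final line (0.017800 and 0.520000) as the prior strength and the score for unseen tokens. The names token_score, ROBS and ROBX are hypothetical and not taken from bogofilter's source.

```python
# Sketch of Robinson-style smoothing of a token score.
# ASSUMPTION: 0.0178 and 0.52 from the last line of the listing are the
# prior strength and the prior score; this is inferred, not confirmed.
ROBS = 0.0178  # assumed weight given to the prior
ROBX = 0.52    # assumed score for a token never seen before


def token_score(n, pgood, pbad, robs=ROBS, robx=ROBX):
    """Blend the observed spam ratio with the prior, weighted by the count n."""
    total = pgood + pbad
    # Observed share of the token's occurrences that were in spam.
    p = pbad / total if total > 0 else robx
    return (robs * robx + n * p) / (robs + n)


# Values taken from the listing above.
print(f"{token_score(1, 0.000079, 0.0):.6f}")           # "from:butlerbuilder.com"
print(f"{token_score(3151, 0.058157, 0.007897):.6f}")   # "easier"
```

With so small a prior weight, a token seen once in ham and never in spam lands at about 0.009094, which matches the four count-one lines: a single sighting is treated as nearly conclusive.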
The first five tokens, with their low scores, apparently outweigh all the
high-scoring ones.
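The effect can be checked roughly by recombining the displayed token scores. This is a sketch under assumptions, not bogofilter's actual code: it assumes the scores are combined with Fisher's chi-square method and that the final figure is (1 + Q - P) / 2, which is what the summary line suggests (0.000000, 0.071974, 0.535987). The names chi2_tail, spamicity and EPS are hypothetical, and EPS is an assumed floor because the listing rounds some scores to exactly 1.000000.

```python
import math

# Score column copied from the listing above (rounded as displayed).
FW = [0.009094, 0.009094, 0.009094, 0.009094, 0.119560, 0.955924,
      0.986407, 0.997888, 0.999730, 0.999742, 0.999999,
      1.000000, 1.000000, 1.000000, 1.000000]

EPS = 5e-7  # ASSUMPTION: floor so that a displayed 1.000000 never gives log(0)


def chi2_tail(x, dof):
    """Upper-tail probability of a chi-square variable; dof must be even."""
    m = x / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def spamicity(scores):
    """Fisher-style combination; ASSUMED to mirror the P, Q, S summary line."""
    n = len(scores)
    p = chi2_tail(-2.0 * sum(math.log(max(1.0 - f, EPS)) for f in scores), 2 * n)
    q = chi2_tail(-2.0 * sum(math.log(max(f, EPS)) for f in scores), 2 * n)
    return (1.0 + q - p) / 2.0


print(round(spamicity(FW), 4))      # all 15 tokens
print(round(spamicity(FW[4:]), 4))  # without the four count-one tokens
```

Under these assumptions the full list comes out near the reported 0.535987, while dropping only the four count-one tokens pushes the result close to 1, so those four lines account for almost all of the difference.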
As soon as I run this through bogofilter -Ns, it becomes recognised as spam.
This seems to confirm that the from and url headers are the most significant.
Why is this and what can I do to stop it happening in future?
Cheers and thanks.
Stephen Davies
--
=============================================================================
Stephen Davies Consulting P/L Voice: 08-8177 1595
Adelaide, South Australia. Fax : 08-8177 0133
Computing & Network solutions. Mobile:040 304 0583
VoIP:sip:1132210 at sip1.bbpglobal.com