Stripsearch
David Relson
relson at osagesoftware.com
Sun Jun 12 17:31:42 CEST 2005
On Sat, 11 Jun 2005 17:01:24 +1000
Mark Constable wrote:
> On Saturday 11 June 2005 10:33, Chris Fortune wrote:
> > > I'm using MIME::QuotedPrint. The only drawback is that now the email is
> > > being altered beyond simply inserting the tokens. Should I re-encode
> > > them before quitting?
> >
> > that would be sensible IMHO
>
> Looking forward to trying a new version.
>
> I'm not sure if others get empty messages but every now
> and then I get something like this...
>
> X-Bogosity: Unsure, tests=bogofilter, spamicity=0.520000, version=0.94.13
> Content-Type:
> X-UID: 5829
> X-Length: 77
>
> and nothing much else (mild variations). It just seems to
> me that stripsearch might be a good place to look for these
> and to insert something in the body that will then become
> a spam marker for bogofiltering.
>
> --markc
H'lo Mark,
What version of bogofilter are you using? Looking at my messages
for this month, I have 18 with neither Subject: line nor message body.
Bogofilter scored them between 0.831999 and 0.920263, which makes them
spam.
Running bogofilter -vvv shows that the following tokens contributed to
the scoring:
"rcvd:unknown"               86809  0.108868  0.500432  0.821323  +
"to:undisclosed-recipients"   1319  0.000809  0.008100  0.909205  +
"url:211"                     4665  0.000255  0.030176  0.991625  +
"from:x8478"                     1  0.000000  0.000007  0.994183  +
"rtrn:x8478"                     1  0.000000  0.000007  0.994183  +
"url:211.197.27"                 1  0.000000  0.000007  0.994183  +
"url:211.197.27.1"               1  0.000000  0.000007  0.994183  +
"from:altavista.com"            13  0.000000  0.000085  0.999548  +
"rtrn:altavista.com"            15  0.000000  0.000098  0.999609  +
Conclusion: message headers provide quite a bit of info for scoring
messages -- even when there's no body.
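As a rough illustration of how a handful of header tokens alone can push a score toward spam, here is a minimal Python sketch of Robinson-Fisher style combining. It assumes the last numeric column in the listing above is a per-token spam probability f(w). The function names and the FW list are made up for this example, and bogofilter itself applies further parameters (minimum deviation, robx and so on) that are left out, so the sketch will not reproduce the 0.83 to 0.92 scores quoted above.

```python
import math


def chi2q(x2, df):
    """Chi-squared survival function Q(x2, df); valid for even df only."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    # Closed-form series for even degrees of freedom.
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_indicator(fw_values):
    """Combine per-token probabilities (each strictly between 0 and 1).

    Returns a value in [0, 1]; 0.5 is neutral, values near 1 look spammy.
    """
    n = len(fw_values)
    ln_f = sum(math.log(f) for f in fw_values)
    ln_1mf = sum(math.log(1.0 - f) for f in fw_values)
    q_f = chi2q(-2.0 * ln_f, 2 * n)      # near 1 when every f(w) is close to 1
    q_1mf = chi2q(-2.0 * ln_1mf, 2 * n)  # near 0 when every f(w) is close to 1
    return (1.0 + q_f - q_1mf) / 2.0


# Last numeric column of the nine header tokens listed above (assumed to be f(w)).
FW = [0.821323, 0.909205, 0.991625, 0.994183, 0.994183,
      0.994183, 0.994183, 0.999548, 0.999609]

score = fisher_indicator(FW)
```

Calling fisher_indicator(FW) on these nine values gives a result far on the spam side of 0.5, which is consistent with the conclusion that headers by themselves carry plenty of signal.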
HTH,
David
P.S. The posting delay for your message was caused by using a
non-subscribed address, rather than your subscribed address.