email containing token with high spamcount only gets an unsure

Gerrit Thede beroot at gmail.com
Thu Jul 20 10:31:12 CEST 2006


On 7/19/06, Thomas Anderson <tanderso at oac-design.com> wrote:
>
> On Tue, 2006-07-18 at 19:27 -0400, David Relson wrote:
> > On Tue, 18 Jul 2006 14:07:46 +0200
> >
> > It's possible that the headers of the messages have hammish tokens and
> > they are counter-balancing the spammish tokens.  "-vvv" will show if
> > that is so.  Assuming this is the case, you _could_ create an ignore
> > database to tell bogofilter to ignore certain tokens when scoring
> > messages.
>
> Or you could register the message as spam again until it classifies as
> spammy.  I've been doing exhaustive registration (again and again until
> it classifies correctly) for several years now with no adverse effects.
> It's like training a dog... sometimes it takes more than one reprimand
> before he gets it right.
>
> Tom


Hi,
thanks for your answers. Indeed, the rest of each of these messages seems to
balance out the score. Here's a histogram for another one of them:

bogofilter -C -d ~/.bogofilter -vv < msg.txt
X-Bogosity: Unsure, tests=bogofilter, spamicity=0.500000, version=1.0.3
   int  cnt   prob  spamicity histogram
  0.00   42 0.052268 0.030049 ##########################################
  0.10    7 0.111155 0.038573 #######
  0.20    0 0.000000 0.038573
  0.30    0 0.000000 0.038573
  0.40    0 0.000000 0.038573
  0.50    0 0.000000 0.038573
  0.60    0 0.000000 0.038573
  0.70    0 0.000000 0.038573
  0.80    0 0.000000 0.038573
  0.90   47 0.995219 0.546718 ###############################################
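
To illustrate why the low-scoring tokens can cancel the high-scoring ones, here is a rough Python sketch of Fisher-style chi-square combining, the general approach that bogofilter's scoring is based on. It is not bogofilter's actual code: the function names are made up for this example, it ignores tuning parameters such as robx and min_dev, and it uses the bucket averages from the histogram above instead of the real per-token probabilities.

```python
import math

def chi2q(x2, dof):
    """Chi-square survival function P(X >= x2); valid for even dof only."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)  # guard against rounding slightly above 1

def fisher_spamicity(probs):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    n = len(probs)
    sum_ln_ham = sum(math.log(1.0 - p) for p in probs)
    sum_ln_spam = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * sum_ln_ham, 2 * n)   # strength of spam evidence
    h = 1.0 - chi2q(-2.0 * sum_ln_spam, 2 * n)  # strength of ham evidence
    return (1.0 + s - h) / 2.0

# Assumption: every token in a bucket sits at that bucket's average prob.
tokens = [0.052268] * 42 + [0.111155] * 7 + [0.995219] * 47

print("mixed message:   ", round(fisher_spamicity(tokens), 3))
print("spam tokens only:", round(fisher_spamicity([0.995219] * 47), 3))
```

With these approximated inputs, both the spam evidence and the ham evidence come out close to 1, so the combined score lands very near 0.5, while the 47 high-scoring tokens on their own would score close to 1.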


Maybe I just need to train on more of these, but it's really annoying. I
thought messages that repeat the same ending over and over would be really
easy to recognise as spam, but evidently they are not. bogofilter works
perfectly for nearly all of the spam I get, and I don't even have false
positives, but these messages in particular seem to trick bogofilter really
well.

Gerrit


