Result Based on a Single Token

Thomas Anderson tanderso at oac-design.com
Tue Oct 2 01:42:00 CEST 2007


Looks to me like a case of weak training and an untuned config.  First,
train on errors.  Then adjust your robx, robs, min_dev, spam_cutoff,
and ham_cutoff.  Letting a statistical filter screen your messages
without knowing anything about how it is configured seems a bit
reckless to me.  Bogofilter will only do what you tell it to, and that
includes which tokens to score and which to ignore, via its cutoffs and
ranges.
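
As a rough illustration of why robs and min_dev matter here, the sketch
below assumes the per-token score follows Robinson's smoothing formula,
f(w) = (robs*robx + n*p) / (robs + n), with p = pbad / (pgood + pbad).
The function names are hypothetical, and the defaults (robs=0.0178,
robx=0.52, min_dev=0.375) are read off the last line of the report
quoted below, not taken from any known Tuffmail configuration.

```python
def robinson_fw(pgood, pbad, n, robs=0.0178, robx=0.52):
    """Smoothed token score, assuming f(w) = (robs*robx + n*p) / (robs + n)."""
    total = pgood + pbad
    # For a token with no evidence, p is irrelevant because n is 0;
    # fall back to robx so the expression stays well defined.
    p = pbad / total if total > 0 else robx
    return (robs * robx + n * p) / (robs + n)


def is_scored(fw, min_dev=0.375):
    """A token only counts if its score is at least min_dev away from 0.5."""
    return abs(fw - 0.5) >= min_dev


# (token, n, pgood, pbad) copied from the quoted report
tokens = [
    ("changed", 22, 0.074890, 0.014663),
    ("rcvd:embarqmail.com", 0, 0.0, 0.0),
    ("subj:Email", 2, 0.0, 0.005865),
]

for robs in (0.0178, 1.0):  # 1.0 is an illustrative value, not a recommendation
    print(f"robs={robs}")
    for name, n, pgood, pbad in tokens:
        fw = robinson_fw(pgood, pbad, n, robs=robs)
        print(f"  {name:<22} fw={fw:.6f} scored={is_scored(fw)}")
```

With robs=0.0178 this reproduces the 0.995766 shown for "subj:Email",
and that token is the only one of the three far enough from 0.5 to be
scored.  With robs=1.0 the same token comes out at 0.84, which falls
inside the min_dev band and would be ignored.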

Tom

On Mon, 2007-10-01 at 22:38 +0100, RW wrote:
> I just noticed an email to a mailing list where it seems that a very
> high spam probability was based on a single token that had only been
> seen twice.
> 
> The filtering was done by Tuffmail, so I don't know any details about
> the version, or configuration of Bogofilter.
> 
> Is this normal behaviour? It seems a bit reckless to me.
> 
> 
> 
> ----------------------------------------------------------------------
> Y 0.995766
>    int  cnt   prob  spamicity histogram
>   0.00    0 0.000000 0.000000 
>   0.10    0 0.000000 0.000000 
>   0.20    0 0.000000 0.000000 
>   0.30    0 0.000000 0.000000 
>   0.40    0 0.000000 0.000000 
>   0.50    0 0.000000 0.000000 
>   0.60    0 0.000000 0.000000 
>   0.70    0 0.000000 0.000000 
>   0.80    0 0.000000 0.000000 
>   0.90    1 0.995766 0.995766 #
> 
>                                       n    pgood     pbad      fw     U
> "changed"                            22  0.074890  0.014663  0.164021 -
> "freebsd-questions"                 109  0.361233  0.079179  0.179839 -
> "freebsd-questions-unsubscribe"     109  0.361233  0.079179  0.179839 -
> "head:freebsd-questions"            109  0.361233  0.079179  0.179839 -
> "head:freebsd-questions-request"    109  0.361233  0.079179  0.179839 -
> "head:owner-freebsd-questions"      109  0.361233  0.079179  0.179839 -
> "head:questions"                    109  0.361233  0.079179  0.179839 -
> "rcvd:owner-freebsd-questions"      109  0.361233  0.079179  0.179839 -
> "rtrn:owner-freebsd-questions"      109  0.361233  0.079179  0.179839 -
> "head:Delivered-To"                 154  0.497797  0.120235  0.194582 -
> "head:List-Post"                    154  0.497797  0.120235  0.194582 -
> "head:FreeBSD.org"                   15  0.048458  0.011730  0.195277 -
> "head:unsubscribe"                  160  0.515419  0.126100  0.196600 -
> ...
> "rcvd:embarqmail.com"                 0  0.000000  0.000000  0.520000 -
> "rcvd:mig01.embarq.synacor.com"       0  0.000000  0.000000  0.520000 -
> "subj:Address"                        0  0.000000  0.000000  0.520000 -
> "rcvd:questions"                     16  0.017621  0.035191  0.666178 -
> "subj:Email"                          2  0.000000  0.005865  0.995766 +
> N_P_Q_S_s_x_md                        1  0.004234  0.995766  0.995766
>                                          0.017800  0.520000  0.375000
> 
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter



