mime question

m at mo.optusnet.com.au
Thu May 1 01:30:43 CEST 2003


David Relson <relson at osagesoftware.com> writes:
> At 05:54 PM 4/30/03, Michael O'Reilly wrote:
> >The bit inside the parentheses is the MTA doing a reverse lookup. The
> >question is: Does the bit outside (user supplied) match the bit inside
> >(from the DNS)? A mismatch is fairly frequent, but the bit outside
> >is normally a subset of the bit inside. That's not data that
> >bogofilter can currently detect.
> 
> You could pretty easily have lexer_v3.l pass the whole Received: line to a 
> routine which creates a special token.

That's what I was thinking. 
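
To make the idea concrete, here is a minimal Python sketch. It is not
bogofilter code (the real thing would be C, next to lexer_v3.l), and it
assumes sendmail-style lines of the form "Received: from HELO (rdns [ip])".
The function name and the token names are invented for illustration.

```python
import re

# Assumed sendmail-style layout: "Received: from <helo> (<rdns> [<ip>]) ..."
# <helo> is what the client claimed; <rdns> is the MTA's reverse lookup.
_RECEIVED = re.compile(r'^Received:\s+from\s+(\S+)\s+\(([^\s\[\)]+)',
                       re.IGNORECASE)

def received_token(line):
    """Return a hypothetical special token for one Received: line, or None."""
    m = _RECEIVED.match(line)
    if m is None:
        return None                  # not a line in the assumed format
    helo, rdns = m.group(1).lower(), m.group(2).lower()
    if rdns == "unknown":
        return "rcvd-helo-unknown"   # MTA found no reverse DNS name
    if helo == rdns:
        return "rcvd-helo-match"
    if helo in rdns:
        return "rcvd-helo-subset"    # e.g. "mail" vs "mail.example.com"
    return "rcvd-helo-mismatch"
```

The classifier would then see one extra token per Received: line and could
learn for itself how spammy each outcome is. Other MTAs lay the line out
differently, so the pattern would need adjusting for them.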
> 
> Michael,
> 
> FWIW, I just did a quick check on the spam scores of "may", "forged", and 
> "unknown" and got:
> 
> [relson at osage backup.d]$ bogoutil -p $BOGOFILTER_DIR may forged unknown
>                         spam    good  Gra prob  Rob prob
> may                    3340    6005  0.573095  0.572728
> forged                   12     199  0.127051  0.163150
> unknown                5027    3000  0.801758  0.800925

It's interesting that the numbers are so radically different. Here are mine:
                       spam    good  Gra prob  Rob prob
may                    8768    4982  0.537313  0.537128
forged                 4151     645  0.809398  0.807506
unknown                1564    3284  0.239110  0.239773

I suspect a different MTA is the cause. Are you using sendmail or
something else?

At the risk of digressing, what are your top spam indicators?

$ bogoutil -d spamlist.db | awk '{print $1}' | bogoutil -p . | sort -n -k5,5 | tail -n6
safelist               1235       8  0.990278  0.979093
recurring              1459      11  0.988703  0.979247
opted-in               1979      16  0.987896  0.980909
x-list-unsubscribe     1514       6  0.994030  0.984779
h4f                    1272       1  0.998810  0.987689
x-info                 2220       6  0.995921  0.989547
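
The same ranking can be done in a few lines of Python. This is only a
sketch: it assumes the five-column "bogoutil -p" output shown above (token,
spam count, good count, Graham probability, Robinson probability), and the
function name is mine, not part of bogofilter.

```python
def top_indicators(lines, n=6):
    """Return the n rows with the highest Robinson probability, ascending."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue                 # skips blank lines and the header row
        token, spam, good, gra, rob = parts
        try:
            rows.append((token, int(spam), int(good), float(gra), float(rob)))
        except ValueError:
            continue                 # not a data row
    rows.sort(key=lambda r: r[4])    # sort on the last column (Rob prob)
    return rows[-n:]
```

Feed it the lines from "bogoutil -p" (for example via sys.stdin) and print
the result to get the same list as the pipeline.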

Michael.



