mime question
David Relson
relson at osagesoftware.com
Thu May 1 03:00:49 CEST 2003
At 07:30 PM 4/30/03, m at mo.optusnet.com.au wrote:
>David Relson <relson at osagesoftware.com> writes:
> > At 05:54 PM 4/30/03, Michael O'Reilly wrote:
> > >The bit inside the parentheses is the MTA doing a reverse lookup. The
> > >question is: Does the bit outside (user supplied) match the bit inside?
> > >(From the DNS). A mismatch is fairly frequent, but the bit outside
> > >is normally a subset of the bit inside. That's not data that
> > >bogofilter can currently detect.
> >
> > You could pretty easily have lexer_v3.l pass the whole Received: line to a
> > routine which creates a special token.
>
>That's what I was thinking.
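To make the idea concrete, here is a rough sketch of what such a routine might do. It is written in Python for brevity rather than in C next to lexer_v3.l. The regular expression, the function name received_token, and the rcvd:* token names are all assumptions made up for illustration; none of them exist in bogofilter, and real Received: lines vary far more between MTAs than this pattern allows for.

```python
import re

# Assumed shape: "from HELO (RDNS [IP] ...)". This is only the common
# sendmail/postfix layout, not a complete Received: parser.
_RECEIVED_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+\((?P<rdns>[^\s\[\]()]*)\s*\[(?P<ip>[^\]]+)\]"
)


def received_token(line):
    """Return one hypothetical token describing HELO vs. reverse-DNS agreement."""
    m = _RECEIVED_RE.search(line)
    if m is None:
        return "rcvd:unparsed"
    helo = m.group("helo").lower().rstrip(".")
    rdns = m.group("rdns").lower().rstrip(".")
    # No usable reverse lookup: either nothing before the [IP] or "unknown".
    if rdns in ("", "unknown"):
        return "rcvd:no-rdns"
    # "Subset" test: the user-supplied name equals, or is contained in,
    # the name the MTA got back from DNS.
    if helo in rdns:
        return "rcvd:helo-match"
    return "rcvd:helo-mismatch"


print(received_token(
    "Received: from mail.example.com (mail.example.com [192.0.2.1])"
))
```

The resulting token would then be scored like any other word, so a mismatch only counts for whatever the training data says it is worth.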
> >
> > Michael,
> >
> > FWIW, I just did a quick check on the spam scores of "may", "forged", and
> > "unknown" and got:
> >
> > [relson at osage backup.d]$ bogoutil -p $BOGOFILTER_DIR may forged unknown
> >           spam   good  Gra prob  Rob prob
> > may       3340   6005  0.573095  0.572728
> > forged      12    199  0.127051  0.163150
> > unknown   5027   3000  0.801758  0.800925
>
>That's interesting that the numbers are so radically different.
>          spam   good  Gra prob  Rob prob
>may       8768   4982  0.537313  0.537128
>forged    4151    645  0.809398  0.807506
>unknown   1564   3284  0.239110  0.239773
>
>I suspect a different MTA? (are you using sendmail or
>something else?)
postfix.
>At the risk of digressing, what are your top spam indicators?
>
>$ bogoutil -d spamlist.db | awk '{print $1}' | bogoutil -p . | sort -n -k5 | tail -n6
>safelist            1235      8  0.990278  0.979093
>recurring           1459     11  0.988703  0.979247
>opted-in            1979     16  0.987896  0.980909
>x-list-unsubscribe  1514      6  0.994030  0.984779
>h4f                 1272      1  0.998810  0.987689
>x-info              2220      6  0.995921  0.989547
Here're my top 20:
$ bogoutil -d spamlist.db | awk '{print $1}' | bogoutil -p . | sort -n -k5 | tail -n20
raton                586      2  0.998588  0.985428
boca                 588      2  0.998593  0.985477
opted               1088     18  0.993192  0.986134
znex                 730      5  0.997170  0.986600
url:66.216           885      6  0.997199  0.988451
osagesoftware       1298     17  0.994603  0.988652
unsub.php            804      1  0.999485  0.989811
remove.asp           852      1  0.999514  0.990376
pbz                 1117      6  0.997779  0.990816
customerservice     1831     20  0.995495  0.991253
url:65.61           1184      3  0.998951  0.992357
bfntrfbsgjner       1120      0  1.000000  0.993013
m25                 1325      3  0.999063  0.993161
recurring           2427     16  0.997276  0.994055
postal              3494     29  0.996573  0.994336
t.pl                1625      3  0.999236  0.994412
x-id                1589      1  0.999739  0.994801
f2.6                2665     11  0.998293  0.995350
x-list-unsubscribe  2613      3  0.999525  0.996513
h4f                 2497      1  0.999834  0.996681
Interestingly "x-list-unsubscribe", "h4f" and "recurring" are in both
lists. Also my domain name (without the .com) is right up there as is
"boca raton". Go figger!
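
For anyone who wants to compare two such lists mechanically instead of by eye, here is a small Python sketch. It assumes each row has the five whitespace-separated fields shown above (token, spam count, good count, Graham probability, Robinson probability). The helper name parse_tokens is made up for illustration, and the second listing is only the last seven rows of my top 20.

```python
def parse_tokens(listing):
    """Map token -> Robinson probability (last column) for 5-field rows."""
    probs = {}
    for row in listing.strip().splitlines():
        fields = row.split()
        # Expect: token spam good gra_prob rob_prob. Skip anything else,
        # for example a header row.
        if len(fields) != 5:
            continue
        try:
            probs[fields[0]] = float(fields[4])
        except ValueError:
            continue
    return probs


michael = """
safelist 1235 8 0.990278 0.979093
recurring 1459 11 0.988703 0.979247
opted-in 1979 16 0.987896 0.980909
x-list-unsubscribe 1514 6 0.994030 0.984779
h4f 1272 1 0.998810 0.987689
x-info 2220 6 0.995921 0.989547
"""

david = """
recurring 2427 16 0.997276 0.994055
postal 3494 29 0.996573 0.994336
t.pl 1625 3 0.999236 0.994412
x-id 1589 1 0.999739 0.994801
f2.6 2665 11 0.998293 0.995350
x-list-unsubscribe 2613 3 0.999525 0.996513
h4f 2497 1 0.999834 0.996681
"""

# Tokens that show up in both top lists.
common = sorted(set(parse_tokens(michael)) & set(parse_tokens(david)))
print(common)  # ['h4f', 'recurring', 'x-list-unsubscribe']
```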