Filter breakers
Thomas Anderson
tanderso at oac-design.com
Sat Apr 5 07:43:07 CEST 2008
I also use the milter, but only to drop the very spammy spam (>0.9).
Whatever is left over goes through the MDA, where the headers get the
extra special treatment which catches most of the rest of them.
Tom
On Sat, 2008-04-05 at 11:16 +0930, Stephen Davies wrote:
> Thanks Tom. Looks interesting.
>
> I use sendmail, milter, and amavisd to invoke bogofilter, so I'll have to
> think about how your code could be included, but it certainly looks as if
> it could help.
>
> Cheers,
> Stephen
>
> On Saturday 05 April 2008 01:03, Tom Anderson wrote:
> > Stephen,
> >
> > I wrote a prefilter to handle exactly this kind of problem. The source
> > is available here: http://orderamidchaos.com/bogofilter/spamitarium
> >
> > I've been using it for 3 or 4 years now, and it works wonderfully for
> > helping to classify spams in which the headers play an outsized role.
> >
> > If you use it, I would appreciate feedback.
> >
> > Tom
> >
> > http://www.linkedin.com/in/orderamidchaos
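
Stephen's question (how to keep harmless header tokens out of the score) can be approached with a prefilter that removes selected headers before the message reaches bogofilter. The following is a minimal sketch of that idea in Python. It is not spamitarium's actual code, which this thread does not show. The header names come from the bogoutil output quoted in this thread; `DROP_NAMES`, `DROP_PREFIXES`, and `strip_headers` are hypothetical names chosen for illustration.

```python
from email import message_from_string

# Assumption: headers added locally by the mail client (KMail here) carry no
# spam/ham signal and only dilute the score. Adjust the lists to taste.
DROP_NAMES = {"x-status", "status"}
DROP_PREFIXES = ("x-kmail-",)


def strip_headers(raw: str) -> str:
    """Return the message text with the unwanted headers removed."""
    msg = message_from_string(raw)
    for name in list(msg.keys()):
        lower = name.lower()
        if lower in DROP_NAMES or lower.startswith(DROP_PREFIXES):
            # Deletes every occurrence of this header; no error if already gone.
            del msg[name]
    return msg.as_string()
```

A filter like this would have to run both when training and when scoring, otherwise the database keeps counting tokens that scored messages no longer contain.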
> >
> > Stephen Davies wrote:
> > > I am still getting too many "obvious" spams slipping through my
> > > bogofilter setup.
> > >
> > > The more I investigate, the more it seems that quite innocuous headers
> > > are at least part of my problem.
> > >
> > > The following bogoutil output is quite common. The obviously spam
> > > components are outweighed by quite harmless header tokens - one of the
> > > most commonly appearing being the current month header (head:Apr).
> > >
> > > Is there any way to push such header tokens out of the picture?
> > > (In the example below, the to:anonymous token is ignored even though
> > > its word counts are heavily skewed: 23351 to 388.)
> > >
> > > My database is some 200 MB with 3.5 million tokens.
> > >
> > > TIA,
> > > Stephen Davies
> > >
> > > X-Bogosity: Ham, tests=bogofilter, spamicity=0.500000, version=1.1.5
> > > n pgood pbad fw U
> > > "head:X-KMail-EncryptionState" 262 0.014395 0.000042 0.002937 +
> > > "head:X-KMail-MDN-Sent" 262 0.014395 0.000042 0.002937 +
> > > "head:X-KMail-SignatureState" 262 0.014395 0.000042 0.002937 +
> > > "head:X-Status" 262 0.014395 0.000042 0.002937 +
> > > "head:ASHT" 1 0.000057 0.000000 0.009094 +
> > > "head:cookie" 1 0.000057 0.000000 0.009094 +
> > > "rcvd:c12.groups.msn.com" 1 0.000057 0.000000 0.009094 +
> > > "head:Server" 84 0.003957 0.000057 0.014339 +
> > > "head:http" 3782 0.173596 0.002875 0.016297 +
> > > "head:Status" 550 0.016631 0.000990 0.056210 +
> > > "head:Mail" 652 0.018294 0.001268 0.064843 +
> > > "head:Performance" 2 0.000057 0.000004 0.066313 +
> > > "head:X-Server" 2 0.000057 0.000004 0.066313 +
> > > "head:surgemail.com" 2 0.000057 0.000004 0.066313 +
> > > "rcvd:SMTPSVC" 3950 0.096519 0.008634 0.082112 +
> > > "rcvd:Microsoft" 3948 0.096404 0.008634 0.082201 +
> > > "head:Apr" 161 0.003498 0.000381 0.098227 +
> > > "head:us-ascii" 11023 0.164994 0.031025 0.158275 -
> > > "head:X-User" 4 0.000057 0.000011 0.167700 -
> > > "head:Content-Transfer-Encoding" 48462 0.595917 0.144997 0.195700 -
> > > "rcvd:with" 45772 0.555313 0.137448 0.198407 -
> > > "head:charset" 49550 0.590010 0.149533 0.202197 -
> > > "rcvd:mustang.sdc.com.au" 4193 0.048632 0.012740 0.207584 -
> > > "head:bit" 45510 0.522338 0.138640 0.209751 -
> > > "one" 51375 0.585938 0.156754 0.211062 -
> > > "head:plain" 42478 0.466938 0.130772 0.218788 -
> > > "rcvd:SMTP" 16245 0.176636 0.050140 0.221100 -
> > > "head:text" 50496 0.528531 0.157219 0.229266 -
> > > "head:From" 1965 0.019212 0.006208 0.244220 -
> > > "rcvd:Fri" 10516 0.090153 0.034064 0.274230 -
> > > "rcvd:from" 73690 0.621437 0.239385 0.278089 -
> > > "head:Content-Type" 121137 0.920227 0.400249 0.303110 -
> > > "url:85" 244 0.001835 0.000807 0.305556 -
> > > "head:High" 348 0.002466 0.001162 0.320224 -
> > > "rcvd:pickup" 498 0.003498 0.001664 0.322390 -
> > > "rcvd:service" 501 0.003498 0.001676 0.323887 -
> > > "are" 84056 0.556919 0.283150 0.337056 -
> > > "rcvd:mail" 568 0.003670 0.001920 0.343399 -
> > > "head:Fri" 385 0.002409 0.001306 0.351647 -
> > > "the" 158589 0.889660 0.544919 0.379846 -
> > > "and" 156941 0.832483 0.542439 0.394524 -
> > > "head:Date" 122391 0.602397 0.426132 0.414312 -
> > > "head:Message-ID" 108546 0.531800 0.378091 0.415534 -
> > > "rcvd:Apr" 4258 0.020416 0.014861 0.421264 -
> > > "head:X-Mailer" 96664 0.345013 0.345242 0.500165 -
> > > "head:rootsquest.com" 0 0.000000 0.000000 0.520000 -
> > > "rcvd:n0d915383632d4" 0 0.000000 0.000000 0.520000 -
> > > "rtrn:eddy" 0 0.000000 0.000000 0.520000 -
> > > "rtrn:rootsquest.com" 0 0.000000 0.000000 0.520000 -
> > > "url:85.75.79" 0 0.000000 0.000000 0.520000 -
> > > "url:85.75.79.173" 0 0.000000 0.000000 0.520000 -
> > > "Gate" 289 0.000688 0.001055 0.605202 -
> > > "http" 196856 0.447095 0.720053 0.616934 -
> > > "to:sdc.com.au" 240854 0.467626 0.886260 0.654604 -
> > > "online" 17982 0.030338 0.066471 0.686623 -
> > > "to:anonymous" 23739 0.022252 0.088935 0.799871 -
> > > "trusted" 742 0.000688 0.002780 0.801579 -
> > > "to:scldad" 117239 0.106211 0.439462 0.805358 -
> > > "largest" 3764 0.001262 0.014252 0.918670 +
> > > "head:eddy" 1 0.000000 0.000004 0.991605 +
> > > "url:85.75" 2 0.000000 0.000008 0.995766 +
> > > "from:rootsquest.com" 4 0.000000 0.000015 0.997873 +
> > > "grandonliencasino.com" 4 0.000000 0.000015 0.997873 +
> > > "mostt" 4 0.000000 0.000015 0.997873 +
> > > "from:eddy" 17 0.000000 0.000065 0.999498 +
> > > "casino_bonus" 42 0.000000 0.000160 0.999797 +
> > > "subj:casino_bonus" 42 0.000000 0.000160 0.999797 +
> > > "from:Inman" 60 0.000000 0.000229 0.999858 +
> > > "from:Candace" 79 0.000000 0.000301 0.999892 +
> > > "casinos" 505 0.000000 0.001923 0.999983 +
> > > "Golden" 658 0.000000 0.002506 0.999987 +
> > > "casino" 2956 0.000000 0.011258 0.999997 +
> > > N_P_Q_S_s_x_md 31 0.000000 0.000000 0.500000
> > > 0.017800 0.520000 0.375000
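
The `fw` and `U` columns in the table above can be reproduced from `n`, `pgood`, and `pbad`. The sketch below assumes Robinson's formula, fw = (s*x + n*p) / (s + n) with p = pbad / (pgood + pbad), and assumes that the three values on the table's last line (0.017800, 0.520000, 0.375000) are the strength `s`, the unknown-token score `x`, and the minimum deviation. The function names are hypothetical; the row values are copied from the table.

```python
ROBS = 0.0178    # assumed: strength s, first value on the table's last line
ROBX = 0.52      # assumed: score x given to tokens never seen before
MIN_DEV = 0.375  # assumed: tokens closer than this to 0.5 are ignored


def fw(n, pgood, pbad, robs=ROBS, robx=ROBX):
    """Robinson's smoothed token score; falls back to robx with no data."""
    if n == 0 or pgood + pbad == 0:
        return robx
    p = pbad / (pgood + pbad)
    return (robs * robx + n * p) / (robs + n)


def is_used(score, min_dev=MIN_DEV):
    """True if the token is far enough from 0.5 to count ('+' in column U)."""
    return abs(score - 0.5) >= min_dev


rows = [
    ("head:ASHT", 1, 0.000057, 0.000000),
    ("head:Apr", 161, 0.003498, 0.000381),
    ("to:anonymous", 23739, 0.022252, 0.088935),
    ("largest", 3764, 0.001262, 0.014252),
]
for token, n, pgood, pbad in rows:
    score = fw(n, pgood, pbad)
    print(f"{token:15s} fw={score:.6f} {'+' if is_used(score) else '-'}")
```

Under these assumptions, to:anonymous scores about 0.80, which is only 0.30 away from 0.5 and so falls inside the 0.375 cutoff, while head:Apr at about 0.098 is 0.40 away and is counted. The skew of the raw counts matters only through the ratio p, not through its absolute size.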
> >
> > _______________________________________________
> > Bogofilter mailing list
> > Bogofilter at bogofilter.org
> > http://www.bogofilter.org/mailman/listinfo/bogofilter
>
> --
> ========================================================================
> This email is for the person(s) identified above, and is confidential to
> the sender and the person(s). No one else is authorised to use or
> disseminate this email or its contents.
>
> Stephen Davies Consulting Voice: 08-8177 1595
> Adelaide, South Australia. Fax: 08-8177 0133
> Computing & Network solutions. Mobile:0403 0405 83