Filter breakers

Thomas Anderson tanderso at oac-design.com
Sat Apr 5 07:43:07 CEST 2008


I also use the milter, but only to drop the very spammy spam (spamicity >0.9).
Whatever is left over goes through the MDA, where the headers get the
extra special treatment which catches most of the rest of them.

Tom
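
The two-stage setup described above can be sketched roughly as follows. This is only an illustration of the routing decision: the 0.9 cutoff comes from the message, while the function name and the returned labels are made up for the example and are not part of the milter, the MDA, or bogofilter.

```python
# Hypothetical sketch of the two-stage routing described in the message.
# Only the 0.9 cutoff is taken from the text; everything else is assumed.
DROP_THRESHOLD = 0.9


def milter_action(spamicity, drop_threshold=DROP_THRESHOLD):
    """Decide what the first (milter) stage does with a message.

    Messages scoring above the threshold are dropped outright; everything
    else is handed on to the MDA stage, where the header prefilter runs
    before the second classification.
    """
    if spamicity > drop_threshold:
        return "drop"
    return "pass_to_mda"
```

With this split, a message scoring 0.95 never reaches the MDA, while one scoring 0.5 is passed on for the header treatment.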

On Sat, 2008-04-05 at 11:16 +0930, Stephen Davies wrote:
> Thanks Tom. Looks interesting.
> 
> I use sendmail, milter and amavisd to invoke bogofilter so I'll have to think 
> about how your code could be included but it certainly looks as if it could 
> help.
> 
> Cheers,
> Stephen
> 
> On Saturday 05 April 2008 01:03, Tom Anderson wrote:
> > Stephen,
> >
> > I wrote a prefilter to handle exactly this kind of problem.  The source
> > is available here: http://orderamidchaos.com/bogofilter/spamitarium
> >
> > I've been using it for 3 or 4 years now, and it works wonderfully for
> > helping to classify spams in which the headers play an outsized role.
> >
> > If you use it, I would appreciate feedback.
> >
> > Tom
> >
> > http://www.linkedin.com/in/orderamidchaos
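
As a rough illustration of what a header prefilter of this general kind might do, the sketch below drops every header that is not on an allow-list before the message is handed to the classifier. It is not spamitarium's code, which is not shown in this thread; the allow-list and the function name are assumptions chosen for the example.

```python
from email import message_from_string

# Hypothetical allow-list. Which headers to keep is a judgment call;
# the MIME headers are kept so the body can still be decoded.
KEEP_HEADERS = {
    "from", "to", "cc", "subject", "received", "return-path",
    "mime-version", "content-type", "content-transfer-encoding",
}


def strip_headers(raw_message, keep=KEEP_HEADERS):
    """Return raw_message with all headers outside `keep` removed."""
    msg = message_from_string(raw_message)
    # Copy the names first: deleting while iterating msg.keys() is unsafe.
    for name in set(msg.keys()):
        if name.lower() not in keep:
            del msg[name]  # removes every occurrence of that header
    return msg.as_string()
```

Under these assumptions, client-side headers such as X-Status or the X-KMail-* family would never reach the tokenizer, so they could not contribute head: tokens to the score.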
> >
> > Stephen Davies wrote:
> > > I am still getting too many "obvious" spams slipping through my
> > > bogofilter setup.
> > >
> > > The more I investigate, the more it seems that quite innocuous headers
> > > are at least part of my problem.
> > >
> > > The following bogoutil output is quite common. The obviously spam
> > > components are outweighed by quite harmless header tokens - one of the
> > > most commonly appearing being the current month header (head:Apr).
> > >
> > > Is there any way to push such header tokens out of the picture?
> > > (In the example below, the to:anonymous token is ignored even
> > > though the word counts are quite skewed: 23351 to 388.)
> > >
> > > My database is some 200 MB with 3.5 million tokens.
> > >
> > > TIA,
> > > Stephen Davies
> > >
> > > X-Bogosity: Ham, tests=bogofilter, spamicity=0.500000, version=1.1.5
> > >                                         n    pgood     pbad      fw     U
> > >   "head:X-KMail-EncryptionState"      262  0.014395  0.000042  0.002937 +
> > >   "head:X-KMail-MDN-Sent"             262  0.014395  0.000042  0.002937 +
> > >   "head:X-KMail-SignatureState"       262  0.014395  0.000042  0.002937 +
> > >   "head:X-Status"                     262  0.014395  0.000042  0.002937 +
> > >   "head:ASHT"                           1  0.000057  0.000000  0.009094 +
> > >   "head:cookie"                         1  0.000057  0.000000  0.009094 +
> > >   "rcvd:c12.groups.msn.com"             1  0.000057  0.000000  0.009094 +
> > >   "head:Server"                        84  0.003957  0.000057  0.014339 +
> > >   "head:http"                        3782  0.173596  0.002875  0.016297 +
> > >   "head:Status"                       550  0.016631  0.000990  0.056210 +
> > >   "head:Mail"                         652  0.018294  0.001268  0.064843 +
> > >   "head:Performance"                    2  0.000057  0.000004  0.066313 +
> > >   "head:X-Server"                       2  0.000057  0.000004  0.066313 +
> > >   "head:surgemail.com"                  2  0.000057  0.000004  0.066313 +
> > >   "rcvd:SMTPSVC"                     3950  0.096519  0.008634  0.082112 +
> > >   "rcvd:Microsoft"                   3948  0.096404  0.008634  0.082201 +
> > >   "head:Apr"                          161  0.003498  0.000381  0.098227 +
> > >   "head:us-ascii"                   11023  0.164994  0.031025  0.158275 -
> > >   "head:X-User"                         4  0.000057  0.000011  0.167700 -
> > >   "head:Content-Transfer-Encoding"   48462  0.595917  0.144997  0.195700 -
> > >   "rcvd:with"                       45772  0.555313  0.137448  0.198407 -
> > >   "head:charset"                    49550  0.590010  0.149533  0.202197 -
> > >   "rcvd:mustang.sdc.com.au"          4193  0.048632  0.012740  0.207584 -
> > >   "head:bit"                        45510  0.522338  0.138640  0.209751 -
> > >   "one"                             51375  0.585938  0.156754  0.211062 -
> > >   "head:plain"                      42478  0.466938  0.130772  0.218788 -
> > >   "rcvd:SMTP"                       16245  0.176636  0.050140  0.221100 -
> > >   "head:text"                       50496  0.528531  0.157219  0.229266 -
> > >   "head:From"                        1965  0.019212  0.006208  0.244220 -
> > >   "rcvd:Fri"                        10516  0.090153  0.034064  0.274230 -
> > >   "rcvd:from"                       73690  0.621437  0.239385  0.278089 -
> > >   "head:Content-Type"              121137  0.920227  0.400249  0.303110 -
> > >   "url:85"                            244  0.001835  0.000807  0.305556 -
> > >   "head:High"                         348  0.002466  0.001162  0.320224 -
> > >   "rcvd:pickup"                       498  0.003498  0.001664  0.322390 -
> > >   "rcvd:service"                      501  0.003498  0.001676  0.323887 -
> > >   "are"                             84056  0.556919  0.283150  0.337056 -
> > >   "rcvd:mail"                         568  0.003670  0.001920  0.343399 -
> > >   "head:Fri"                          385  0.002409  0.001306  0.351647 -
> > >   "the"                            158589  0.889660  0.544919  0.379846 -
> > >   "and"                            156941  0.832483  0.542439  0.394524 -
> > >   "head:Date"                      122391  0.602397  0.426132  0.414312 -
> > >   "head:Message-ID"                108546  0.531800  0.378091  0.415534 -
> > >   "rcvd:Apr"                         4258  0.020416  0.014861  0.421264 -
> > >   "head:X-Mailer"                   96664  0.345013  0.345242  0.500165 -
> > >   "head:rootsquest.com"                 0  0.000000  0.000000  0.520000 -
> > >   "rcvd:n0d915383632d4"                 0  0.000000  0.000000  0.520000 -
> > >   "rtrn:eddy"                           0  0.000000  0.000000  0.520000 -
> > >   "rtrn:rootsquest.com"                 0  0.000000  0.000000  0.520000 -
> > >   "url:85.75.79"                        0  0.000000  0.000000  0.520000 -
> > >   "url:85.75.79.173"                    0  0.000000  0.000000  0.520000 -
> > >   "Gate"                              289  0.000688  0.001055  0.605202 -
> > >   "http"                           196856  0.447095  0.720053  0.616934 -
> > >   "to:sdc.com.au"                  240854  0.467626  0.886260  0.654604 -
> > >   "online"                          17982  0.030338  0.066471  0.686623 -
> > >   "to:anonymous"                    23739  0.022252  0.088935  0.799871 -
> > >   "trusted"                           742  0.000688  0.002780  0.801579 -
> > >   "to:scldad"                      117239  0.106211  0.439462  0.805358 -
> > >   "largest"                          3764  0.001262  0.014252  0.918670 +
> > >   "head:eddy"                           1  0.000000  0.000004  0.991605 +
> > >   "url:85.75"                           2  0.000000  0.000008  0.995766 +
> > >   "from:rootsquest.com"                 4  0.000000  0.000015  0.997873 +
> > >   "grandonliencasino.com"               4  0.000000  0.000015  0.997873 +
> > >   "mostt"                               4  0.000000  0.000015  0.997873 +
> > >   "from:eddy"                          17  0.000000  0.000065  0.999498 +
> > >   "casino_bonus"                       42  0.000000  0.000160  0.999797 +
> > >   "subj:casino_bonus"                  42  0.000000  0.000160  0.999797 +
> > >   "from:Inman"                         60  0.000000  0.000229  0.999858 +
> > >   "from:Candace"                       79  0.000000  0.000301  0.999892 +
> > >   "casinos"                           505  0.000000  0.001923  0.999983 +
> > >   "Golden"                            658  0.000000  0.002506  0.999987 +
> > >   "casino"                           2956  0.000000  0.011258  0.999997 +
> > >   N_P_Q_S_s_x_md                       31  0.000000  0.000000  0.500000
> > >                                            0.017800  0.520000  0.375000
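
For readers following the numbers in the table above: the fw column appears consistent with Robinson's smoothed estimate, fw = (s*x + n*p) / (s + n), where p = pbad / (pgood + pbad), and where the three values on the last line (0.017800, 0.520000, 0.375000) are read as s, x and the minimum deviation md. That reading is an assumption based on the label N_P_Q_S_s_x_md and on checking a few rows by hand, not on bogofilter's source; the function names below are made up for the sketch.

```python
# Parameters read from the last line of the table (assumed to be s, x, md).
ROBS = 0.0178     # s: weight given to the prior
ROBX = 0.52       # x: score assumed for a token never seen before
MIN_DEV = 0.375   # md: minimum distance from 0.5 for a token to count


def fw(pgood, pbad, n, robs=ROBS, robx=ROBX):
    """Smoothed spam score of one token from its table columns."""
    total = pgood + pbad
    p = pbad / total if total > 0 else 0.0
    return (robs * robx + n * p) / (robs + n)


def is_used(score, min_dev=MIN_DEV):
    """True if the token is far enough from 0.5 to enter the final score."""
    return abs(score - 0.5) >= min_dev
```

Under this reading, fw(0.000057, 0.0, 1) gives about 0.0091, matching the head:ASHT row, and a token with n = 0 gets exactly the prior 0.52. It would also explain the to:anonymous question: its fw of about 0.80 is only 0.30 away from 0.5, which is below the 0.375 cutoff, whereas head:Apr at about 0.098 is 0.40 away and therefore counts.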
> >
> > _______________________________________________
> > Bogofilter mailing list
> > Bogofilter at bogofilter.org
> > http://www.bogofilter.org/mailman/listinfo/bogofilter
> 
> -- 
> ========================================================================
> This email is for the person(s) identified above, and is confidential to
> the sender and the person(s).  No one else is authorised to use or
> disseminate this email or its contents.
> 
> Stephen Davies Consulting                            Voice: 08-8177 1595
> Adelaide, South Australia.                             Fax: 08-8177 0133
> Computing & Network solutions.                       Mobile:0403 0405 83



