Spammers getting more sneaky...

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Jun 25 11:49:44 CEST 2003


Jonathan Hunt wrote:

> I had a VERY sneaky spam arrive this morning that almost got through
> bogofilter.  The only reason it didn't was because it was sent to an
> address that I only ever get spam to, and so the "To:" address was enough
> to pull it above the spam threshold.

Thanks, Jonathan, for sending me that mail. My bogofilter
had no problem with it. Here is the relevant part of my scan:

> X-Bogosity: Spam, spamicity=0.810, version=0.13.7/fisher
>                                      n    pgood     pbad      fw     U
> "unpack"                             3  0.000134  0.000000  0.001379 +
> "Content-Length"                     2  0.000089  0.000000  0.002065 +
> "castles"                            1  0.000045  0.000000  0.004109 +
> "Lines"                            369  0.016215  0.000426  0.025627 +
[...]
> "style"                           1050  0.003171  0.069556  0.956387 +
> "daddy"                             15  0.000045  0.000995  0.956661 +
> "offers"                          1455  0.002814  0.098899  0.972329 +
> "size"                            5699  0.010542  0.388135  0.973557 +
> "Miss"                              28  0.000045  0.001918  0.977043 +
> "Visit"                            562  0.000893  0.038508  0.977316 +
> "font-size"                        274  0.000402  0.018828  0.979073 +
> "SIZE"                             591  0.000715  0.040853  0.982797 +
> "Faster"                            92  0.000089  0.006394  0.986159 +
> "href"                            8089  0.005182  0.566465  0.990935 +
> "color"                           5213  0.003037  0.365542  0.991758 +
> "Prescription!"                      1  0.000000  0.000071  0.994208 +
> "nbsp"                            4327  0.001742  0.304654  0.994313 +
> "FFFFFF"                          1047  0.000357  0.073819  0.995177 +
> "cave"                               2  0.000000  0.000142  0.997090 +
> "from:Dionne"                        2  0.000000  0.000142  0.997090 +
> "from:Xiong"                         3  0.000000  0.000213  0.998056 +
> "FF0000"                          1468  0.000089  0.104156  0.999139 +
> "nomore.html"                       12  0.000000  0.000853  0.999513 +
> "Again!"                            16  0.000000  0.001137  0.999635 +
> N_P_Q_S_s_x_md                      24  0.00e+00  6.20e-01  8.10e-01
>                                         1.00e-02  4.15e-01  0.450

Note that my instance of bogofilter has not been trained on
that mail, nor on any sender by that name (the from: hits
above come from different combinations).
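
For anyone checking the numbers in the scan above, here is a
minimal Python sketch that reproduces the "fw" column and the
final spamicity. It assumes that fw is Robinson's smoothed
estimate f(w) = (s*x + n*p) / (s + n) with
p = pbad / (pgood + pbad), that s and x are the first two
values on the second summary line (1.00e-02 and 4.15e-01), and
that the spamicity is S = (1 + Q - P) / 2. The printed values
are consistent with these assumptions, but this is not taken
from the bogofilter source, and the function names are made up
for illustration.

```python
# Sketch only: the formulas are assumptions that happen to match the
# printed scan; token_fw and spamicity are hypothetical helper names.

S_WEIGHT = 0.01   # "s" from the summary line (1.00e-02), assumed meaning
X_PRIOR = 0.415   # "x" from the summary line (4.15e-01), assumed meaning


def token_fw(n, pgood, pbad, s=S_WEIGHT, x=X_PRIOR):
    """Robinson-style smoothed spam probability for one token."""
    total = pgood + pbad
    # Raw estimate from the two frequencies; fall back to x if unseen.
    p = pbad / total if total > 0 else x
    # Pull the raw estimate towards x; the pull fades as n grows.
    return (s * x + n * p) / (s + n)


def spamicity(p_stat, q_stat):
    """Combine the two Fisher statistics P and Q into one score."""
    return (1.0 + q_stat - p_stat) / 2.0


if __name__ == "__main__":
    # (token, n, pgood, pbad) copied from the scan above.
    rows = [
        ("unpack", 3, 0.000134, 0.000000),
        ("Content-Length", 2, 0.000089, 0.000000),
        ("cave", 2, 0.000000, 0.000142),
        ("Again!", 16, 0.000000, 0.001137),
    ]
    for token, n, pgood, pbad in rows:
        print(f"{token:16s} fw={token_fw(n, pgood, pbad):.6f}")
    # P and Q as printed on the first summary line.
    print(f"spamicity={spamicity(0.0, 0.62):.3f}")
```

With s this small, a token seen only two or three times already
lands very close to 0 or 1, which is why rare words such as
"unpack" and "cave" sit at the extremes of the list.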

What saved me here is the detection of lots of HTML. So
anyone who receives a significant amount of legitimate HTML
mail will have a problem.

What I don't understand is why "Content-Length" is virtually
unknown to my database.

pi




