spam with random words

T. Horsnell (tsh) tsh at mrc-lmb.cam.ac.uk
Mon Jan 12 11:12:13 CET 2004


Hi all,
Since Christmas we have started getting spam which consists
mainly of a couple of URLs embedded in a stream of random
dictionary words thus:

=============================================================================

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>

<TITLE>Message</TITLE>

<META content="MSHTML 6.00.2800.1276" name=GENERATOR></HEAD>
<BODY>
<DIV><!-- Converted from text/plain format --><FONT face=Arial size=2>
<A HREF="http://www.countupandlookaway.com/m2/index.php?AFF_ID=m4">
Hello,<BR>
<BR>
I finally was able to lose the weight I have<BR>
been struggling to lose for years!<BR>
<BR>
And I couldn't believe how simple it was!<BR>
Amazing patch makes you shed the pounds!<BR>
It's Guaranteed to work or your money back!<BR></A>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<a href="http://www.countupandlookaway.com/homepage/">Not intreseted</a><br></FONT></DIV></BODY></HTML>
scold hanna hydrous certitude latus broad monadic embroidery grosset capacious alumina brace leftmost lit bater hackberry andy exponentiate seditious oakley andromeda drafty plague akron parr plaster muscat benight <br>
inlay barrier meniscus caliper hackberry downcast hexagon muscovy hannah demountable hopeful bragging roustabout haifa bunsen berkowitz forgot mythology bluster helena malarial dutiable reredos horology sabbath artisan maneuver bull <br>
beggary forge silicone baseball hershey quip moire madhouse effusion japanese highboy limelight busy maladroit dunlap burmese erodible luxuriant revolutionary saturday cartographer saxophone collet february commentary admissible littleneck quay charleston ophthalmology malton danbury hubby basswood nimh deltoid grapevine <br>
calculus innkeeper ascend holeable handel elbow peal ruthless forestry galaxy contrition beech injunct grape fredrickson mosque guiding plummet quantico broccoli dialup holler furtive rudder edmund craven betroth exercise consort housebroken bellatrix <br>
ionosphere hello freeboot bang casket bunk do cap merriment hitler cofactor doris march portuguese arizona propelling balustrade backscatter heck inspire clifton have crania hatfield heathkit briggs charybdis rockabye gab <br>
beget distributor sagacity breadboard alia apologetic salem bind conceit phosphor shield caviness contrariwise behave progenitor <br>
more rasmussen icosahedral fever buddha depredate peacetime canadian bandpass garth merit betelgeuse quezon doneck <br>
linseed analogue coltsfoot celebrity inaccuracy archfool downgrade fare inelegant dirge napoleon inbreed pipsissewa cute hamster airspeed parentheses abreact exit d'oeuvre <br>
anagram hymn arpa bungalow comedian essay ancestor fillip maternal noun <br>
acquire hymnal casework sci guildhall garry numerous comparative duck hazard homily comic laudanum oilman collier compositor leap collector <br>

=============================================================================


These often contain the same (mis-spelled) phrases (e.g. "Not intreseted"),
which still get through bogofilter, presumably because their contribution
is drowned out by the random words. Is there (or could there be) some way to
increase the significance of particular phrases?
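
To make the "drowning" effect concrete, here is a minimal sketch using
Robinson's geometric-mean rule for combining per-token spam probabilities.
It is not bogofilter's actual code. The function names, the token
probabilities (0.99 for a tell-tale phrase, 0.4 for a random dictionary
word) and the `weight` parameter are all assumptions chosen for
illustration; `weight` is a hypothetical knob, not an existing bogofilter
option.

```python
import math

def robinson_score(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)
    with Robinson's geometric-mean rule. Returns a value in (0, 1);
    higher means more spam-like."""
    n = len(probs)
    # Work in log space so long messages do not underflow.
    p = 1.0 - math.exp(sum(math.log(1.0 - x) for x in probs) / n)
    q = 1.0 - math.exp(sum(math.log(x) for x in probs) / n)
    return (1.0 + (p - q) / (p + q)) / 2.0

def message_score(n_spammy, n_random, spammy_p=0.99, random_p=0.4, weight=1):
    """Score a message made of n_spammy tell-tale tokens and n_random
    dictionary words. The illustrative 'weight' simply counts each
    tell-tale token that many times (hypothetical, for demonstration)."""
    probs = [spammy_p] * (n_spammy * weight) + [random_p] * n_random
    return robinson_score(probs)

if __name__ == "__main__":
    print("phrases only:      ", round(message_score(3, 0), 3))
    print("plus 200 words:    ", round(message_score(3, 200), 3))
    print("weighted phrases:  ", round(message_score(3, 200, weight=20), 3))
```

Under these assumed numbers, three strongly spammy tokens score high on
their own, fall below 0.5 once 200 mildly ham-like words are mixed in, and
recover part of the way when the tell-tale tokens are counted more heavily.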

And could there be subtle ill-effects of adding such messages to the
spam training list?

Cheers,
Terry





