spam with random words

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Mon Jan 12 12:42:09 CET 2004


T. Horsnell (tsh) wrote:

> Since Christmas we have started getting spam which consists
> mainly of a couple of URLs embedded in a stream of random
> dictionary words thus:

Random words don't really hurt, because they are not significant.

[example message]

Any reason you put the message here? This will spoil some
people's databases.

> These often contain the same (mis-spelled) phrases (e.g. Not intreseted)
> which still get through bogofilter presumably because their contribution
> is drowned by the random stuff.

There is an easy way to find out. Use -vvv to see which
words matter.

> Is there (could there be) some way to
> increase the significance of particular phrases?

I don't see a useful phrase. Only the one misspelled word
could be significant, and it may or may not be.

> And could there be subtle ill-effects of adding such messages to the
> spam training list?

No. Those random words will show up in many messages (good
and bad), so each of them is moved only slightly toward
spammish. But that is the whole idea behind statistics.
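
To make that concrete, here is a minimal sketch, assuming a
Robinson-style smoothed per-word score. The function name, the
message counts, and the constants s and x are illustrative
assumptions, not bogofilter's actual code or configuration.

```python
def word_score(spam_count, ham_count, n_spam, n_ham, s=1.0, x=0.5):
    """Smoothed spam score for one word (illustrative sketch only).

    spam_count / ham_count: messages of each kind containing the word.
    n_spam / n_ham: total messages of each kind in the training set.
    s: weight given to the prior; x: score assumed for an unseen word.
    """
    spam_freq = spam_count / n_spam
    ham_freq = ham_count / n_ham
    total = spam_freq + ham_freq
    if total == 0:
        return x  # never seen: fall back to the neutral prior
    p = spam_freq / total          # raw estimate from the counts
    n = spam_count + ham_count     # how much evidence we have
    return (s * x + n * p) / (s + n)


# A dictionary word seen equally often in spam and ham (hypothetical counts).
before = word_score(spam_count=40, ham_count=40, n_spam=1000, n_ham=1000)
# The same word after training on one more spam message containing it.
after = word_score(spam_count=41, ham_count=40, n_spam=1001, n_ham=1000)
# A word seen only in spam, such as a recurring misspelling.
spammy = word_score(spam_count=30, ham_count=0, n_spam=1000, n_ham=1000)

print(f"common word before: {before:.3f}")
print(f"common word after:  {after:.3f}")
print(f"spam-only word:     {spammy:.3f}")
```

Under these assumptions the common word stays close to the neutral
score after one more spam registration, while the spam-only word
scores far from neutral and is the kind of token that decides the
classification.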

pi




