further advice for asian spam and spam assassin text
David Relson
relson at osagesoftware.com
Wed Sep 24 17:49:28 CEST 2003
On Wed, 24 Sep 2003 17:13:48 +0200
Boris 'pi' Piwinger <3.14 at logic.univie.ac.at> wrote:
> > ----- Forwarded message from XAEvxzl at iris.seed.net.tw -----
>
> Please do *not* forward spam to this list. It pollutes the
> database.
pete,
What I received looked like "?M???S??(????30??1000??)????????". As
question marks aren't accepted by bogofilter's parser as part of a
token, this parses as (roughly), "M", "S", "30", "1000". None of these
are valid tokens because they're too short or all numeric. So the
forwarded message looks pretty darn harmless.
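
To make that concrete, here is a rough Python sketch of the filtering described above. It is not bogofilter's actual lexer: the function name rough_tokens, the split on non-alphanumeric characters, and the minimum length of 3 are all assumptions chosen only to illustrate the "too short or all numeric" rule.

```python
import re

# Assumed minimum token length, for illustration only;
# not taken from bogofilter's source.
MIN_TOKEN_LEN = 3

def rough_tokens(text):
    """Return (candidates, kept) for a line of text.

    Candidates are the runs of ASCII letters and digits; everything else,
    including '?', acts as a separator. Kept tokens are the candidates
    that are long enough and not made up entirely of digits.
    """
    candidates = [c for c in re.split(r"[^A-Za-z0-9]+", text) if c]
    kept = [c for c in candidates
            if len(c) >= MIN_TOKEN_LEN and not c.isdigit()]
    return candidates, kept

candidates, kept = rough_tokens("?M???S??(????30??1000??)????????")
print(candidates)  # ['M', 'S', '30', '1000']
print(kept)        # []
```

Under these assumptions nothing from the forwarded line survives as a token, which matches the "harmless" conclusion.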
When I want to include a spam message or mailbox, I gzip it, knowing that
the binary-encoded attachment will not bother bogofilter.
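
For anyone who prefers to script the compression step, a minimal sketch using Python's standard gzip module might look like this. The helper name gzip_file and the file names in the usage comment are hypothetical; the command-line gzip tool does the same job.

```python
import gzip
import shutil

def gzip_file(src_path, dst_path):
    """Compress src_path into dst_path in gzip format."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream the data so a large mailbox is not read into memory at once.
        shutil.copyfileobj(src, dst)

# Hypothetical usage:
#   gzip_file("spam.mbox", "spam.mbox.gz")
# then attach spam.mbox.gz to the message instead of pasting the spam inline.
```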
>
> > Subject: *****SPAM***** ¥þ³¡¥X²M
> ~~~~~~~~
>
> This could be used (eight non-ASCII characters in a row).
>
> > ?M???S??(????30??1000??)????????
> [...]
>
> This is pretty much what happens without charset declaration.
... might be readable by someone whose default charset is the same as
the incoming (undeclared) text. A spammer who sends undeclared Chinese
to someone expecting Japanese has failed to provide a readable message.
If Darwin is right, they won't survive in the spam business.
Peace,
David