info about spam messages
Chris Wilkes
cwilkes-bf at ladro.com
Fri Jun 11 16:33:33 CEST 2004
On Fri, Jun 11, 2004 at 09:54:04AM -0400, David Relson wrote:
>
> Question: How well does bogofilter's text parsing work with Turkish?
On a related note, I started to wonder what spam looks like in Korean,
Chinese, or any other language that uses glyphs.
In languages that are ASCII-based (how's that for rewriting history?)
there's a lot of spam with words that are slightly misspelled or written
in elite hacker speak:
via-gra
v1agr4
etc. Is there the same beast in glyph-based languages? Can one have a
character that looks like a real word or phrase? Are there nonsense
words like "gra"?
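
To make the ASCII case concrete, here is a rough Python sketch of how the
two variants above could be folded back to one token. The substitution
table and the normalize_token helper are made up for illustration; this
is not something bogofilter is known to do, and a real table would need
many more entries.

```python
import re

# Hypothetical digit/symbol-for-letter table; purely illustrative.
LEET_MAP = str.maketrans({"1": "i", "4": "a", "3": "e", "0": "o", "@": "a", "$": "s"})

def normalize_token(token: str) -> str:
    """Lowercase, undo common look-alike swaps, then drop non-letters."""
    lowered = token.lower().translate(LEET_MAP)
    # Removing everything but a-z also strips the hyphen in "via-gra".
    return re.sub(r"[^a-z]", "", lowered)

for raw in ("via-gra", "v1agr4"):
    print(raw, "->", normalize_token(raw))
```

Both variants come out as the same string, so a filter that saw only the
normalized form would count them as one token. The catch is that the
table is tied to Latin letters, which is exactly why the question about
glyph-based scripts is interesting.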
I suppose with Unicode you can't have the top half of one glyph as one
token and the bottom half as a second token (so that when reading you
mash the two together to form one glyph), as there isn't a Unicode
entry for that, right?
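
One way to poke at this question is Python's standard unicodedata
module. The sketch below only shows what canonical decomposition (NFD)
does to two sample characters; the code_points helper is made up for
illustration and says nothing about how any spam filter tokenizes text.

```python
import unicodedata

def code_points(text: str) -> list[str]:
    """Return the code points of text after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return [f"U+{ord(ch):04X}" for ch in decomposed]

# A precomposed Hangul syllable splits into separate jamo code points.
print(code_points("\ud55c"))   # U+D55C, one Korean syllable block

# A CJK ideograph has no canonical decomposition and stays whole.
print(code_points("\u4e2d"))   # U+4E2D
```

So for Korean, at least, a single syllable block can be written as a
sequence of smaller code points that a renderer joins together, while a
Chinese ideograph like the one above is a single indivisible entry.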
Chris