Serious problem with non-ASCII words

Jonathan Buzzard jonathan at buzzard.org.uk
Fri Sep 20 20:26:25 CEST 2002



3.14 at logic.univie.ac.at said:
> Clearly whitespace and line ending are word delimiters. Also
> punctuation. This assumes we have charsets which are compatible with
> ASCII, though. But I don't see how we can do better. How about
> hyphens? 

Well we could try paying attention to the "Content-type" header.
For example the original mail in this thread had a Content-type
header like this

    Content-type: text/plain; charset=ISO-8859-1

And your mail had one like this

    Content-type: text/plain; charset=us-ascii

One would have thought that from this point it is fairly obvious what to do.
Frankly, Bogofilter should handle the same email in the same way
regardless of what locale it happens to be running under at the time.
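
As a rough illustration of the idea (a sketch in Python, not Bogofilter's
actual code), one could read the charset from the Content-type header,
decode the body to Unicode, and only then split it into words. The function
names decode_body and tokens are made up for this example, and it assumes
a simple non-multipart message.

```python
import re
from email import message_from_bytes


def decode_body(raw: bytes) -> str:
    """Decode a single-part message body using its declared charset."""
    msg = message_from_bytes(raw)
    # Fall back to us-ascii when no charset parameter is present.
    charset = msg.get_content_charset() or "us-ascii"
    # decode=True undoes any Content-Transfer-Encoding and returns bytes;
    # it returns None for multipart messages, which this sketch ignores.
    payload = msg.get_payload(decode=True) or b""
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: assume Latin-1 so every byte still maps.
        return payload.decode("latin-1")


def tokens(raw: bytes) -> list[str]:
    """Split the decoded body into words, independent of the locale."""
    # On str patterns, \w matches Unicode word characters by default,
    # so the result does not depend on LC_CTYPE.
    return re.findall(r"\w+", decode_body(raw))
```

With this, a body declared as ISO-8859-1 gives the same words whatever
locale the process runs under, because the bytes are interpreted according
to the header rather than the environment.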

JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan at buzzard.org.uk
Northumberland, United Kingdom.       Tel: +44(0)1661-832195


