decoding implementation

Clint Adams schizo at debian.org
Tue Nov 26 02:35:04 CET 2002


> Oh yes they will. Microsoft Office and other crap will happily send
> these typographic quotes in documents, and what's even more fun, declare
> ISO-8859-1 for that (rather than Windows-1252 which would be correct.)
> or us-ascii or... Try typing "this is strange" (with quotes) in MS Word
> and see how it changes that to “this is strange”.

Those are legitimate punctuation marks, though, and in theory bogofilter
will someday recognize them as such (assuming the charset is
declared properly), and not treat them as part of "this" or "strange".
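
As a rough illustration of why the declared charset matters, here is a
minimal sketch in Python. It is not bogofilter's lexer (bogofilter is written
in C); the `tokens` helper is a hypothetical stand-in that splits on
whitespace and on straight or curly double quotes. The bytes 0x93 and 0x94
are the curly quotes in Windows-1252 but C1 control characters in ISO-8859-1,
so a mis-declared charset leaves them glued to the neighbouring words.

```python
import re

# Bytes as a Windows-1252 producer would emit them:
# 0x93 and 0x94 are the left and right curly double quotes.
raw = b"\x93this is strange\x94"

as_cp1252 = raw.decode("cp1252")    # charset declared correctly
as_latin1 = raw.decode("latin-1")   # charset mis-declared as ISO-8859-1

def tokens(text):
    """Hypothetical tokenizer: split on whitespace and double quotes."""
    return [t for t in re.split(r"[\s\u201c\u201d\"]+", text) if t]

print(tokens(as_cp1252))  # quotes recognized as punctuation and dropped
print(tokens(as_latin1))  # control characters stay attached to the words
```

With the correct charset the tokens are the three bare words; with the wrong
one, the first and last tokens still carry the stray 0x93 and 0x94 characters.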

If someone does send me mail with the wrong charset, though, I'm going
to yell at them to fix their mail software, much as I complain now when
someone sends me HTML mail or raw (non-RFC-2047) ISO-8859-{1,15} in
headers.
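
For comparison, this is roughly what a properly encoded header looks like,
sketched with Python's standard `email.header` module (the subject text is
made up for the example). An RFC 2047 encoded-word keeps the header pure
ASCII and names the charset, so the receiver does not have to guess.

```python
from email.header import Header, decode_header, make_header

subject = "Grüße"  # made-up subject with non-ASCII ISO-8859-1 characters

# Encode as an RFC 2047 encoded-word: ASCII on the wire, charset stated.
encoded = Header(subject, charset="iso-8859-1").encode()
print(encoded)

# The receiver recovers the text because the charset travels with the header.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```

Raw 8-bit bytes in a header carry no such label, which is why they cannot be
decoded reliably.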



More information about the bogofilter-dev mailing list