charset implementation progress

Matt Armstrong matt at lickey.com
Tue Nov 26 20:15:30 CET 2002


David Relson <relson at osagesoftware.com> writes:

> With these routines in place, the regression test results have changed
> a little bit.  Since "iso-8859-1", "us-ascii", etc. are now processed by
> the got_charset() routine and are not passed on as tokens...

Is it possible to pass them on as tokens too?  The actual charset of
the message is a reliable spam indicator for me.  For example,
charset="ks_c_5601-1987" and charset=euc-kr often showed up among the
tokens chosen for the calculation in the original Graham method.
I'd hate to lose them.
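
To make the request concrete, here is a rough sketch of what I have in
mind.  It is not bogofilter code: charset_token() and the "charset:"
prefix are names I made up for illustration, and I am assuming the
charset handling can see the raw Content-Type header value.  The idea
is only that the charset name is also emitted as an ordinary lowercase
token in addition to whatever got_charset() does with it.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of bogofilter: find the charset
 * parameter in a Content-Type header value and write it to out as a
 * lowercase "charset:<name>" token.
 * Returns 1 if a non-empty charset name was found, 0 otherwise. */
static int charset_token(const char *header, char *out, size_t outlen)
{
    static const char key[] = "charset=";
    static const char prefix[] = "charset:";
    const char *p = header;
    size_t n;

    /* Need room for the prefix, at least one character, and the NUL. */
    if (outlen <= sizeof prefix)
        return 0;

    /* Case-insensitive search for "charset=". */
    for (; *p != '\0'; p++) {
        size_t i = 0;
        while (key[i] != '\0' && tolower((unsigned char)p[i]) == key[i])
            i++;
        if (key[i] == '\0')
            break;
    }
    if (*p == '\0')
        return 0;

    p += strlen(key);
    if (*p == '"')              /* the value may or may not be quoted */
        p++;

    strcpy(out, prefix);
    n = strlen(prefix);

    /* Copy the name up to a closing quote, ';', whitespace, or end. */
    while (*p != '\0' && *p != '"' && *p != ';'
           && !isspace((unsigned char)*p) && n + 1 < outlen)
        out[n++] = (char)tolower((unsigned char)*p++);
    out[n] = '\0';

    return n > strlen(prefix);
}
```

With this, a header value of text/plain; charset="ks_c_5601-1987"
would yield the token charset:ks_c_5601-1987, which could then be
scored like any other word.  The prefix is just one way to keep these
tokens from colliding with body text; passing the bare name through
would work too.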


