[PATCH] Better tagging.
Matthias Andree
matthias.andree at gmx.de
Mon Sep 15 01:33:29 CEST 2003
David Relson <relson at osagesoftware.com> writes:
>> True. My idea would then be to use the traditional "return TOKEN"
>> approach, but extend it to add the h: tag when we're in header mode
>> (as opposed to body mode).
>
> Given the sensitivity of the parser to changes in the rules, I've opted
> for the conservative approach, i.e. parse with new and old rules,
> identify differences, revise to avoid problems, then (and only then) do
> the big test to quantify the value of the changes.
That's fine.
> In the patch, the "h:" tag is set by calling set_tag("Head") for each
> newline. Interesting lines call set_tag() which replaces the "h:" tag.
>
> Are you suggesting that "charset=us-ascii" should produce "h:charset"
> and "h:us-ascii" ??
Yes, I am. It shouldn't replace h:"us-ascii" though. We're not yet
scoring word pairs as compounds or using conditional probabilities.
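
For illustration only, here is a minimal Python sketch of the tagging
behaviour described above. It is not bogofilter's actual lexer: the names
tokenize and TOKEN_RE, the token pattern, and the rule that the first blank
line ends header mode are all assumptions made for this example. Header
tokens get the "h:" prefix, body tokens are left untagged, and
"charset=us-ascii" yields two separate tokens instead of one compound.

```python
import re

# Hypothetical token pattern (an assumption, not bogofilter's real rules):
# runs of letters/digits, optionally joined by single hyphens.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def tokenize(message):
    """Return the message's tokens; header tokens carry an 'h:' prefix."""
    tokens = []
    in_header = True
    for line in message.splitlines():
        if in_header and line == "":
            # Assumed convention: the first blank line switches to body mode.
            in_header = False
            continue
        prefix = "h:" if in_header else ""
        for word in TOKEN_RE.findall(line):
            tokens.append(prefix + word)
    return tokens


if __name__ == "__main__":
    sample = "Content-Type: text/plain; charset=us-ascii\n\ncharset matters\n"
    for token in tokenize(sample):
        print(token)
```

Because each word is tagged on its own, "h:charset" in the header and plain
"charset" in the body are counted as different tokens, and no word pair is
scored as a unit.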
--
Matthias Andree
Encrypt your mail: my GnuPG key ID is 0x052E7D95