testing parsing changes

Sat Nov 8 18:30:52 CET 2003

David Relson <relson at osagesoftware.com> wrote:

>Modification D:
>	Recognize <!DOCTYPE HTML PUBLIC.*> as the beginning of html text.
>
>Modification T:
>	Accept two character tokens, e.g. "AB", "sp", ...

So this is not exactly my patch, which also allows numbers?

>To establish a baseline result, parts 2, 3, and 4 of the spam messages
>were scored using bogofilter's default parameters (spam_cutoff=0.95,
>min_dev=0.100, robs=0.010, robx=0.415).  

That is full training, which is not the way to get the most
significant tokens.

>The numbers of false
>negatives are printed for each of the 3 parts, as well as a total
>count.  These numbers provide an indication of how accurately
>bogofilter scores spam (though without an indication of the ham
>scoring).

Without results for ham this doesn't say much. Actually,
false positives are way more important. As I described in my
mail about my 2-byte-token/numeric-token test, most added
tokens (pure training on error) were significant, i.e., they
contribute to the calculation. Many of them count toward the
ham side, and reducing false positives is IMHO even more
useful.

So my question for you is: Do you get many significant
tokens?
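
To make "significant" concrete, here is a rough sketch (mine,
not bogofilter's code) of how a token ends up in or out of the
calculation. It assumes the usual Robinson smoothing
f(w) = (robs*robx + n*p) / (robs + n) and that a token counts
only if f(w) is at least min_dev away from 0.5. The function
names, the plain count-ratio estimate of p (which assumes
equally sized spam and ham corpora), and the sample counts are
made up for illustration.

```python
def robinson_f(p, n, robs=0.010, robx=0.415):
    """Smoothed token score: pulls p toward robx when n is small."""
    return (robs * robx + n * p) / (robs + n)


def significant_tokens(stats, min_dev=0.100, robs=0.010, robx=0.415):
    """Return {token: f(w)} for tokens at least min_dev away from 0.5.

    stats maps token -> (spam_count, ham_count). The raw estimate p is a
    plain count ratio here, an assumption that ignores corpus sizes.
    """
    result = {}
    for token, (spam_count, ham_count) in stats.items():
        n = spam_count + ham_count
        p = spam_count / n if n else robx  # unseen token: fall back to robx
        f = robinson_f(p, n, robs, robx)
        if abs(f - 0.5) >= min_dev:
            result[token] = f
    return result


# Hypothetical counts for a few short tokens.
stats = {"AB": (9, 1), "sp": (1, 9), "the": (5, 5), "42": (0, 0)}
sig = significant_tokens(stats)
```

With these made-up counts, "AB" lands on the spam side and "sp"
on the ham side, while "the" and the unseen "42" stay inside
the min_dev band and drop out.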

What interests me most are the results with unmodified
parameters (i.e., not using your target). Are both false
positives and false negatives reduced?

Also interesting: What happens if you take your real
parameters, not the standard ones? This would show what
really happens.

>The more interesting results are found next, using the following
>method: The ham messages are scored and the results are sorted.  A
>target cutoff of 0.25% (of the messages, i.e. 52 for test 1 and 59 for
>test 2) is used to find the cutoff value that gives 0.25% false
>positives.  This cutoff value is then used in scoring the 3 sets of
>spam to see how many of them are scored below the cutoff, i.e. how
>many false negatives occur using the cutoff value.

Here, too, you don't give false positives. The target is not
guaranteed to work as expected; it can be a bit off when
several messages have the same score (I have observed that
in tests). Also, your target is way too high for my taste: it
is 1 false positive in 400 messages (in my case that would
be every other day!).
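
A small sketch of the target-cutoff method as I understand it
from the description, showing how tied scores push the real
false positive count above the target. This is not the test
script that was used; the function names and the toy scores
are hypothetical, and I assume a message is called spam when
its score is at or above the cutoff.

```python
def cutoff_for_target(ham_scores, target_rate=0.0025):
    """Pick the cutoff that should flag target_rate of the ham as spam."""
    ranked = sorted(ham_scores, reverse=True)
    allowed = round(len(ranked) * target_rate)  # e.g. 1 of 400
    if allowed < 1:
        raise ValueError("too few ham messages for this target rate")
    # Score of the last ham message we are willing to misclassify.
    return ranked[allowed - 1]


def evaluate(ham_scores, spam_scores, target_rate=0.0025):
    """Return (cutoff, false_positives, false_negatives)."""
    cutoff = cutoff_for_target(ham_scores, target_rate)
    false_pos = sum(s >= cutoff for s in ham_scores)
    false_neg = sum(s < cutoff for s in spam_scores)
    return cutoff, false_pos, false_neg


# Toy data: 400 ham messages, three of them tied at the top score.
ham = [0.1] * 397 + [0.9, 0.9, 0.9]
spam = [0.99, 0.95, 0.9, 0.5]
cutoff, false_pos, false_neg = evaluate(ham, spam)
```

The target allows 1 false positive in 400 here, but the three
tied ham messages all sit at the cutoff, so the actual count
is 3. That is why the false positives should be reported next
to the false negatives.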