token pairs [was: Algorithm limitations]

David Relson relson at osagesoftware.com
Wed Apr 14 02:32:58 CEST 2004


On Tue, 13 Apr 2004 14:12:21 +0200
Boris 'pi' Piwinger wrote:

> David Relson wrote:
> 
> > I'm not willing to include word pairs until after the 1.0 release,
> > but am willing to let users experiment with the technique.  Attached
> > is a patch from a couple of months ago and updated to work with
> > 0.17.5. Below is a sample of the output using it:
> > 
> > [relson at osage src]$ echo this is a test of word pairs | bogofilter
> > -C -H-vvv
> 
> > [relson at osage src]$ echo this is a test of word pairs | bogofilter
> > -C -H-vvv -P
> 
> From that I understand that you need to pass -P to make use
> of the feature. Could you or someone else please give a
> brief explanation of which pairs are chosen? Is it only
> adjacent tokens (in your example the short words are not
> tokens) or can you jump over a word? The example output
> suggests that this does not happen.
> 
> Could you provide a config file option instead of -P?

The patch below will give you "word-pairs=yes/no" ...

--- bogoconfig.c	18 Mar 2004 21:05:56 -0000	1.170
+++ bogoconfig.c	14 Apr 2004 00:31:38 -0000
@@ -108,6 +108,7 @@
     { "use-syslog",			N, 0, 'l' },
     { "register-ham",			N, 0, 'n' },
     { "passthrough",			N, 0, 'p' },
+    { "word-pairs",			R, 0, 'P' },
     { "register-spam",			N, 0, 's' },
     { "update-as-classed",		N, 0, 'u' },
     { "timestamp-date",			N, 0, 'y' },
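
With the patch applied, the config file entry would presumably look like
the fragment below. This is only a sketch: the option name and the yes/no
values come from the message above, while the file location
(~/.bogofilter.cf) is an assumption about where the config file lives on
your system.

```
# ~/.bogofilter.cf  (path is an assumption; use your own config file)
# Requires the bogoconfig.c patch above; intended to replace -P on the
# command line.
word-pairs=yes
```

Setting the value to "no" (or omitting the line) should leave word pairs
disabled.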



