bogolexer

David Relson relson at osagesoftware.com
Tue Oct 29 21:11:38 CET 2002


The test program for bogofilter's lexer, i.e. lexertest.c and lexertest, 
has been renamed to bogolexer.c and bogolexer.

I have just implemented a "-p" (passthrough) option for bogolexer and a 
"-p" (probability) option for bogoutil.  Used together, they let you quickly 
and easily find the spamness, hamness, and spam probability of a set of words.

Here's a quick example:

*** command ***

     echo the quick brown fox jumped over the lazy dog | bogolexer -p |
          bogoutil -p -w /var/lib/bogofilter


*** output ***

                            spam   good   prob
     the                    4333  78312 0.298230
     quick                    70   1259 0.299248
     brown                    11    223 0.274765
     fox                       8     50 0.551347
     jumped                   10     64 0.545474
     over                    421   4450 0.420839
     the                    4333  78312 0.298230
     lazy                      3    138 0.143080
     dog                      10    135 0.362624

With judicious use of the sort and uniq commands, the output can be ordered 
alphabetically or by probability.  For example, adding " | sort +3n" (written 
" | sort -k4n" for a POSIX sort that no longer accepts the "+N" form) at the 
end of the previous command will display the words in order of increasing 
probability.
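
As a rough sketch of that post-processing, the snippet below feeds the rows 
from the example output through sort and uniq, so it can be tried without a 
bogofilter wordlist at hand.  The function names sample_output, 
by_probability, and unique_words are made up for illustration and are not 
part of bogofilter.  LC_ALL=C is set only so that the decimal point and the 
alphabetical order do not depend on the local locale.

```shell
# Rows copied from the example output above.  In real use, pipe the
# output of "bogolexer -p | bogoutil -p -w ..." in instead.
sample_output() {
cat <<'EOF'
     the                    4333  78312 0.298230
     quick                    70   1259 0.299248
     brown                    11    223 0.274765
     fox                       8     50 0.551347
     jumped                   10     64 0.545474
     over                    421   4450 0.420839
     the                    4333  78312 0.298230
     lazy                      3    138 0.143080
     dog                      10    135 0.362624
EOF
}

# Order rows by increasing probability: numeric sort on the 4th
# whitespace-separated field (hypothetical helper name).
by_probability() { LC_ALL=C sort -k4,4n; }

# Order rows alphabetically and drop exact repeats, such as the
# second "the" row (hypothetical helper name).
unique_words() { LC_ALL=C sort | uniq; }

sample_output | by_probability
sample_output | unique_words
```

Limiting the key to the fourth field ("-k4,4n") keeps the comparison on the 
prob column alone, and running sort before uniq matters because uniq only 
collapses adjacent identical lines.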

David




