bogolexer
David Relson
relson at osagesoftware.com
Tue Oct 29 21:11:38 CET 2002
The test program for bogofilter's lexer, i.e. lexertest.c and lexertest,
has been renamed to bogolexer.c and bogolexer.
I have just implemented a "-p" (passthrough) option for bogolexer and a
"-p" (probability) option for bogoutil. Used together, they let you quickly
and easily find the spamness, hamness, and spam probability of each word in
a set of words.
Here's a quick example:
*** command ***
echo the quick brown fox jumped over the lazy dog | bogolexer -p |
bogoutil -p -w /var/lib/bogofilter
*** output ***
         spam   good      prob
the      4333  78312  0.298230
quick      70   1259  0.299248
brown      11    223  0.274765
fox         8     50  0.551347
jumped     10     64  0.545474
over      421   4450  0.420839
the      4333  78312  0.298230
lazy        3    138  0.143080
dog        10    135  0.362624
With judicious use of the sort and uniq commands, the output can be ordered
alphabetically or by probability. For example, adding " | sort +3n" (a
numeric sort on the fourth field) at the end of the previous command will
display the words in order of increasing probability.
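
As a sketch of the sorting step, the snippet below applies it to the sample
rows from the output above, so it runs without bogolexer or bogoutil
installed. It assumes a POSIX shell and uses "sort -k4,4n", the POSIX key
syntax for the same fourth-field numeric sort as "+3n", since newer sort
implementations may not accept the old "+N" form. The variable names sample
and sorted are only illustrative, and "sort -u" stands in for a sort and uniq
pair to drop the repeated "the" line.

```shell
# Sample rows copied from the output above:
# word, spam count, good count, spam probability.
sample='the 4333 78312 0.298230
quick 70 1259 0.299248
brown 11 223 0.274765
fox 8 50 0.551347
jumped 10 64 0.545474
over 421 4450 0.420839
the 4333 78312 0.298230
lazy 3 138 0.143080
dog 10 135 0.362624'

# First pass: drop exact duplicate lines (same effect as sort | uniq).
# Second pass: order numerically on field 4 only, the probability column.
# LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -u | LC_ALL=C sort -k4,4n)

printf '%s\n' "$sorted"
```

With these rows, "lazy" comes out first and "fox" last. In a live pipeline
the same "sort -k4,4n" would simply replace "sort +3n" after bogoutil.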
David