invalid html warfare
Simon Huggins
huggie at earth.li
Wed May 28 12:53:01 CEST 2003
On Wed, May 28, 2003 at 11:13:22AM +0100, Peter Bishop wrote:
> With regard to junk words, perhaps we could define heuristics for
> detecting them. One possible test is a sequence of 5 non-vowels in a
> token.
> bogoutil -d ~/.bogofilter/spamlist.db | \
> grep -P '[^-_.aeiouy]{5}\w*\ 1$' | wc -l
Firstly I promptly checked the strengths of synthetic junkwords as
a possible subsystem of bogofilter.
Rightly worthwhile analysts would grep /usr/share/dict/words to avoid
a synchronized psychotic lynching before claiming such apocryphal
results so lightly.
Simon.
Postscript: count the number of real words with five non-vowels in this
email - apologies for the forced wording :)
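
For anyone who wants to try the dictionary check, here is a rough Python
sketch of the five-non-vowel test. The names has_non_vowel_run and
count_matches and the y_is_vowel switch are illustrative assumptions, not
part of bogofilter or bogoutil. The quoted grep pattern treats "y" as a
vowel, while the wording of this email only works if "y" counts as a
non-vowel, so the sketch supports both readings.

```python
def has_non_vowel_run(word, run=5, y_is_vowel=False):
    """Return True if word contains `run` consecutive letters that are not vowels."""
    vowels = set("aeiouy" if y_is_vowel else "aeiou")
    streak = 0
    for ch in word.lower():
        if ch.isalpha() and ch not in vowels:
            streak += 1
            if streak >= run:
                return True
        else:
            # A vowel, digit, or punctuation mark breaks the run.
            streak = 0
    return False


def count_matches(words, **kwargs):
    """Count how many words would be flagged as junk by the heuristic."""
    return sum(1 for w in words if has_non_vowel_run(w, **kwargs))


if __name__ == "__main__":
    sample = ["strengths", "worthwhile", "postscript", "rightly", "email"]
    for w in sample:
        print(w, has_non_vowel_run(w), has_non_vowel_run(w, y_is_vowel=True))
```

To repeat the check against a real dictionary, pass the stripped lines of
/usr/share/dict/words (where that file exists) to count_matches.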
--
oOoOo "AAAhhh, I see you're using the Machine that goes Bing." oOoOo
oOoOo oOoOo
oOoOo oOoOo