invalid html warfare
    Simon Huggins 
    huggie at earth.li
       
    Wed May 28 12:53:01 CEST 2003
    
    
  
On Wed, May 28, 2003 at 11:13:22AM +0100, Peter Bishop wrote:
> With regard to junk words, perhaps we could define heuristics for
> detecting them. One possible test is a sequence of 5 non-vowels in a
> token.
> bogoutil -d ~/.bogofilter/spamlist.db | \
> grep -P '[^-_.aeiouy]{5}\w* 1$' | wc -l
Firstly I promptly checked the strengths of synthetic junkwords as
a possible subsystem of bogofilter.
Rightly worthwhile analysts would grep /usr/share/dict/words to avoid
a synchronized psychotic lynching before claiming such apocryphal
results so lightly.
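
A minimal Python sketch of that dictionary check, for anyone who wants to run it. It reuses the character class from the quoted grep pattern (five consecutive characters that are not a vowel, '-', '_' or '.'). The function names and the `vowels` parameter are illustrative assumptions, not part of bogofilter, and /usr/share/dict/words may live elsewhere or be absent on some systems.

```python
import re


def flagged_words(words, vowels="aeiouy"):
    """Return the words containing five consecutive non-vowel characters.

    The default vowel set matches the quoted grep pattern, which treats
    'y' as a vowel. Pass vowels="aeiou" to count 'y' as a non-vowel.
    """
    # '-', '_' and '.' are excluded too, as in the quoted pattern.
    pattern = re.compile("[^-_." + re.escape(vowels) + "]{5}")
    return [w for w in words if pattern.search(w.lower())]


def flagged_in_file(path="/usr/share/dict/words"):
    """Apply the heuristic to a word list stored one word per line."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return flagged_words(line.strip() for line in fh)
```

Calling `len(flagged_in_file())` gives the number of dictionary words the heuristic would misclassify as junk. The result depends on whether 'y' counts as a vowel: "lightly" is flagged only with `vowels="aeiou"`, while "strengths" is flagged either way.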
Simon.
Postscript: count the number of real words with five non-vowels in this
email - apologies for the forced wording :)
-- 
oOoOo   "AAAhhh, I see you're using the Machine that goes Bing."   oOoOo
 oOoOo                                                            oOoOo
  oOoOo                                                          oOoOo
    
    