invalid html warfare

Simon Huggins huggie at earth.li
Wed May 28 12:58:19 CEST 2003


On Wed, May 28, 2003 at 11:53:01AM +0100, Simon Huggins wrote:
> On Wed, May 28, 2003 at 11:13:22AM +0100, Peter Bishop wrote:
> > With regard to junk words, perhaps we could define heuristics for
> > detecting them. One possible test is a sequence of 5 non-vowels in a
> > token.
> > bogoutil -d ~/.bogofilter/spamlist.db | \
> > grep -P '[^-_.aeiouy]{5}\w* 1$' | wc -l
                       ^
                       Oh bah, you included y.  Even with y left out too,
                       though, I still see 32 words in /usr/share/dict/words
                       on this Debian system that have a string of 5
                       non-vowels.

Simon.

-- 
oOoOo             "ACT NORMAL!  ACT NORMAL!!" - Homer              oOoOo
 oOoOo                                                            oOoOo
  oOoOo                                                          oOoOo




More information about the Bogofilter mailing list