odd missing word in R list

Scott Lenser slenser at cs.cmu.edu
Fri Dec 13 19:41:48 CET 2002


> > My bogofilter did not include the token "selected" in the R table.  I
> > was somewhat astonished, so I pulled cvs as of 1629 UTC today, compiled
> > it, ran ./bogofilter -R <file_with_just_this_message -- sure 'nuff, no
> > line for "selected."
> > 
> > Anybody know how come?
> 
> "selected" is one of the words that lexer.l purposely ignores.  i think this
> came from esr's version, he purposely ignored many words commonly found
> in html.  ("selected" can appear in <option> lines).
> 
> see lexer.l starting around line 80 for the list of words.  perhaps
> this should be documented?
> -- 
> Allyn Fratkin             allyn at fratkin.com
> Escondido, CA             http://www.fratkin.com/
> 

Unless something has changed, it gets even odder (this is from an old version, but I
think it is still true).  Words that contain an HTML tag word come out funny.

call => c (gets ignored, too short)
resample => re, ple => ple

etc.
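
To make the effect concrete, here is a rough Python sketch of what seems to be
happening. It is not bogofilter's lexer: the function name, the one-entry ignore
list (only "selected", the word confirmed above) and the minimum token length
of 3 are all assumptions made for illustration. The real list is in lexer.l.

```python
import re

# Assumption: only "selected" is listed here; the real list lives in lexer.l.
IGNORED = ["selected"]
# Assumption: tokens shorter than this are dropped as "too short".
MIN_LEN = 3

def surviving_tokens(word):
    """Split a word wherever an ignored word occurs, even mid-word,
    then drop the fragments that are too short to count as tokens."""
    pattern = "|".join(re.escape(w) for w in IGNORED)
    fragments = re.split(pattern, word)
    return [f for f in fragments if len(f) >= MIN_LEN]

for w in ("selected", "unselected", "preselected"):
    print(w, "=>", surviving_tokens(w))
```

If the ignored words were matched only as whole tokens rather than as
substrings, the longer words would pass through intact.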

- Scott





