Filters that Fight Back

Peter Bishop pgb at adelard.com
Mon Aug 11 16:08:46 CEST 2003


On 11 Aug 2003 at 15:16, Matthias Andree wrote:

> So should we drop the "minimum token size" limit to deal with " B R O K
> E N   U P " tokens?
> 

Or should the tokeniser treat a sequence of space-separated single letters
as a single token? For example:

B R O K E N   U P

is tokenised as:

B-R-O-K-E-N
U-P
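
A rough sketch of the idea in Python, just to make the rule concrete.
This is not bogofilter's lexer; the names SPACED_RUN, join_spaced_letters
and tokenise are made up for illustration. It assumes that a "run" means
single ASCII letters separated by exactly one space, and that two or more
spaces end the run:

```python
import re

# Assumption: a run is two or more single ASCII letters, each separated
# by exactly one space, with whitespace or the string boundary on both
# sides. Wider gaps (two or more spaces) end the run.
SPACED_RUN = re.compile(r"(?<!\S)[A-Za-z](?: [A-Za-z])+(?!\S)")


def join_spaced_letters(text):
    """Replace the single spaces inside each run with hyphens."""
    return SPACED_RUN.sub(lambda m: m.group(0).replace(" ", "-"), text)


def tokenise(text):
    """Join spaced-out letters first, then split on whitespace."""
    return join_spaced_letters(text).split()


print(tokenise("B R O K E N   U P"))
# ['B-R-O-K-E-N', 'U-P']
```

Ordinary words are left alone, since a run only matches when every
element is a single letter. The obvious weakness is that genuine
single-letter neighbours (e.g. "A I" in normal prose) would also be
joined, so a minimum run length might be needed.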
-- 
Peter Bishop 
pgb at adelard.com
pgb at csr.city.ac.uk
