Tom Anderson tanderso at
Fri Jul 9 14:10:15 CEST 2004

On Fri, 2004-07-09 at 04:12, Andreas Pardeike wrote:
> On 2004-07-09, at 08.58, Peter Bishop wrote:
> > Sure there are real words like that too
> > but if these are split consistently by bogofilter then
> > Mc and Donald
> > would be stored instead, so might be recognised OK
> >
> > - even better when token pairs/sequences
> > are looked for in later versions of bogofilter
> Then what happens tO tExT LiKe tHiS?

I'd imagine it would be ignored completely, since splitting on every
case change leaves only one- and two-letter fragments, which don't meet
the minimum token length.  That isn't actually a terrible outcome: the
text isn't very readable anyway, and there should be enough other tokens
to make the message score as spam.  Still, perhaps bogofilter could score
both ways, with and without breaking on the case changes.  But now we're
getting more complicated.
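
To make that concrete, here is a rough Python sketch of the two
tokenizations.  It is not bogofilter's lexer; the names split_on_case,
tokenize and tokens_both_ways are made up for illustration, and the
minimum length of 3 is an assumption.

```python
import re

# Assumed minimum token length, for illustration only.
MIN_TOKEN_LEN = 3


def split_on_case(word):
    """Split a word at each lowercase-to-uppercase transition."""
    # Insert a space between a lowercase letter and the uppercase
    # letter that follows it, then split on whitespace.
    return re.sub(r"([a-z])([A-Z])", r"\1 \2", word).split()


def tokenize(text, split_case):
    """Return whitespace-separated tokens that meet MIN_TOKEN_LEN.

    If split_case is true, each word is first broken at case changes.
    """
    tokens = []
    for word in text.split():
        pieces = split_on_case(word) if split_case else [word]
        tokens.extend(p for p in pieces if len(p) >= MIN_TOKEN_LEN)
    return tokens


def tokens_both_ways(text):
    """Combine unsplit tokens with case-split tokens, without duplicates."""
    combined = tokenize(text, split_case=False)
    for token in tokenize(text, split_case=True):
        if token not in combined:
            combined.append(token)
    return combined


if __name__ == "__main__":
    sample = "tO tExT LiKe tHiS"
    print(tokenize(sample, split_case=True))
    print(tokenize(sample, split_case=False))
    print(tokens_both_ways("McDonald"))
```

With these assumptions, the case-split pass yields nothing for the
alternating-case sample, while the unsplit pass still keeps the words of
three letters or more.  Note that "Mc" itself would also fall under a
three-character minimum, so only "Donald" survives from the split form.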


