degeneration [was: What's Coming ...]

David Relson relson at osagesoftware.com
Wed Nov 26 20:15:21 CET 2003


On Wed, 26 Nov 2003 17:51:06 -0000
"Peter Bishop" <pgb at adelard.com> wrote:

> Re degeneration, why not keep this as a standard feature?
> 
> This is what Paul Graham does. Degeneration is used as a fallback if
> the precise format of the word is not found. I would have thought this
> was particularly useful for setups where the database is only
> partially updated by incoming spam/ham (e.g. train on error and train
> on uncertain) as it is less likely that a specific word format will be
> stored in the database.
> 
> On 24 Nov 2003 at 19:59, Shawn Grunberger wrote:
> 
> > Number 4 in your remove list is "degeneration code for headers and
> > case sensitivity." 
> > 
> > I'm aware of the -H option, which provides a degeneration option for
> > header tagging. Was anything similar ever added for
> > case-sensitivity? I'm only aware of the -Pi/PI option, which doesn't
> > degenerate, right? 
> > 
> > We have hundreds of customers with case-insensitive databases that
> > can't be rebuilt, so ideally we'd use something like -H for case
> > that preserves accuracy during the transitional phase. 

Hi Peter,

My test of degeneration didn't show any measurable benefit.  I included
it anyway, on the basis of Paul Graham's article and because there were
a few requests for it.  Given that it was released 4 months ago, and
that its purpose was to aid the transition from case-insensitive to
case-sensitive databases, it seems to be an idea whose time has passed.
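
For anyone not familiar with the idea, the fallback Peter describes can
be sketched roughly as below.  This is only an illustration in Python,
not bogofilter's code: the "subj:" style tag prefix, the function names
and the plain dict standing in for the wordlist are all assumptions made
up for the example.

```python
def degenerate_forms(token):
    """Yield the token itself, then progressively less specific variants.

    Assumption for this sketch: a header tag is a prefix ending in ':'
    (e.g. "subj:FREE"), and case folding means str.lower().
    """
    forms = [token, token.lower()]
    if ":" in token:
        bare = token.split(":", 1)[1]      # drop the (hypothetical) header tag
        forms += [bare, bare.lower()]
    seen = set()
    for form in forms:
        if form and form not in seen:      # skip empty strings and duplicates
            seen.add(form)
            yield form


def lookup(token, wordlist):
    """Return (matched_form, counts) for the most specific form found.

    'wordlist' is a plain dict mapping token -> (spam_count, ham_count),
    standing in for the real database.  Returns (None, None) on a miss.
    """
    for form in degenerate_forms(token):
        if form in wordlist:
            return form, wordlist[form]
    return None, None


if __name__ == "__main__":
    # A wordlist built case-insensitively and without header tags.
    old_wordlist = {"free": (10, 1)}
    print(lookup("subj:FREE", old_wordlist))
    print(lookup("hello", old_wordlist))
```

The exact form is always tried first, so a fully retrained database
never reaches the fallback; it only matters while older entries are
still in use.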

David



