New (?) idea to optimize database
Boris 'pi' Piwinger
3.14 at piology.org
Sat Mar 18 16:04:54 CET 2006
Hi!
We had lengthy discussions about how to optimize (= minimize)
the database to get the best performance. This is why I created
bogominitrain. Clearly, it will still collect some useless
tokens. Here is an idea to improve on that:
Run bogominitrain, then remove all tokens which show up only
once in the training body (to find them, full training is needed
in a separate body). Also prevent those tokens from being added
again, and run bogominitrain again. Repeat until it has converged.
This is clearly extremely expensive, and I have no real idea how
to implement it, but it should give a really powerful database.
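
A minimal sketch of how the loop above might look, written in Python
with a toy train-on-error classifier standing in for bogofilter and
bogominitrain. Everything here is an assumption for illustration: the
function names (tokens_of, full_counts, classify, train_on_error,
prune_until_converged) are hypothetical, the scoring rule is a simple
count difference and not bogofilter's algorithm, and "shows up only
once" is read as "occurs once in a full count over the whole training
body".

```python
from collections import Counter


def tokens_of(text):
    # Toy tokenizer: lowercase, split on whitespace.
    return text.lower().split()


def full_counts(corpus):
    # "Full training in a separate body": count every token in every message.
    counts = Counter()
    for text, _ in corpus:
        counts.update(tokens_of(text))
    return counts


def classify(db, text):
    # Stand-in scoring rule (not bogofilter's): spam if the tokens were
    # seen more often in trained spam than in trained ham.
    score = sum(db["spam"][t] - db["ham"][t] for t in tokens_of(text))
    return score > 0


def train_on_error(corpus, banned, max_passes=10):
    # Train-on-error: only misclassified messages are added to the
    # database, and banned tokens are never added.
    db = {"spam": Counter(), "ham": Counter()}
    for _ in range(max_passes):
        errors = 0
        for text, is_spam in corpus:
            if classify(db, text) != is_spam:
                errors += 1
                kept = [t for t in tokens_of(text) if t not in banned]
                db["spam" if is_spam else "ham"].update(kept)
        if errors == 0:
            break
    return db


def prune_until_converged(corpus, max_rounds=20):
    # Retrain, ban database tokens that occur only once in the full
    # count, and repeat until a round adds no such token to the database.
    counts = full_counts(corpus)
    banned = set()
    db = {"spam": Counter(), "ham": Counter()}
    for _ in range(max_rounds):
        db = train_on_error(corpus, banned)
        in_db = set(db["spam"]) | set(db["ham"])
        singletons = {t for t in in_db if counts[t] == 1}
        if not singletons:
            break
        banned |= singletons
    return db, banned
```

Each round retrains from scratch, which is what makes the approach
expensive: the cost is roughly one full train-on-error run per round,
plus one full count over the training body up front.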
pi