suggestions/requests
David Relson
relson at osagesoftware.com
Thu Jan 23 01:39:10 CET 2003
At 06:16 PM 1/22/03, Matthias Andree wrote:
>We're currently folding everything to lower case, discarding information.
>
> > Use Judy arrays to track the count of all words in each document being
> > checked - at the end generate a pseudo word describing the repetitiveness
> > of the document.
>
>Judy is history for bogofilter. We're using Gyepi's wordhash function;
>it's more portable and has similar performance.
For bogofilter's needs the wordhashes perform _significantly_ better than
did Judy.
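
For anyone who wants to experiment with the per-document counting idea quoted above, here is a minimal sketch in C. It is not bogofilter's wordhash code and not Judy: it uses a small fixed-size hash table with linear probing, and every name in it (doc_counts, count_text, repetition_token, the "repeat:N" token format, the size limits) is made up for illustration. It lower-cases tokens, as the quoted text says bogofilter currently does, and treats "repetitiveness" as total words divided by distinct words, which is only one possible definition.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 1024  /* hypothetical cap on table slots per message */
#define MAX_LEN   32    /* hypothetical token size; longer words are truncated */

struct word_count {
    char word[MAX_LEN];
    unsigned count;     /* 0 marks an empty slot */
};

/* Must be zero-initialized (e.g. with memset) before first use. */
struct doc_counts {
    struct word_count slots[MAX_WORDS];
    unsigned distinct;  /* number of different words seen */
    unsigned total;     /* number of words seen, repeats included */
};

/* djb2-style string hash; any reasonable string hash would do here. */
unsigned hash_word(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Count one word. Returns 1 if counted, 0 if the word was dropped
 * (too long, or the table has no room for another distinct word). */
int count_word(struct doc_counts *dc, const char *word)
{
    unsigned i;

    if (strlen(word) >= MAX_LEN)
        return 0;
    i = hash_word(word) % MAX_WORDS;
    /* Linear probing: one slot is always kept empty, so this terminates. */
    while (dc->slots[i].count != 0 && strcmp(dc->slots[i].word, word) != 0)
        i = (i + 1) % MAX_WORDS;
    if (dc->slots[i].count == 0) {
        if (dc->distinct >= MAX_WORDS - 1)
            return 0;
        strcpy(dc->slots[i].word, word);
        dc->distinct++;
        assert(dc->distinct < MAX_WORDS);
    }
    dc->slots[i].count++;
    dc->total++;
    return 1;
}

/* Split text on non-alphanumeric characters, fold to lower case, count. */
void count_text(struct doc_counts *dc, const char *text)
{
    char tok[MAX_LEN];
    size_t n = 0;

    for (;; text++) {
        unsigned char c = (unsigned char)*text;
        if (isalnum(c)) {
            if (n < MAX_LEN - 1)
                tok[n++] = (char)tolower(c);
        } else {
            if (n > 0) {
                tok[n] = '\0';
                count_word(dc, tok);
                n = 0;
            }
            if (c == '\0')
                break;
        }
    }
}

/* Write a pseudo word such as "repeat:3" into buf. The digit is the
 * integer ratio total/distinct, capped at 9 (0 for an empty document). */
void repetition_token(const struct doc_counts *dc, char *buf, size_t len)
{
    unsigned ratio = dc->distinct ? dc->total / dc->distinct : 0;

    if (ratio > 9)
        ratio = 9;
    snprintf(buf, len, "repeat:%u", ratio);
}
```

With a zeroed struct doc_counts, calling count_text() on "Buy now! Buy NOW, buy now." gives total = 6 and distinct = 2, so repetition_token() produces "repeat:3". Capping the ratio keeps the number of possible pseudo words small, so each one would see enough messages to carry some weight; whether such a token actually helps classification is a separate question this sketch does not answer.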