Hapax survival over time

Tom Anderson tanderso at oac-design.com
Wed Mar 24 15:14:18 CET 2004


On Wed, 2004-03-24 at 02:17, Boris 'pi' Piwinger wrote:
> And then again put it in the database. This certainly gives
> a wrong value.

But I would expect it to be uninfluential either way.

> >How strong of an indicator could it be if it is seen so
> >infrequently?
> 
> A pretty high one. There are typical mails which don't show
> up most of the time. Examples are: Regular mailing list
> reminders, Easter greetings etc.

If you received a single spam (just one) with the word "Easter" in it
last year, I don't think it's worth keeping that token in your database
for the entire year.  However, if you received dozens, then it might be
worthwhile, but then it wouldn't be a hapax.
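
To make the rule concrete, here is a minimal sketch of aging out stale
hapaxes. It assumes a hypothetical in-memory word list keyed by token;
the names TokenStats and prune_stale_hapaxes, the 180-day cutoff, and
the sample counts are all made up for illustration and are not
bogofilter's actual storage format or API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class TokenStats:
    """Hypothetical per-token record: counts plus the date last seen."""
    spam_count: int
    ham_count: int
    last_seen: date


def prune_stale_hapaxes(wordlist, today, max_age_days=365):
    """Return a copy of wordlist without hapaxes older than max_age_days.

    A hapax here is a token seen exactly once across spam and ham.
    Tokens seen more than once are kept regardless of age.
    """
    cutoff = today - timedelta(days=max_age_days)
    kept = {}
    for token, stats in wordlist.items():
        is_hapax = stats.spam_count + stats.ham_count == 1
        if is_hapax and stats.last_seen < cutoff:
            continue  # seen once, long ago: drop it
        kept[token] = stats
    return kept


# Made-up sample data: one old hapax, one frequent token, one recent hapax.
wordlist = {
    "easter": TokenStats(1, 0, date(2003, 4, 20)),
    "viagra": TokenStats(37, 0, date(2003, 4, 20)),
    "meeting": TokenStats(0, 1, date(2004, 3, 20)),
}
pruned = prune_stale_hapaxes(wordlist, today=date(2004, 3, 24), max_age_days=180)
print(sorted(pruned))  # ['meeting', 'viagra']
```

The old single "easter" sighting is dropped, while a token seen dozens
of times survives no matter how old it is, since it is no longer a
hapax.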

Tom
