Database persistence

Christophe Journel christophe.journel at gmail.com
Tue Sep 19 20:10:42 CEST 2006


Hello.

I've got a question about database persistence.

Is it a good idea to erase old tokens that were added to the
database before a given date,
instead of using bogoutil to erase old tokens?


For instance, take the word "Hello".
Since 2003, I have registered this word as ham at least 150 000 times.
If this word starts showing up in spam tomorrow, I will certainly have to
wait a long time before a mail containing it is tagged as spam.
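
To put a rough number on that delay, here is a minimal sketch. It uses the raw spam share of a token's occurrences, spam / (ham + spam), as a stand-in. This is a simplification and not bogofilter's actual scoring, which also takes the total message counts into account. The function name and the 0.5 threshold are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import ceil


def extra_spam_needed(ham, spam, threshold=Fraction(1, 2)):
    """Smallest k such that (spam + k) / (ham + spam + k) >= threshold.

    Hypothetical helper: it looks only at the raw share of one token,
    and it requires 0 < threshold < 1.
    """
    t = Fraction(threshold)
    # Rearranged from spam + k >= t * (ham + spam + k).
    needed = (t * ham - (1 - t) * spam) / (1 - t)
    return max(ceil(needed), 0)


# 150 000 ham occurrences and no spam occurrences so far.
print(extra_spam_needed(150_000, 0))  # 150000
```

Under this simplified view, the token needs as many new spam occurrences as it has accumulated ham occurrences before its raw share reaches one half.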



However, would the following solution be useful?

I dump the database every three months
and compare the counts between, for example, June and now.

For a given word:

May
word: hello   150 000 ham - 20 000 spam

Now
word: hello   160 000 ham - 80 000 spam


diff:          10 000 ham - 60 000 spam

Maybe replacing the current counts with this difference would be effective?
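
The comparison described above can be sketched as follows. This is only an illustration of the idea: the snapshots are plain Python dictionaries mapping a token to a (ham, spam) pair, the names diff_counts, may and now are hypothetical, and reading a real database dump or writing the result back is left out.

```python
def diff_counts(old, new):
    """Return, per token, how much the (ham, spam) counts grew between
    an older snapshot and a newer one."""
    result = {}
    for token, (new_ham, new_spam) in new.items():
        # Tokens missing from the old snapshot start from zero.
        old_ham, old_spam = old.get(token, (0, 0))
        # Clamp at zero in case a count went down between snapshots.
        result[token] = (max(new_ham - old_ham, 0),
                         max(new_spam - old_spam, 0))
    return result


# Figures from the example above.
may = {"hello": (150_000, 20_000)}
now = {"hello": (160_000, 80_000)}

recent = diff_counts(may, now)
print(recent["hello"])  # (10000, 60000)
```

With these figures, the raw spam share of "hello" is 80 000 / 240 000 (about 33%) in the full counts, but 60 000 / 70 000 (about 86%) in the difference, which is why the difference would react faster to a recent change.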



Thx.

Christophe


