Database persistence
Christophe Journel
christophe.journel at gmail.com
Tue Sep 19 20:10:42 CEST 2006
Hello.
I have a question about database persistence.
Is it a good idea to erase old tokens that were added to the
database before a given date,
instead of using bogoutil to erase old tokens?
For instance, take the word "Hello".
Since 2003, I have registered this word as ham at least 150 000 times.
If this word starts showing up in spam tomorrow, I will certainly have to wait a long
time before a mail containing it is tagged as spam.
So, would the following solution be useful?
I dump the database every three months.
Then I compare the counts from two dumps, for example May and now.
For a given word:
May  : hello  150 000 ham - 20 000 spam
Now  : hello  160 000 ham - 80 000 spam
Diff : hello   10 000 ham - 60 000 spam
Maybe replacing the current figures with this difference would be effective?
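
To make the arithmetic concrete, here is a minimal Python sketch of the comparison. It is only an illustration: the dictionaries and the names snapshot_diff and spam_share are hypothetical, nothing here reads or writes a real bogofilter database, and the "spam share" is a plain ratio of counts, not the score bogofilter actually computes.

```python
def snapshot_diff(old, new):
    """Return per-token (ham, spam) counts added between two snapshots.

    Both arguments map token -> (ham_count, spam_count).
    Tokens missing from the old snapshot count as (0, 0).
    Negative differences are clamped to zero.
    """
    diff = {}
    for token, (new_ham, new_spam) in new.items():
        old_ham, old_spam = old.get(token, (0, 0))
        diff[token] = (max(new_ham - old_ham, 0), max(new_spam - old_spam, 0))
    return diff


def spam_share(ham, spam):
    """Fraction of registrations that were spam (raw ratio, for illustration)."""
    total = ham + spam
    return spam / total if total else 0.0


# Figures from the example above, as (ham, spam).
old = {"hello": (150_000, 20_000)}
new = {"hello": (160_000, 80_000)}

recent = snapshot_diff(old, new)
print(recent["hello"])                          # (10000, 60000)
print(round(spam_share(*new["hello"]), 2))      # 0.33 with the full history
print(round(spam_share(*recent["hello"]), 2))   # 0.86 with the difference only
```

With the full history the word still looks mostly like ham, while the difference alone shows it appearing mostly in spam during the latest period.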
Thx.
Christophe