training tactics

Boris 'pi' Piwinger 3.14 at piology.org
Thu May 19 13:56:31 CEST 2005


Kevin Williams said:
> My setup currently puts ham and unsure in the regular inbox and spam
> in the junk subfolder for each user.  Every day when I check my mail,
> the spam that gets through to the inbox is moved into the trash folder
> by me, manually through my email client.  And of course, the read mail
> goes into the trash.

So you put spam and ham in the trash folder.

> I have a cron event that retrains bogo every day at around 3am.
>
> What are the drawbacks of re-training bogofilter every day like I do?
> i.e. running with '-s < [junk folder]' and '-n < [trash folder]'.

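You train as spam any error that still sits (unnoticed) in the junk
folder, and you train as ham the trash folder, which contains, as you
explained above, both spam and ham. So ham misfiled as junk is registered
as spam, and the spam you moved to the trash is registered as ham.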

I don't understand why you want to train in the first place, but
that is a different story.

> I did notice something in the documentation about keyword ages stored
> in the db.  I understand that since I retrain every day, then the
> keywords would always be new(<24 hours).  However, I'm not sure if
> there is any drawback to this.

Only the keywords that were actually trained that day are fresh.

> If there are serious drawbacks to the way I do it now, what are some
> favorable bogo training scenarios as new spam and ham comes in?  I'd
> prefer to have something automated and something that is doable from
> within the popular mail readers for my users (outlook, horde webmail,
> any imap client really).

What you really need are folders that are certain (i.e., user-checked)
to contain only spam, and others that contain only ham. You can train on
those and then empty them (probably by moving the contents elsewhere).
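
A minimal sketch of that train-then-empty cycle, under some assumptions
that are not from the original setup: the checked folders are mbox files,
the helper name train_and_archive and all paths are hypothetical, and
bogofilter's -M (mbox input) option is used alongside -s / -n. No locking
is done, so it should only run when the mail client is not writing to
those folders.

```shell
#!/bin/sh
# Command to run; overridable so the sketch can be tried with a stub.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# train_and_archive FLAG MBOX ARCHIVE   (hypothetical helper)
#   FLAG    -s to register the folder as spam, -n to register it as ham
#   MBOX    user-checked folder that holds only one class of mail
#   ARCHIVE file that receives the messages once they have been trained
train_and_archive() {
    flag="$1"
    mbox="$2"
    archive="$3"

    # Nothing to do if the folder is missing or empty.
    [ -s "$mbox" ] || return 0

    # Register every message in the mbox; keep the folder on failure.
    "$BOGOFILTER" -M "$flag" < "$mbox" || return 1

    # Move the content away so the next run does not train it again.
    cat "$mbox" >> "$archive" && : > "$mbox"
}
```

From cron this could be called once per folder, for example
train_and_archive -s "$HOME/mail/spam-checked" "$HOME/mail/spam-done" and
train_and_archive -n "$HOME/mail/ham-checked" "$HOME/mail/ham-done".
Because each folder is emptied after a successful run, every message is
registered exactly once.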

pi


