Bogofilter and reclassifying

David Relson relson at osagesoftware.com
Fri Dec 5 13:59:51 CET 2003


On Fri, 5 Dec 2003 12:45:20 +0000
Stroller <Linux.Luser at myrealbox.com> wrote:

> 
> On Dec 5, 2003, at 9:44 am, Nathaniel wrote:
> >
> > Is this correct?  Most howtos I've seen either recreate a wordlist
> > or just mark as spam/ham the entire corpus, but I didn't want to
> > maintain large corpuses and wanted something fairly efficient...
> 
> Please find attached a shell script I'm currently working on - it
> scans Maildir folders for new messages & calls bogofilter to add their
> contents to the wordlist based on whether they're in a spam
> (~/.Maildir/.Junk.Definite) or ham folder. I hope you might find this
> useful - you could easily add a line to delete old messages once 
> they've been read - I intend to tar up the spam as part of the
> process, and to run this as part of a cron job. I'll have mailfilter
> set to drop messages with a high bogosity into a
> ~/.Maildir/.Junk.Probable folder, and so all I'll need to do is peruse
> them myself & drop them into ~/.Maildir/.Junk.Definite to confirm
> their spamicity & train Bogofilter further based on their contents.
> 
> I think you should find this script self-explanatory, but please feel 
> free to ask any questions, or since I'm new to shell programming, any 
> suggestions. This script works here, but the usual disclaimers 
> (provided as is, no warranty, back-up your data, yadayadayada) apply.
> 
> Stroller.

Hi Joe,

I took a quick look at your script and it looks fine.  It's nicely
readable and looks like it should work.  Good job.

I did find a couple of details that you might want to change.  First,
"-W" isn't needed as that's bogofilter's default mode.  Second, since
bogofilter doesn't presently care whether it's running with combined or
separate wordlists, those checks aren't necessary.  Of course the
separate wordlist code is scheduled to be deprecated in 0.16.0 and
removed in 0.17.0 so using a single, combined wordlist is the right
thing to do.  Lastly, the "exit 0" statements are unnecessary as they're
in places where nothing more happens in the script.  If they're deleted,
the code will just fall through to the end of the script and exit
normally.
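
To illustrate, a trimmed-down training script with those changes applied
might look roughly like the sketch below.  This is not the attached
script, which isn't reproduced here; the names BOGOFILTER, SPAM_DIR,
HAM_DIR and train are made up for the example, and the ham folder
location is an assumption.  It relies only on bogofilter's -s (register
as spam) and -n (register as non-spam) options, reading each message
from standard input.

```shell
#!/bin/sh
# Sketch only: register new Maildir messages with bogofilter.
# BOGOFILTER, SPAM_DIR and HAM_DIR are illustrative names (assumptions),
# not taken from the script discussed above.
BOGOFILTER=${BOGOFILTER:-bogofilter}
SPAM_DIR=${SPAM_DIR:-$HOME/.Maildir/.Junk.Definite}
HAM_DIR=${HAM_DIR:-$HOME/.Maildir}    # assumed location of the ham folder

# train FLAG DIR: feed every message in DIR/new to bogofilter with FLAG
# (-s registers the message as spam, -n as non-spam).
train() {
    flag=$1
    dir=$2
    for msg in "$dir"/new/*; do
        [ -f "$msg" ] || continue     # glob matched nothing, or not a file
        "$BOGOFILTER" "$flag" < "$msg"
    done
}

# No -W option and no combined/separate wordlist checks.
train -s "$SPAM_DIR"
train -n "$HAM_DIR"
# No "exit 0": the script simply ends here.
```

Note that the sketch doesn't keep track of which messages have already
been registered, so running it twice over the same folder would count
those messages twice.  A real script would need to move or delete each
message after processing it.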

David



