understanding bogofilter

Greg Louis glouis at dynamicro.on.ca
Tue May 6 16:36:40 CEST 2003

On 20030505 (Mon) at 2143:21 -0400, David Relson wrote:
> At 09:01 PM 5/5/03, Jon Reynolds wrote:
> >I have been reading everything I can about bogofilter and how to set it
> >up. I want to be able to use it at the server level and train it from
> >there.
> >
> >1. I have virtual domains (VMailMgr+qmail+squirrelmail) and am wondering
> >if one setup using spam and ham from all domains in one corpus would
> >work for all domains or does there have to be a separate corpus for each
> >domain?
> It can be done.  However, since each user has his own definition of spam, a
> single wordlist probably won't work well.

Seconded.  The ideal is for each user to have an individual corpus;
however, bogofilter can be made to work quite well for a group of users
with more or less common interests (one domain, typically).  Once you
get into highly disparate user groups (multiple domains) you're likely
to need at least one training db each -- that's what I found, anyway.

> >4. I want to set up a new mailbox called "spam" and have my users
> >forward their spam to that address. Is it ok to forward mail or does it
> >have to be bounced?
> It's O.K., though forwarding will change the headers some.

Be careful when training.  Users who forward may find that their
legitimate outgoing mail starts to get flagged as spam, since every
forwarded message carries their own address and headers into the spam
list.  Balance their forwarded messages by entering mails from them in
the goodlist.  (Happened to one of my users; easy to cure, but exciting
while it lasted -- he was Da Boss;)

> >6. And finally... If I do set it up for server level filtering will I
> >have to check for false positives for everyone personally or is it still
> >sent to the user?
> Again, that's the job of the MDA.  One idea is to deliver spam and let 
> users filter it out (for which "X-Bogosity: Yes" works nicely).  Another 
> idea is to quarantine spam and give users an opportunity to view From and 
> Subject information.
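
For what it's worth, the "deliver and let users filter" idea boils down
to a header test.  A minimal sketch in Python, assuming the header value
begins with "Yes" for spam as quoted above; the function name
is_flagged_spam is made up for illustration and is not part of
bogofilter:

```python
import email


def is_flagged_spam(raw_message):
    """Return True if the message carries an X-Bogosity header starting with 'Yes'.

    Assumption: the filter has already added the header upstream, and its
    value begins with 'Yes' for spam (anything after that is ignored).
    """
    msg = email.message_from_string(raw_message)
    value = msg.get("X-Bogosity", "")
    return value.strip().lower().startswith("yes")
```

A mail client rule or MDA recipe would do the same comparison; the
point is only that the user-side filter needs nothing beyond that one
header.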

Just as a suggestion: I was checking personally, but it got to be much
too big a job, so I created individual spam quarantines, wrote a cron
job to run every night that purges messages more than 7 days old from
each spam quarantine mailbox, and told the users they could check for
false positives themselves.  (This also provides them with an incentive
to use the nonspam account :)
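
A rough sketch of what such a nightly purge could look like, in Python.
This is not the script I actually run; it assumes each quarantine is a
Maildir (usual with qmail) and treats the file's delivery time as the
message age.  The function name and the "now" parameter are made up for
illustration:

```python
import mailbox
import time

MAX_AGE_DAYS = 7  # retention period from the description above


def purge_old_messages(maildir_path, max_age_days=MAX_AGE_DAYS, now=None):
    """Delete messages older than max_age_days from one Maildir.

    Returns the number of messages removed.  'now' can be passed in
    (seconds since the epoch) to make the cutoff reproducible.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    box = mailbox.Maildir(maildir_path, create=False)
    removed = 0
    try:
        # Copy the keys first so we can delete while looping.
        for key in list(box.keys()):
            msg = box.get_message(key)
            # For Maildir, get_date() reflects the message file's mtime.
            if msg.get_date() < cutoff:
                box.remove(key)
                removed += 1
    finally:
        box.close()
    return removed
```

A cron entry would call this once per user quarantine each night; where
those Maildirs live depends entirely on the local setup.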

| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |
