Naive Bayes classifier derived from bogofilter-0.7

Scott Lenser slenser at cs.cmu.edu
Thu Dec 5 02:25:37 CET 2002


> Scott Lenser <slenser at cs.cmu.edu> writes:
> 
> > I get a lot of the stupid gmime-WARNING messages as well.  I usually just
> > redirect stderr to /dev/null to ignore them.
> 
> Are any of these hints to serious trouble? These glib/gtk+/gmime console
> output shattering warnings would IMHO preclude this software from being
> used. We cannot afford libraries tampering with our output, and
> libraries should not print unless explicitly requested.
> 

I believe the messages go to stderr by default and can be redirected.
I just didn't bother sending them anywhere else.

> > if the email includes quoted parts that used to be mime encodings, I'll end
> > up encoding a whole bunch of "words" out of the base64 encoded cruft.  Basically
> > messages like
> >
> >> <base64 stuff>
> >> <base64 stuff>
> >> <base64 stuff>
> >
> > will cause it to take a long time on that message.  I've never seen it not terminate
> > but sometimes it takes a long time.  You should be able to fix that particular
> > problem by putting in a base64 encoding filter in lexer_text_plain.l and lexer_text_html.l.
> > The current one in bogofilter-0.9 is suitable if you remove the ^ from the beginning
> > (and maybe the $ from the end but probably not needed).
> 
> So if we have a performance issue, then gmime is the wrong library to
> use. I still haven't looked at the other code, too busy.
> 
> -- 
> Matthias Andree
> 

I think this isn't a problem with libgmime but rather with my code.  The issue is that I removed
the base64 filter because gmime usually parses all of this stuff.  The problem is that some
mailers quote mime messages incorrectly, so the base64 stuff comes through anyway.  The
slow part is adding the counts for all of the little tokens in the long base64 section.
I have since added a base64 filter back in to remove this stuff, and that fixes the performance
problem.
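
To illustrate the idea (this is not the actual flex rule from the lexer files,
just a rough Python sketch), a filter along these lines drops lines that look
like base64 even when they carry quote markers in front, which is the case the
^ anchor would miss.  The names BASE64_LINE, MIN_RUN, strip_base64_lines and
count_tokens are made up for the example, and the 32-character threshold is an
arbitrary assumption.

```python
import re
from collections import Counter

# Assumption for this sketch: a line is "base64 cruft" if, after optional
# quote markers such as "> " or ">> ", it is one long run of base64
# alphabet characters with at most two '=' padding characters.
MIN_RUN = 32  # arbitrary threshold, not taken from bogofilter
BASE64_LINE = re.compile(r"^[>\s]*[A-Za-z0-9+/]{%d,}={0,2}\s*$" % MIN_RUN)


def strip_base64_lines(text):
    """Return text without the lines that look like (quoted) base64."""
    return "\n".join(
        line for line in text.splitlines() if not BASE64_LINE.match(line)
    )


def count_tokens(text):
    """Toy tokenizer: count runs of letters and digits."""
    return Counter(re.findall(r"[A-Za-z0-9]+", text))
```

Running strip_base64_lines over the body before count_tokens means the long
quoted base64 section never reaches the counting step, which is where the time
was going.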

- Scott





