libgmime [was Re: lexer, tokens, and content-types]

Scott Lenser slenser at cs.cmu.edu
Tue Dec 10 20:12:50 CET 2002


> How much ballast (that we don't need, like encoding) do we buy if we use
> it? Can it be compiled as shared library?
> 

The ballast is basically the ability to create MIME messages, which we obviously
don't need.

It works as a shared library.

> > The following types of encoding are handled:
> > 
> > 7bit
> > 8bit
> > binary
> > base64
> > quoted printable
> 
> fine
> 
> > uuencode
> 
> we don't need that. text will not be uuencoded because it would be
> unreadable for netscape then.
> 

I don't understand why we don't need this, but I can't find a single
uuencoded spam, so I'll assume you are right.
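
For reference, here is a rough sketch of what handling the encodings listed
above amounts to. It is written in Python for brevity and uses only the Python
standard library; it is not libgmime's API, and the function name decode_body
is made up for illustration. uuencode is deliberately left unsupported, in
line with the discussion above.

```python
import base64
import quopri


def decode_body(payload: bytes, encoding: str) -> bytes:
    """Decode a MIME body according to its Content-Transfer-Encoding.

    Hypothetical helper for illustration only; not part of libgmime.
    """
    enc = encoding.strip().lower()
    if enc in ("", "7bit", "8bit", "binary"):
        # Identity encodings: the bytes are already the content.
        return payload
    if enc == "base64":
        # Line breaks inside the payload are ignored by the decoder.
        return base64.b64decode(payload)
    if enc == "quoted-printable":
        return quopri.decodestring(payload)
    # Anything else (including uuencode) is rejected rather than guessed at.
    raise ValueError("unsupported transfer encoding: " + encoding)
```

A caller would pass the raw body and the value of the Content-Transfer-Encoding
header, for example decode_body(body, "base64"), and tokenize the result.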

> > Support for a large number of RFCs.  From the libgmime README:
> 
> Oh, well. No offense, but when I look at what RFC compliance qmail
> claims and then violates (974, 1652), I'd rather not bet on those
> claims.
> 

Point taken.  I know next to nothing about the RFCs, so I just passed
along the information I had.

> > Additional features:
> > 
> > error output redirection (uses glib for this, I don't know how it works
> >   exactly)
> 
> We don't want libraries to clutter our output. I've seen dozens of cast
> warnings from GTK+ applications, and from what you quoted, I fear we're
> getting the same junk from gmime applications. I'm loath to link _my_
> software against libraries that write random junk to a random output
> channel. For production versions, we want to redirect that junk to
> /dev/null or prevent it from being printed unless the user requests
> debug mode.
> 

Obviously, it is a problem if we can't redirect these messages.  I don't think
this will be too hard to do.  How about this: I'll look into it and find out
whether it is a problem or not.
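
If the library's own redirection turns out not to be enough, one fallback that
does not depend on how glib routes its messages is to redirect the stderr file
descriptor itself. Below is a minimal sketch of that idea, in Python for
brevity; the name redirect_stderr_fd and the debug flag are hypothetical, and
this is not how libgmime or glib do it.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def redirect_stderr_fd(target_path=os.devnull, debug=False):
    """Send everything written to file descriptor 2 to target_path.

    Works at the descriptor level, so it also catches output written by
    C libraries, not only writes made through Python's sys.stderr.
    Hypothetical helper for illustration only.
    """
    if debug:
        # Debug mode: leave stderr alone so the messages stay visible.
        yield
        return
    sys.stderr.flush()
    target_fd = os.open(target_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    saved_fd = os.dup(2)  # remember the real stderr
    try:
        os.dup2(target_fd, 2)  # fd 2 now points at target_path
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved_fd, 2)  # restore the real stderr
        os.close(saved_fd)
        os.close(target_fd)
```

Wrapping the parsing call in "with redirect_stderr_fd():" would discard the
noise in production, while passing debug=True leaves it visible.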

> > parses from, to, etc lines into list of email addresses.  Parses the email
> >   addresses into name sections and email address sections
> > iconv integration.  Supports using iconv to convert character sets.  I
> >   haven't looked at this much
> 
> The iconv part might be interesting; we don't need address parsing now
> however.
> 

I use the address parsing to split the To and Cc lines into separate addresses
and to split the names from the email addresses.  This lets me easily make tokens
like "TO:E:slenser at cs.cmu.edu" and "TO:N:Scott Lenser", which are indicative of ham
for me.  I've also found that people I don't know whose email addresses are
alphabetically close to mine, and who also get spam, make great spam features.
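
To make the token scheme concrete, here is a small sketch of the idea. It uses
Python's standard email.utils.getaddresses as a stand-in for the libgmime
address parser, so it shows the shape of the tokens, not the actual code; the
function name address_tokens and the example.org addresses are made up.

```python
from email.utils import getaddresses


def address_tokens(header_name: str, header_value: str) -> list[str]:
    """Turn one address header into HEADER:E:<address> and HEADER:N:<name> tokens.

    Hypothetical helper for illustration only; not part of libgmime.
    """
    prefix = header_name.upper()
    tokens = []
    # getaddresses splits the header into (display name, address) pairs.
    for name, addr in getaddresses([header_value]):
        if addr:
            tokens.append(f"{prefix}:E:{addr.lower()}")
        if name:
            tokens.append(f"{prefix}:N:{name}")
    return tokens


if __name__ == "__main__":
    for token in address_tokens(
        "To", "Scott Lenser <slenser@example.org>, bob@example.org"
    ):
        print(token)
```

Each recipient yields an address token, and a name token when a display name
is present, so unknown co-recipients show up as features of their own.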

> Thanks for your having a look.
> 
> -- 
> Matthias Andree

No problem.  I had to do something; spam was getting through ;-).

- Scott

More information about the bogofilter-dev mailing list