decoding implementation

Gyepi SAM gyepi at praxis-sw.com
Sat Nov 23 06:13:56 CET 2002


I am looking for suggestions on adding base64 and quoted-printable
decoding to bogofilter.

There are two issues I'd like to discuss:

1. Data representation. Should we modify the [Content-Transfer-Encoding]
   headers of a message after it has been decoded (for consistency and
   truthfulness), or should we leave the headers alone (to preserve
   information)?

2. Data flow. We need to decode the email without necessarily reading the
   entire message into memory (I know -p already does that). The options
   include:

 a. Decode the data into a temporary file, rewind it, and pass the file
    handle to lexer.c.
 b. fork() and use pipe() to connect the stdin and stdout of the
    components. Conceivably, we'd have a pipeline equivalent to
    (cat mail.txt | base64decode | qpdecode | bogofilter).
 c. Write small programs to implement the actual pipeline.
 d. Use coroutines.
 Others?

Case a: slow but reliable (we have to be careful about file permissions
        and race conditions).
Case b: OK.
Case c: more Unix-like, and a simpler extension of case b.
Case d: more elegant but harder.
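
For comparison, case a might look roughly like the sketch below. It uses the standard C tmpfile() call, which returns an already-open stream that is removed automatically when it is closed, so the code never handles a file name or chooses permissions itself; that should sidestep most of the race-condition worry, though this is an assumption worth checking on each target platform. The names hex_value, qp_decode and decode_to_tmpfile are invented for illustration. The decoder handles only "=XX" escapes and soft line breaks, and does not strip trailing whitespace as a complete quoted-printable decoder would.

```c
#include <assert.h>
#include <stdio.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit (or is EOF). */
int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode quoted-printable data from `in` to `out` one byte at a time.
 * Malformed escapes are passed through unchanged. */
void qp_decode(FILE *in, FILE *out)
{
    int c, c1, c2;

    while ((c = getc(in)) != EOF) {
        if (c != '=') {
            putc(c, out);
            continue;
        }
        c1 = getc(in);
        if (c1 == '\n')
            continue;                   /* soft line break "=\n" */
        if (c1 == '\r') {
            c2 = getc(in);
            if (c2 == '\n')
                continue;               /* soft line break "=\r\n" */
            putc('=', out);
            putc('\r', out);
            if (c2 != EOF)
                ungetc(c2, in);
            continue;
        }
        if (hex_value(c1) < 0) {        /* not an escape: keep the '=' */
            putc('=', out);
            if (c1 != EOF)
                ungetc(c1, in);
            continue;
        }
        c2 = getc(in);
        if (hex_value(c2) < 0) {        /* truncated escape */
            putc('=', out);
            putc(c1, out);
            if (c2 != EOF)
                ungetc(c2, in);
            continue;
        }
        putc(hex_value(c1) * 16 + hex_value(c2), out);
    }
}

/* Decode `in` into an anonymous temporary file and rewind it, so the
 * result can be read from the start. Returns NULL on failure. */
FILE *decode_to_tmpfile(FILE *in)
{
    FILE *tmp;

    assert(in != NULL);
    tmp = tmpfile();
    if (tmp == NULL)
        return NULL;
    qp_decode(in, tmp);
    if (fflush(tmp) == EOF || ferror(tmp)) {
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);
    return tmp;
}
```

The stream returned by decode_to_tmpfile would then be passed to the lexer instead of the original input. The extra write and re-read of the whole message is where the "slow" in case a comes from.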

Comments, votes, discussion?

-Gyepi


