Please review batch mode patch from Mike Tillberg, SF #648494

Sat Dec 7 05:37:09 CET 2002

Matthias,

A couple of quick reactions to batch processing ...

First, we just removed code that (incorrectly) complained of multiple 
messages in a file when a second "^From " line was encountered in the 
message body.  If the input file has its body "^From " lines properly
escaped, batch processing should work.  If the lines aren't escaped, each
stray "From " line will look like the start of a new message, and the code
will classify more parts than it should.
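
To make the escaping point concrete, here is a rough sketch (not bogofilter
code; is_from_line and count_messages are made-up names) that counts message
boundaries the way a naive splitter would.  It assumes the whole file is in
one NUL-terminated buffer and that escaped body lines start with ">From ".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if this line begins with the mbox
 * separator "From " (the "^From " pattern discussed above). */
int is_from_line(const char *line)
{
    return strncmp(line, "From ", 5) == 0;
}

/* Hypothetical helper: count the messages a naive splitter would see in a
 * NUL-terminated buffer, i.e. the number of lines starting with "From ". */
size_t count_messages(const char *buf)
{
    size_t count = 0;
    const char *p = buf;

    while (*p != '\0') {
        const char *nl;

        if (is_from_line(p))
            count++;

        /* Advance to the start of the next line, if there is one. */
        nl = strchr(p, '\n');
        if (nl == NULL)
            break;
        p = nl + 1;
    }
    return count;
}
```

With a body line escaped as ">From here on", a single message counts as one.
With the same line left as "From here on", the splitter sees two.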

Second, the current passthrough code stores the message in textend structs, 
and never frees it.  The memory is left to the operating system to clean 
up.  The new code does more of the same.  Do we accept the memory usage or not?
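
If we decide the memory usage is not acceptable, freeing the saved text per
message could look roughly like the sketch below.  This is an illustration
only: struct textlist, textlist_append and textlist_reset are names I made
up, not the ones in lexer.l, and the real structs may hold different fields.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node holding one chunk of the original message text. */
struct textblock {
    char *data;
    size_t len;
    struct textblock *next;
};

/* Hypothetical list header; the real code may track this differently. */
struct textlist {
    struct textblock *head;
    struct textblock *tail;
    size_t count;
};

/* Copy len bytes from buf into a new block at the end of the list.
 * Returns 0 on success, -1 if allocation fails. */
int textlist_append(struct textlist *l, const char *buf, size_t len)
{
    struct textblock *b = malloc(sizeof *b);

    if (b == NULL)
        return -1;
    b->data = malloc(len ? len : 1);    /* avoid a zero-size allocation */
    if (b->data == NULL) {
        free(b);
        return -1;
    }
    memcpy(b->data, buf, len);
    b->len = len;
    b->next = NULL;

    if (l->tail != NULL)
        l->tail->next = b;
    else
        l->head = b;
    l->tail = b;
    l->count++;
    return 0;
}

/* Free every block and leave the list empty, ready for the next message. */
void textlist_reset(struct textlist *l)
{
    struct textblock *b = l->head;

    while (b != NULL) {
        struct textblock *next = b->next;

        free(b->data);
        free(b);
        b = next;
    }
    l->head = NULL;
    l->tail = NULL;
    l->count = 0;
}
```

Calling the reset after each message is written out would keep the peak
memory at roughly one message instead of the whole input file.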

Third, do we want this feature?  Is there a need/demand?

I'll look at the patch more this weekend, time permitting.

David

At 10:52 PM 12/6/02, Matthias Andree wrote:
>Hi,
>
>find attached Mike Tillberg's patch rediffed to unified diff and
>adjusted to current CVS. Please comment. Original submission is at
>http://sourceforge.net/tracker/index.php?func=detail&aid=648494&group_id=62265&atid=499999
>
>His original description from the tracker:
>
>| This patch allows bogofilter to process multiple
>| messages in -p mode. I haven't tested it in -u mode,
>| but since it goes through all the DB access routines
>| for each message, it should work. It's a quick hack,
>| and I haven't thought through any implications, so use at your
>| own risk. The patch makes two changes:
>|
>| bogofilter() takes a boolean pointer to return the
>| continuation bool from collect_words, and main wraps
>| everything after initialization in a do while loop.
>|
>| lexer.l provides a reset function that clears out the
>| textblock list it maintains to output the original
>| message. This prevents the messages from piling up and
>| being output multiple times.
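
For reference while reviewing, the control flow described above can be
sketched roughly as follows.  This is not Mike's patch: process_message is a
hypothetical stand-in for bogofilter() that works on an in-memory buffer,
and run_batch stands in for the loop in main.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bogofilter(): consume one message starting at
 * *cursor and report through *more whether another message follows.
 * A following message is assumed to begin with "From " at line start. */
void process_message(const char **cursor, bool *more)
{
    const char *p = *cursor;
    const char *next = strstr(p, "\nFrom ");

    if (next != NULL) {
        *cursor = next + 1;         /* start of the next "From " line */
        *more = true;
    } else {
        *cursor = p + strlen(p);    /* consumed the rest of the input */
        *more = false;
    }
}

/* Hypothetical stand-in for the loop in main: process messages until the
 * continuation flag says there is nothing left.  Returns how many times
 * the loop body ran. */
size_t run_batch(const char *buf)
{
    const char *cursor = buf;
    bool more = false;
    size_t processed = 0;

    do {
        process_message(&cursor, &more);
        processed++;
        /* A per-message reset of the saved text would go here. */
    } while (more);

    return processed;
}
```

Note that a do-while body always runs at least once, so this sketch counts
one pass even for empty input.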