What has become of buff and word and fgetsl?

Tue Feb 25 02:17:51 CET 2003

David Relson <relson at osagesoftware.com> writes:

> Bogofilter has a need for adding new text to a partially filled
> buffer. That ability is used in get_decoded_line() and in
> process_html_comments().  The buff_shrink() and buff_expand() routines
> are used to change pointers to allow this to happen.

But these calls have to come in matched pairs.

> I think that it can be coded differently by storing additional
> information in the buff_t struct.  The "read" variable is a start
> towards that goal.  However it is not yet implemented.

It's a relief to read that.
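
A minimal sketch of how that could look, assuming "read" is meant to
record the offset at which the most recently added text starts. The
flat struct layout, the "size" field and the buff_add() name are
hypothetical and chosen for brevity; they are not the actual bogofilter
code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the real buff_t may look different. */
typedef struct {
    unsigned char *text;  /* start of the buffer, never moved */
    size_t leng;          /* number of valid bytes in text */
    size_t read;          /* offset where the latest addition starts */
    size_t size;          /* total capacity of text */
} buff_t;

/* Append up to n bytes from src to a partially filled buffer.
 * Returns the number of bytes actually copied (less than n if the
 * buffer is too small).  The text pointer itself is left alone, so
 * there is no shrink/expand pair to keep balanced. */
static size_t buff_add(buff_t *b, const void *src, size_t n)
{
    size_t room;

    assert(b != NULL && b->leng <= b->size);
    room = b->size - b->leng;
    if (n > room)
        n = room;

    b->read = b->leng;                 /* remember where the new text begins */
    memcpy(b->text + b->leng, src, n);
    b->leng += n;
    return n;
}
```

A caller that only wants the newly added text would then look at the
range from text + read up to text + leng.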

> Also there are times when bogofilter has a buff_t and calls a function
> that needs a word_t.  Rather than create a new word_t from the text and
> leng fields of the buff_t, I found it expedient to include a word_t
> within the buff_t.  From an object oriented point of view, the word_t is
> the super class and the buff_t is a subclass.

That's fine.
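
For illustration, embedding could look roughly like the sketch below.
The field names text and leng come from the description above; the
member name "t", the "size" field, and the helpers buff_init() and
word_equals() are assumptions made for this example, not the actual
bogofilter definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts: the real bogofilter structs may differ. */
typedef struct {
    size_t leng;           /* number of valid bytes in text */
    unsigned char *text;   /* not necessarily NUL-terminated */
} word_t;

typedef struct {
    word_t t;              /* embedded word_t, the "super class" part */
    size_t size;           /* capacity of t.text */
} buff_t;

/* A function that knows only about word_t. */
static int word_equals(const word_t *w, const char *s)
{
    size_t n = strlen(s);

    assert(w != NULL);
    return w->leng == n && memcmp(w->text, s, n) == 0;
}

/* Set up a buff_t over caller-owned storage holding `used` valid bytes. */
static void buff_init(buff_t *b, unsigned char *storage,
                      size_t size, size_t used)
{
    assert(b != NULL && used <= size);
    b->t.text = storage;
    b->t.leng = used;
    b->size = size;
}
```

With a buff_t named buf, a word_t consumer is then called as
word_equals(&buf.t, "hello"), with no temporary word_t to build.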

>>Note that fgetsl wouldn't have to be aware of buff at all, a wrapper
>>could take care of that, but I'm not separating the buff out now.
>
> Providing a more uniform interface using buff_t and word_t lessens
> the need for wrappers.  That's a good thing (TM).

As long as it's understandable, yes.

>>My fix doesn't really speed things up though :-(
>
> The fgetsl() fix forced the lexer to process the _whole_ file.  Without
> the fix, the lexer was processing approx 20K of 100K.  As the lexer's
> time for processing lots of characters doesn't seem to be linear, the
> fgetsl() fix caused the time to process the test file to increase more
> than fivefold.

Yup. I wonder whether the issue here is memory allocation or backing
up in the scanner.

-- 
Matthias Andree