What has become of buff and word and fgetsl?
Matthias Andree
matthias.andree at gmx.de
Tue Feb 25 02:17:51 CET 2003
David Relson <relson at osagesoftware.com> writes:
> Bogofilter has a need for adding new text to a partially filled
> buffer. That ability is used in get_decoded_line() and in
> process_html_comments(). The buff_shrink() and buff_expand() routines
> are used to change pointers to allow this to happen.
But these come in pairs.
> I think that it can be coded differently by storing additional
> information in the buff_t struct. The "read" variable is a start
> towards that goal. However it is not yet implemented.
It's a relief to read that.
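To make the idea concrete, here is a minimal sketch of appending to a partially filled buffer by tracking an offset rather than calling paired shrink/expand routines. The struct `appendbuf`, its field layout, and `appendbuf_add()` are hypothetical names for illustration only; they are not bogofilter's actual buff_t or API, and the meaning given to `read` here is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for buff_t; names and layout are assumptions. */
struct appendbuf {
    unsigned char *text; /* start of caller-owned storage */
    size_t leng;         /* number of valid bytes in text */
    size_t read;         /* offset where the latest chunk begins (assumed meaning) */
    size_t size;         /* total capacity of text */
};

/* Append up to n bytes after the existing content.
 * Returns the number of bytes actually copied, which is less than n
 * when the buffer runs out of room. */
static size_t appendbuf_add(struct appendbuf *b, const void *src, size_t n)
{
    assert(b->leng <= b->size);

    size_t room = b->size - b->leng;
    if (n > room)
        n = room;

    b->read = b->leng;                  /* remember where the new data starts */
    memcpy(b->text + b->leng, src, n);
    b->leng += n;
    return n;
}
```

Because the start pointer never moves, there is no state that a caller has to remember to restore, which is the hazard with calls that must come in pairs.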
> Also there are times when bogofilter has a buff_t and calls a function
> that needs a word_t. Rather than create a new word_t from the text and
> leng fields of the buff_t, I found it expedient to include a word_t
> within the buff_t. From an object-oriented point of view, the word_t is
> the superclass and the buff_t is a subclass.
That's fine.
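For readers unfamiliar with the idiom, a rough sketch of the embedding follows. The struct layouts are assumptions modelled on the "text" and "leng" fields mentioned above, and `word_count_byte()` and `buff_init()` are hypothetical helpers, not bogofilter functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layouts for illustration; the real definitions may differ. */
typedef struct {
    size_t leng;          /* number of bytes in text */
    unsigned char *text;  /* not necessarily NUL-terminated */
} word_t;

typedef struct {
    word_t t;     /* embedded first, so a buff_t can stand in for a word_t */
    size_t read;  /* extra bookkeeping that only a buffer needs */
    size_t size;  /* capacity of t.text */
} buff_t;

/* Hypothetical function that only knows about word_t. */
static size_t word_count_byte(const word_t *w, unsigned char c)
{
    size_t n = 0;
    for (size_t i = 0; i < w->leng; i++)
        if (w->text[i] == c)
            n++;
    return n;
}

/* Hypothetical initialiser over caller-owned storage. */
static void buff_init(buff_t *b, unsigned char *storage, size_t size)
{
    assert(storage != NULL);
    b->t.text = storage;
    b->t.leng = 0;
    b->read = 0;
    b->size = size;
}
```

A caller holding a buff_t `b` can then pass `&b.t` to any function expecting a word_t, with no copy and no temporary object.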
>>Note that fgetsl wouldn't have to be aware of buff at all, a wrapper
>>could take care of that, but I'm not separating the buff out now.
>
> Providing a more uniform interface using buff_t and word_t lessens
> the need for wrappers. That's a good thing (TM).
As long as it's understandable, yes.
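For comparison, the wrapper approach mentioned above might look roughly like this. `raw_getline()` is only a stand-in written for this sketch; the real fgetsl() signature is not shown in this thread and is not assumed here. `struct linebuf` and `linebuf_getline()` are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a raw line reader that knows nothing about buffers.
 * Reads at most max-1 bytes of one line, NUL-terminates the result,
 * and returns the byte count, or EOF if nothing could be read. */
static int raw_getline(char *buf, int max, FILE *in)
{
    int n = 0, c;

    assert(max >= 2);
    while (n < max - 1 && (c = getc(in)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;
    }
    buf[n] = '\0';
    return (n == 0) ? EOF : n;
}

/* Hypothetical buffer type; not bogofilter's buff_t. */
struct linebuf {
    char *text;   /* caller-owned storage */
    size_t leng;  /* bytes already filled */
    size_t size;  /* capacity of text */
};

/* Wrapper: appends one line after the existing content.
 * Returns bytes added, 0 if there is no room left, or EOF. */
static int linebuf_getline(struct linebuf *b, FILE *in)
{
    size_t room = b->size - b->leng;
    if (room < 2)
        return 0;

    int n = raw_getline(b->text + b->leng, (int)room, in);
    if (n != EOF)
        b->leng += (size_t)n;
    return n;
}
```

The reader stays ignorant of the buffer type, at the cost of one extra layer that callers have to understand.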
>>My fix doesn't really speed things up though :-(
>
> The fgetsl() fix forced the lexer to process the _whole_ file. Without
> the fix, the lexer was processing approx 20K of 100K. As the lexer's
> time for processing lots of characters doesn't seem to be linear, the
> fgetsl() fix caused the time to process the test file to increase more
> than fivefold.
Yup. And I wonder whether the issue here is memory allocation or backing
up in the scanner.
--
Matthias Andree