The joy of buffer switching....
Nick Simicich
njs at scifi.squawk.com
Mon Feb 24 16:25:29 CET 2003
I have spent several more hours trying to work on moving buffers from
flexer to flexer. I am convinced that the approach will not work. It is
supposed to, but it does not.
The specific issue is this:
(gdb) p *yy_current_buffer
$29 = {yy_input_file = 0x40231ce0,
  yy_ch_buf = 0x80e42f0 "\nFrom glamb at dynagen.on.ca Fri Nov 1 06:20:26 2002\n56 at wall.org>\n",
  yy_buf_pos = 0x80e42f1 "From glamb at dynagen.on.ca Fri Nov 1 06:20:26 2002\n56 at wall.org>\n",
  yy_buf_size = 16384, yy_n_chars = 1, yy_is_our_buffer = 1,
  yy_is_interactive = 1, yy_at_bol = 1, yy_fill_buffer = 1,
  yy_buffer_status = 1}
(gdb) p yy_n_chars
$30 = 2
(gdb)
We are in the call where the buffer is being extracted. We have already
moved this buffer head-to-text, and that worked. Now we are moving it
text-to-head. This is the first time we are extracting a buffer from the
plain text flexer, in yy_switch_to_buffer(new_buffer). The code that is
about to be executed is going to completely screw up the buffer: it will
overlay the 'o' in "From" with a null. The lexer has parsed the "From "
token out, and we should be saving the rest of the line for the next state.
I am through working on this for the day. If someone else can come up with
a workable buffer swapping scheme, I will certainly listen, or if someone
can tell me what I am doing wrong. I am still essentially running the patch
I posted yesterday, except that I turned off optimization so that I can run
gdb more easily and so that the trace commands work more predictably.
THIS IS SUPPOSED TO WORK, as far as I can tell from the man page. You are
*SUPPOSED* to be able to stash a partially processed buffer, then go off
and do something else with the lexer, then return to the buffer, with the
buffer holding the state for where you are in the input stream. The buffer
swapping is the essence of processing include files. You hide the input
buffer in your own data structure, switch buffers, and handle the input
associated with the new buffer.
Doing the yy_switch_to_buffer() is supposed to take the variables that are
up in the air inside the lexer and stash them inside the buffer's state
variables. But it just is not working. I spent a while tracing this,
watching it go wrong, until I realized that the yy_switch_to_buffer was
hosing the buffer.
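
For illustration only, here is a toy model in plain C of the bookkeeping
described above. It is not flex's generated code: the struct and function
names (toy_buffer, toy_scanner, toy_match, toy_switch) are made up and only
echo flex's yy_hold_char, yy_c_buf_p and yy_buf_pos. It assumes the usual
flex behaviour of NUL-terminating the current token in place and
remembering the overwritten character, which then has to be put back
before the buffer is parked.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT flex's real structures; names are hypothetical. */
struct toy_buffer {
    char *ch_buf;   /* writable characters being scanned */
    char *buf_pos;  /* scan position saved while the buffer is parked */
};

struct toy_scanner {
    struct toy_buffer *current; /* buffer currently being scanned */
    char *c_buf_p;              /* live scan position, kept outside the buffer */
    char  hold_char;            /* character currently hidden by the NUL */
    char *text;                 /* current token, NUL-terminated in place */
};

/* Match `len` characters as a token. The previous token's NUL is undone
   first, then the new token is terminated in place. */
static void toy_match(struct toy_scanner *s, size_t len)
{
    assert(s->current != NULL);
    *s->c_buf_p = s->hold_char;   /* undo the previous terminator */
    s->text = s->c_buf_p;
    s->c_buf_p += len;
    s->hold_char = *s->c_buf_p;   /* remember what the NUL will hide */
    *s->c_buf_p = '\0';
}

/* Park the current buffer and activate `next`. The held character goes
   back into the old buffer, and the live position is stored in it. */
static void toy_switch(struct toy_scanner *s, struct toy_buffer *next)
{
    if (s->current != NULL) {
        *s->c_buf_p = s->hold_char;
        s->current->buf_pos = s->c_buf_p;
    }
    s->current = next;
    s->c_buf_p = next->buf_pos;
    s->hold_char = *s->c_buf_p;
}
```

In this model, a buffer that is parked right after "From " has been matched
gets its overwritten character back and resumes at the character after the
token. If the live position or held character belongs to a different
scanner than the buffer does, the restore step writes to the wrong place,
which is one way a switch could damage a buffer.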
I have two approaches left to try. One is to call yy_switch_to_buffer
from within the rule rather than from outside. I will try adding the code
to the processing of "From".
If that fails, then I will work on forcing in EOFs and moving detection of
From, mime boundaries and header ends to the code outside of the lexer.
Feeding the lexer artificial EOFs at the end of every section is probably
clean enough to work unconditionally.
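
A minimal sketch of what detecting "From" outside the lexer might look
like, in plain C. The helper name section_end is hypothetical, and it only
handles the "From " separator, not mime boundaries or header ends. It
assumes the whole input is already in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset at which the section starting
   at `start` ends. That is the offset of the next line beginning with
   "From ", or `len` if there is no such line. */
static size_t section_end(const char *buf, size_t len, size_t start)
{
    static const char sep[] = "\nFrom ";
    const size_t sep_len = sizeof sep - 1;
    size_t i;

    for (i = start; i + sep_len <= len; i++) {
        if (memcmp(buf + i, sep, sep_len) == 0)
            return i + 1;   /* the newline stays with the earlier section */
    }
    return len;
}
```

Each [start, end) range could then be handed to the lexer as its own
input, for example with yy_scan_bytes(), so that the lexer sees an
ordinary end of input at the end of every section.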
--
SPAM: Trademark for spiced, chopped ham manufactured by Hormel.
spam: Unsolicited, Bulk E-mail, where e-mail can be interpreted generally
to mean electronic messages designed to be read by an individual, and it
can include Usenet, SMS, AIM, etc. But if it is not all three of
Unsolicited, Bulk, and E-mail, it simply is not spam. Misusing the term
plays into the hands of the spammers, since it causes confusion, and
spammers thrive on confusion. Spam is not speech, it is an action, like
theft, or vandalism. If you were not confused, would you patronize a spammer?
Nick Simicich - njs at scifi.squawk.com - http://scifi.squawk.com/njs.html
Stop by and light up the world!