The concept of using lex to parse comments and html tags out of html...
Nick Simicich
njs at scifi.squawk.com
Tue Feb 18 03:29:39 CET 2003
At 09:05 AM 2003-02-17 -0500, David Relson wrote:
>Nick,
>
>You've been quietly busy :-) I was wondering what you were up to. Now I
>know.
>
>A quick test with a randomly chosen hunk of html shows that the tokenizer
>does _something_. Now it's time to evaluate and learn more about what it
>really does.
>
>By the way, since the 40 line definition of YY_INPUT makes it hard to step
>through in the debugger, I created a function yyinput(). The code now
>looks like:
Why would you need to debug that code? It all worked the first time, or at
least everything I tested did. I expected it to work: it simply steps
through the buffer and stashes a little as needed.
>#define YY_INPUT(buf,result,max_size) result=yyinput(buf,max_size)
>int yyinput(char *buf, int max_size)
>{
>... [your code, without the trailing backslashes] ..
>return result;
>}
This should work as well. By the way, all of the code that actually read
from the file was a straight copy from the original YY_INPUT macro. Once
you are through testing, you probably want to de-merge this - you have a
YY_INPUT scheme that already works.
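
For anyone following along, the wrapper arrangement David describes could be sketched roughly as below. This is only an illustration, not the code from this thread: the fread-based body is an assumption standing in for the elided 40-line macro body, and sketch_in is a hypothetical stand-in for flex's yyin so the fragment compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for flex's yyin, so this sketch builds without a
 * generated scanner. Inside a real scanner, read from yyin instead. */
static FILE *sketch_in;

/* Fill buf with up to max_size bytes and return how many were read.
 * A return of 0 signals end of input (flex's YY_NULL is 0). */
int yyinput(char *buf, int max_size)
{
    assert(buf != NULL && max_size > 0);
    size_t n = fread(buf, 1, (size_t)max_size, sketch_in);
    /* Assumption: any stashing or preprocessing of buf would go here. */
    return (int)n;
}

/* The macro shrinks to one line, so a debugger can step into yyinput()
 * instead of a multi-line macro expansion. */
#define YY_INPUT(buf, result, max_size) ((result) = yyinput((buf), (max_size)))
```

In a flex source file, the function and the #define would sit in the %{ ... %} part of the definitions section, with sketch_in replaced by yyin. Because the body is an ordinary function, a breakpoint can be set on yyinput directly.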
>Let me spend some more time with this. I'll try merging it into a test
>bogofilter and see what happens.
>
>Good Work!!!!
You may want to wait and see about that.
>David
>
>
>
--
SPAM: Trademark for spiced, chopped ham manufactured by Hormel.
spam: Unsolicited, Bulk E-mail, where e-mail can be interpreted generally
to mean electronic messages designed to be read by an individual, and it
can include Usenet, SMS, AIM, etc. But if it is not all three of
Unsolicited, Bulk, and E-mail, it simply is not spam. Misusing the term
plays into the hands of the spammers, since it causes confusion, and
spammers thrive on confusion. Spam is not speech, it is an action, like
theft, or vandalism. If you were not confused, would you patronize a spammer?
Nick Simicich - njs at scifi.squawk.com - http://scifi.squawk.com/njs.html
Stop by and light up the world!