The concept of using lex to parse comments and html tags out of html...
David Relson
relson at osagesoftware.com
Mon Feb 17 15:05:47 CET 2003
Nick,
You've been quietly busy :-) I was wondering what you were up to. Now I know.
A quick test with a randomly chosen hunk of html shows that the tokenizer
does _something_. Now it's time to evaluate and learn more about what it
really does.
By the way, since the 40-line definition of YY_INPUT makes it hard to step
through in the debugger, I created a function yyinput(). The code now
looks like:
#define YY_INPUT(buf,result,max_size) result=yyinput(buf,max_size)
int yyinput(char *buf, int max_size)
{
    ... [your code, without the trailing backslashes] ..
    return result;
}
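
For anyone who wants to try the same trick, here is a minimal, self-contained sketch of the pattern. It is not the code from the patch: the in-memory source and the helper yyinput_set_source() are hypothetical stand-ins for whatever the original macro body reads from. Only the shape matters, namely that YY_INPUT shrinks to a one-line call and the real work sits in an ordinary function where a debugger can set breakpoints.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical input source for illustration only; the real yyinput()
   would read from wherever the original macro body did. */
static const char *src_text = "";
static size_t src_pos = 0;

/* Hypothetical helper: point the reader at a new string. */
void yyinput_set_source(const char *text)
{
    src_text = text;
    src_pos = 0;
}

/* Copy up to max_size bytes into buf and return the count.
   A return of 0 tells the scanner that input is exhausted. */
int yyinput(char *buf, int max_size)
{
    size_t remaining, n;

    assert(buf != NULL && max_size > 0);
    remaining = strlen(src_text) - src_pos;
    n = remaining < (size_t)max_size ? remaining : (size_t)max_size;
    memcpy(buf, src_text + src_pos, n);
    src_pos += n;
    return (int)n;
}

/* The macro is now a thin wrapper, so it needs no trailing backslashes. */
#define YY_INPUT(buf, result, max_size) \
    ((result) = yyinput((buf), (max_size)))
```

With this in place, a breakpoint on yyinput() stops on every buffer refill, which is not practical while the logic lives inside a multi-line macro.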
Let me spend some more time with this. I'll try merging it into a test
bogofilter and see what happens.
Good Work!!!!
David