html comment processing

Greg Louis glouis at dynamicro.on.ca
Sun Mar 30 14:58:55 CEST 2003


On 20030329 (Sat) at 2057:50 -0500, David Relson wrote:
> 
> For the html purists, I propose to add a config file option named 
> "strict_comment".  A value of "true" will cause bogofilter to follow the 
> standard and a value of "false" will work as described above.  The default 
> value will be "false".
> 
It would be comforting to know how well the loose interpretation works
before releasing it, IMHO.  That is, we should run some actual
experiments and make sure there's an improvement.  I've got two more
s/mindev scans going at present, but after that I can find clock cycles
for this purpose.  We'd want test corpora that can tell us two things:

1.  Does loose comment processing catch significantly more spam?
2.  Does loose comment processing introduce more risk of missing valid
    tokens?  (I don't see why it should, but data are better than
    theories without data.)
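
For what it's worth, here is a rough sketch of how the two runs might be
compared once we have per-message verdicts from each mode.  Everything in
it is an assumption for illustration: the function names, the idea of
holding results as parallel lists of booleans, and the choice of an exact
McNemar test on the messages where the two modes disagree.  It is not
anything bogofilter provides.  Note that it only sees question 2
indirectly, through classification errors rather than through the tokens
themselves.

```python
from math import comb


def error_counts(labels, verdicts):
    """Return (false_negatives, false_positives) for one run.

    labels[i] is True when message i really is spam; verdicts[i] is True
    when the filter called it spam.  Both lists are hypothetical inputs.
    """
    pairs = list(zip(labels, verdicts))
    false_neg = sum(1 for is_spam, said_spam in pairs if is_spam and not said_spam)
    false_pos = sum(1 for is_spam, said_spam in pairs if not is_spam and said_spam)
    return false_neg, false_pos


def mcnemar_exact(labels, strict, loose):
    """Two-sided exact McNemar p-value for strict vs. loose verdicts.

    Only messages that exactly one mode classified correctly count;
    messages both modes got right (or both got wrong) carry no information
    about which mode is better.
    """
    triples = list(zip(labels, strict, loose))
    only_loose_right = sum(1 for y, s, l in triples if l == y and s != y)
    only_strict_right = sum(1 for y, s, l in triples if s == y and l != y)
    n = only_loose_right + only_strict_right
    if n == 0:
        return 1.0  # the modes never disagree, so there is nothing to test
    k = min(only_loose_right, only_strict_right)
    # Binomial(n, 1/2) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Running error_counts once per mode gives the raw miss and false-alarm
counts; mcnemar_exact then says whether the difference between modes is
bigger than chance would explain on the same corpus.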

-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



