how many tokens?

Chris Wilkes cwilkes-bf at ladro.com
Wed Feb 26 18:59:52 CET 2003


On Wed, Feb 26, 2003 at 12:43:45PM -0500, David Relson wrote:
>
> No Due Dates<FONT COLOR="fef0d0">zzzzzz</FONT>No Hidden Charges<FONT
> COLOR="#fef0d0">zzzzzz</FONT>No
>
> line 3 - Should "Dates<FONT COLOR="fef0d0">zzzzzz</FONT>No" produce two
> tokens ("dates", "zzzzzz") or just one, i.e. "dateszzzzzzno" ?
  
Do you want to keep the FONT tags around?  A lot of spam HTML email has
crazy fonts all over the place and I think a count of them would help
identify spam.
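
To make the options concrete, here is a rough Python sketch of the two tokenizations being asked about, plus the FONT tag count I'm suggesting. This is not bogofilter's lexer; the function names and the letters-only word pattern are assumptions made up for illustration.

```python
import re

# Hypothetical patterns for illustration; not bogofilter's actual lexer rules.
FONT_TAG = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)   # opening or closing FONT tag
OPEN_FONT = re.compile(r"<font\b[^>]*>", re.IGNORECASE)    # opening FONT tag only
WORD = re.compile(r"[a-z]+")                               # assumed: a token is a run of letters


def tokens_tags_as_separators(text):
    """Treat each FONT tag as a word break."""
    return WORD.findall(FONT_TAG.sub(" ", text).lower())


def tokens_tags_removed(text):
    """Delete FONT tags outright, so the surrounding text runs together."""
    return WORD.findall(FONT_TAG.sub("", text).lower())


def count_font_tags(text):
    """Count opening FONT tags, as a possible extra spam indicator."""
    return len(OPEN_FONT.findall(text))


sample = 'Dates<FONT COLOR="fef0d0">zzzzzz</FONT>No'
print(tokens_tags_as_separators(sample))
print(tokens_tags_removed(sample))
print(count_font_tags(sample))
```

Either tokenizer can be combined with the tag count, so dropping the tags from the token stream doesn't have to mean losing the information that they were there.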
  
Of course, I'm of the mind that any HTML email I get is highly suspect
from the get-go.  Maybe I should add a pre-filter to the script I use to
run BF so I can have separate databases and cutoff rules for text and
HTML email.  Anyone doing that?
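
Something like the following Python sketch is what I have in mind for the pre-filter. It only decides which profile a message belongs to; actually calling BF with that profile is left out. The directory names and cutoff values are invented placeholders, not bogofilter defaults.

```python
from email import message_from_string

# Hypothetical profiles: the paths and cutoffs are placeholders for illustration.
PROFILES = {
    "html": {"db_dir": "~/.bogofilter-html", "cutoff": 0.90},
    "text": {"db_dir": "~/.bogofilter-text", "cutoff": 0.95},
}


def classify_format(raw_message):
    """Return "html" if any MIME part is text/html, otherwise "text"."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return "html"
    return "text"


def pick_profile(raw_message):
    """Choose the database directory and cutoff for this message."""
    return PROFILES[classify_format(raw_message)]


html_mail = "Subject: offer\nContent-Type: text/html\n\n<html><body>No Due Dates</body></html>\n"
text_mail = "Subject: hello\n\nJust plain text.\n"
print(classify_format(html_mail), pick_profile(html_mail)["db_dir"])
print(classify_format(text_mail), pick_profile(text_mail)["db_dir"])
```

A multipart message with both a plain and an HTML part lands in the "html" profile here; whether that is the right call is open to debate.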

Chris




More information about the bogofilter-dev mailing list