Markup.

michael at optusnet.com.au
Sat May 10 07:03:37 CEST 2003


David Relson <relson at osagesoftware.com> writes:
> Michael,
> 
> Nice results!  It looks like your additional symbols _are_ of value.
> 
> I'll see about adding your changes to bogofilter.  If you don't mind,
> I'll call the option "html_markup" and create tokens in form
> "html:comment:4".

It might be an idea to leave it at just 'markup'. I know
that I started with just the html tags, but the next step
is to notice things like extended whitespace in the subject
line or in the email address, and so on: things that don't
have much to do with html.

The other thing I struggled with slightly was being able to
insert tokens when the message ends. I wasn't able to find
some place that noticed the end of an email that wasn't
a reset point.

(What I'm looking to do here is collect statistics over the course
of an email, and at the end check them and insert appropriate tokens.
It didn't seem easily do-able, though.)
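
To make the idea concrete, here is a rough sketch in Python rather than
in bogofilter's own C. Everything in it is an assumption for
illustration only: the MarkupStats class, the tokens_for() helper, the
"markup:..." token names and the whitespace threshold of 5 are all
made up here and are not part of bogofilter. It counts html comments
and the longest whitespace run in the subject line while reading the
message, and only produces tokens once the end of the message is seen.

```python
import re


class MarkupStats:
    """Accumulates per-message statistics; tokens are produced only at the end."""

    # Hypothetical threshold: a run of this many blanks counts as "extended".
    SUBJECT_SPACE_THRESHOLD = 5

    def __init__(self):
        self.html_comments = 0
        self.subject_space_run = 0

    def feed_line(self, line, in_header):
        """Update the running statistics from one line of the message."""
        if in_header:
            if line.lower().startswith("subject:"):
                subject = line[len("subject:"):].strip()
                runs = re.findall(r"[ \t]{2,}", subject)
                self.subject_space_run = max((len(r) for r in runs), default=0)
        else:
            # Count html comment openers in the body only.
            self.html_comments += line.count("<!--")

    def end_of_message(self):
        """Turn the collected statistics into tokens (names are illustrative)."""
        tokens = []
        if self.html_comments:
            tokens.append("markup:comment:%d" % self.html_comments)
        if self.subject_space_run >= self.SUBJECT_SPACE_THRESHOLD:
            tokens.append("markup:subject:whitespace")
        return tokens


def tokens_for(message_text):
    """Feed a whole message through MarkupStats and return the final tokens."""
    stats = MarkupStats()
    in_header = True
    for line in message_text.splitlines():
        if in_header and line == "":
            in_header = False  # first blank line separates header from body
            continue
        stats.feed_line(line, in_header)
    return stats.end_of_message()
```

The point of the sketch is only the shape: statistics are updated line
by line, and the token list is built in one place that runs exactly
once per message, which is the hook I couldn't find.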

> Like you, I wouldn't worry too much about it.  The benefits seem
> pretty clear and there's always the occasional message that's
> virtually impossible to classify - even for a human.  I see some
> computer related messages, for example WinXPnews and TigerDirect, that

Oh, is WinXPnews spam? I've been calling it ham! *sigh*.
(I'm using spam in this sense to mean "any auto-generated
email that the user didn't ask for").

> would be ham if directed to me.  For whatever reason, they're sent to
> my 10 yr old.  Because of that, I classify them as spam.
> 



