Month Abbreviations as Stopwords

Suzanne Skinner tril at igs.net
Mon Jan 13 05:52:12 CET 2003


David Relson wrote:

> Evidently, you've not been looking at the MIME branch of development 
> (where the new MIME parsing code presently exists).

Er, nope. I actually hadn't noticed that a branch had occurred in CVS.

> The current processing of HTML discards all text within tags.

Hmm...is that going to be made an option? Stuff like the FF0000 in
'font color=#FF0000' can be really useful as spam indicators :-)
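
To make the difference concrete, here is a rough sketch in Python (for
brevity; this is not bogofilter's actual lexer, and the function names and
the token pattern are made up for illustration). It compares tokenizing
with tag contents discarded against tokenizing with them kept:

```python
import re

# Assumption: a "tag" is anything between '<' and '>' and a token is a run
# of ASCII letters and digits. A real lexer would be more careful.
TAG_RE = re.compile(r"<[^>]*>")
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokens_discarding_tags(html):
    """Drop everything inside <...>, then tokenize what is left."""
    return TOKEN_RE.findall(TAG_RE.sub(" ", html))


def tokens_keeping_tags(html):
    """Tokenize the whole text, including tag names and attribute values."""
    return TOKEN_RE.findall(html)


sample = "<font color=#FF0000>Buy now</font>"

print(tokens_discarding_tags(sample))  # ['Buy', 'now']
print(tokens_keeping_tags(sample))     # ['font', 'color', 'FF0000', 'Buy', 'now', 'font']
```

With the tags discarded, FF0000 never reaches the scorer at all; with them
kept, it shows up as an ordinary token alongside 'font' and 'color'.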

> A project that has come up from time to time is to implement an "ignore"
> list, i.e. a list of words that should be ignored when scoring
> messages.

Maybe separate ignore lists for each of header, text body, and HTML body?
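
Roughly what I have in mind, again as a Python sketch rather than anything
resembling real bogofilter code. The list format (one word per line), the
section names, and the sample words are all assumptions for illustration:

```python
def parse_ignore_list(text):
    """Parse a plain-text ignore list: one word per line, blank lines skipped."""
    words = set()
    for line in text.splitlines():
        word = line.strip().lower()
        if word:
            words.add(word)
    return words


# Hypothetical per-section lists. In practice each would be read from its
# own plain-text file that the user maintains with any editor.
IGNORE = {
    "header": parse_ignore_list("jan\nfeb\nmon\ncet\n"),
    "text": parse_ignore_list("the\nand\n"),
    "html": parse_ignore_list("font\ncolor\n"),
}


def filter_tokens(section, tokens):
    """Return the tokens not on the ignore list for this message section."""
    ignored = IGNORE.get(section, set())
    return [t for t in tokens if t.lower() not in ignored]


print(filter_tokens("header", ["Mon", "Jan", "13", "2003"]))  # ['13', '2003']
print(filter_tokens("html", ["font", "color", "FF0000"]))     # ['FF0000']
print(filter_tokens("text", ["Jan", "sale"]))                 # ['Jan', 'sale']
```

The point of keeping the lists separate is visible in the last line: a
month abbreviation is noise in a Date: header but is left alone in the body.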

> The idea was to have the list be easily maintainable by a
> user.  Using a plain text list would allow maintenance with any old text
> editor.  If you're looking for a project, I can send you a partially
> completed version of an ignore list implementation :-)

Send away!

Suzanne

-- 
tril at igs.net - http://www.igs.net/~tril/

A Pope has a Water Cannon.                               It is a Water Cannon.
He fires Holy-Water from it.                        It is a Holy-Water Cannon.
He Blesses it.                                 It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it.          It is a Wholly Holy Holy-Water Cannon.
He has it pierced.                It is a Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive.                                       He shoots them.
                                    -- Principia Discordia




More information about the bogofilter-dev mailing list