Spam / ham registration issue

Tom Anderson tanderso at oac-design.com
Wed Mar 3 15:20:50 CET 2004


On Wed, 2004-03-03 at 09:10, Mail Delivery Subsystem wrote:
> <<< 550 Error: "Message rejected because it contains the word 'p3nis'."

That's ridickulous.
On Wed, 2004-03-03 at 08:33, Boris 'pi' Piwinger wrote:
> Increasing n will only make the overall result approach .5
> (so Tom was wrong that it would ever become a significant
> token, it becomes more and more insignificant).

Noted.  I wonder if that's the best approach though.  Despite my "racial
profiling" comment, it seems that repeat offenders ought to be a
consideration.  If scoring is based on percentage of spam in which this
token has occurred, then tokens will be competing for top scoring.  If I
get a bunch of "p3nis enlargment" spams, it shouldn't make "v1agra" any
less spammy, but that seems to be the effect.  Right?
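
To make the worry concrete, here is a minimal sketch. It assumes, purely
for illustration, that a token's score is the fraction of registered spam
messages that contain it. That is the hypothetical described above, not
necessarily what bogofilter computes, and the function name and counts are
made up.

```python
def spam_fraction(token_spam_count, total_spam_msgs):
    """Hypothetical score: share of registered spam messages containing the token."""
    if total_spam_msgs == 0:
        return 0.0
    return token_spam_count / total_spam_msgs

# 10 spams registered so far, all of them mention "v1agra".
before = spam_fraction(10, 10)

# 30 "p3nis" spams are then registered; none mention "v1agra".
after = spam_fraction(10, 40)

# Under this assumed scoring, "v1agra" drops even though nothing about it changed.
print(before, after)
```

If scoring really worked this way, every newly popular spam token would
dilute the older ones, which is the competition effect in question.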

<PI>To continue my analogy to law enforcement... if suddenly some Asian
gang starts a crime spree, it doesn't make that black convicted murderer
any less of a criminal.</PI>  <PC>Or the white book-cooker.</PC>  In
other words, stereotyping a whole race is wrong, but keeping a criminal
record on an individual is effective.

In that vein, it'd be nice if the spam/ham ratio on a per-token basis
were taken into account in some way.  The spam/ham ratio on a per-email
basis is unreasonable though.  Innocent until proven guilty, but once
proven guilty, we keep score.
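
A rough sketch of what "keeping score" per token might look like, assuming
each token carries only its own spam and ham counts. The function name and
the 0.5 default for unseen tokens are assumptions for illustration, and the
sketch ignores the total number of spam and ham messages, which a real
filter would likely need to account for.

```python
def token_spamicity(spam_count, ham_count):
    """Hypothetical per-token score based only on that token's own record."""
    seen = spam_count + ham_count
    if seen == 0:
        return 0.5  # never seen: no evidence either way (assumed default)
    return spam_count / seen

# "v1agra" has appeared in 10 spams and no ham.
v1agra_before = token_spamicity(10, 0)

# Registering 30 unrelated "p3nis" spams leaves the "v1agra" counts untouched,
# so its score is computed from the same inputs as before.
v1agra_after = token_spamicity(10, 0)
p3nis = token_spamicity(30, 0)

print(v1agra_before, v1agra_after, p3nis)
```

Here a flood of one kind of spam does not change the record of a token it
never contains.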

> I always found -Q listed too much. Now thinking about it, I
> think it should print the complete configuration. Best would
> be to do it in a form which is a correct config file. This
> would, e.g., require to write the line with the version
> number after a #.

Good idea.  I like the idea of a valid config file output.
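
Something along these lines is what I picture, as a sketch only. It
assumes the config file takes name=value lines with # for comments; the
function name, the version placeholder and the option values are all made
up for the example.

```python
def dump_config(version, options):
    """Render options as text that could be fed back in as a config file.

    The version goes on a comment line so it does not break parsing.
    """
    lines = [f"# bogofilter version {version}"]
    for name in sorted(options):
        lines.append(f"{name}={options[name]}")
    return "\n".join(lines) + "\n"

# Placeholder version and made-up values, for illustration only.
example = dump_config("X.Y.Z", {
    "spam_subject_tag": "[SPAM]",
    "unsure_subject_tag": "[UNSURE]",
})
print(example, end="")
```

A wrapper could then read that output with the same parser it uses for the
real config file.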

BTW, I would really love it if at least spam_subject_tag and
unsure_subject_tag could be included.  I don't know if bogofilter
already strips those out during registration or not, but I do it in
bfproxy based on default values for these.  Having the correct values
would be better.
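
For reference, the kind of stripping I mean looks roughly like the sketch
below. The tag values are made up rather than the real defaults, and the
function is illustrative, not the actual bfproxy code.

```python
def strip_subject_tags(subject, tags):
    """Remove any leading tags from a Subject header value before registration."""
    stripped = subject.lstrip()
    changed = True
    while changed:  # repeat in case several tags were stacked
        changed = False
        for tag in tags:
            if tag and stripped.startswith(tag):
                stripped = stripped[len(tag):].lstrip()
                changed = True
    return stripped

# Hypothetical tag values; the real ones would come from the configuration.
TAGS = ["[SPAM]", "[UNSURE]"]
print(strip_subject_tags("[SPAM] cheap meds", TAGS))
```

With the configured values available, TAGS would no longer have to be
guessed from defaults.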

Tom