Database Size versus Shannon's Word Entropy

Rick van Rein rick at openfortress.nl
Wed Oct 25 15:58:39 CEST 2017


Hi Matthias,

You mentioned that the write volume would increase when last-read was
recorded; but in update mode you would write anyway.

Also, are you aware that statistics has an elegant method of gradually
and continuously forgetting things?
http://mathworld.wolfram.com/ExponentialDistribution.html
It's like specifying the count and date all in one float :)
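
A minimal sketch of that idea, assuming exponential decay with a fixed
half-life. The names (HALF_LIFE_DAYS, record_hit, decayed_count) and the
30-day value are made up for illustration and are not part of bogofilter.
Each sighting on day t contributes exp(DECAY * t); the single stored float
is the log of the sum, so it grows with both the count and the recency.

```python
import math

HALF_LIFE_DAYS = 30.0                    # assumed value, for illustration only
DECAY = math.log(2.0) / HALF_LIFE_DAYS   # decay rate per day


def record_hit(score, day):
    """Fold one sighting on `day` into the stored float.

    `score` is log(sum(exp(DECAY * day_i))) over all sightings so far,
    or None if the token has never been seen.
    """
    event = DECAY * day
    if score is None:
        return event
    hi, lo = max(score, event), min(score, event)
    # Numerically stable log(exp(hi) + exp(lo)).
    return hi + math.log1p(math.exp(lo - hi))


def decayed_count(score, today):
    """Effective count as seen from `today`; older sightings weigh less."""
    if score is None:
        return 0.0
    return math.exp(score - DECAY * today)
```

Two sightings on day 0 give an effective count of about 2 on day 0 and
about 1 one half-life later. No separate last-seen field is needed, and a
record can be expired once its decayed count drops below some threshold.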


> What goal are you trying to achieve by
> receiver-extension specific filtering?

In terms of user facilitation:
Grouping related activities together, protecting privacy by keeping them
separate and having independent ACLs on each.

In terms of Bogofilter delivering to multiple recipients:
Using the term scoring to figure out which alias would be the most
likely recipient for a message.  So, not spam filtering, but
subclassification of the content on the non-spam side.
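
As a rough illustration of such subclassification, here is a plain
multinomial naive Bayes scorer with add-one smoothing. This is a
substitute technique and not bogofilter's own scoring; the AliasScorer
class and the alias names in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict


class AliasScorer:
    """Score a message against several aliases instead of spam/ham."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # alias -> term -> count
        self.msg_counts = Counter()              # alias -> messages trained

    def train(self, alias, text):
        self.term_counts[alias].update(text.lower().split())
        self.msg_counts[alias] += 1

    def scores(self, text):
        """Return a log-probability score per trained alias."""
        vocab = set()
        for counts in self.term_counts.values():
            vocab.update(counts)
        total_msgs = sum(self.msg_counts.values())
        result = {}
        for alias, counts in self.term_counts.items():
            total_terms = sum(counts.values())
            logp = math.log(self.msg_counts[alias] / total_msgs)
            for term in text.lower().split():
                # Add-one smoothing keeps unseen terms from zeroing the score.
                logp += math.log((counts[term] + 1) / (total_terms + len(vocab)))
            result[alias] = logp
        return result

    def best_alias(self, text):
        """Most likely alias; raises ValueError if nothing was trained."""
        s = self.scores(text)
        return max(s, key=s.get)
```

After training "rick+lists" on list traffic and "rick+shop" on order
confirmations, best_alias() on a new message returns whichever alias has
the higher score; a real setup would tokenize the way the filter does
rather than splitting on whitespace.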

> I wonder if there are other (than bogofilter) classifiers that support
> more than the spam/ham + unsure targets.
>
> CRM114 used to be something, not sure if it does that and if it's still
> maintained.

Thanks for the pointer; I also have no idea (and am really just playing
with the scope of facilities that this filtering technology could
achieve).  It might just fit very nicely in our architecture (we are
designing a more powerful open source software stack for hosting providers).


Thanks for the responses :)


-Rick

