singletons
David Relson
relson at osagesoftware.com
Sun Dec 28 03:00:08 CET 2003
On 28 Dec 2003 11:16:29 +1100
michael at optusnet.com.au wrote:
> David Relson <relson at osagesoftware.com> writes:
...[snip]...
> My current version has a patch similar to this, but I used
> a formula like
>
> #define MLOG(a) ((int) (5.0*log( 1.0 + (a) ) )) // 'a' is the raw count.
>
> instead. This provided better granularity in the small end
> of the change. (In my testing, the significance
> of '2' is very different from '3')
I like the MLOG idea. 'Tis more elegant. I'll have to give it a try.
> > Token-pairs are pretty easy. A flag in the get_token() routine and
> > remembering the previous token will allow the routine to alternate
> > between returning single tokens and token pairs.
>
> Something like the below maybe?? (no option flag in
> this patch though).
>
> (ps: I hand edited this patch; It may not apply cleanly, but
> you get the idea... )
I haven't fully grokked your code yet. Having a stack seems like
overkill, but I need to examine it more thoroughly.
My thoughts involved caching the current and previous words and having a
state variable that toggles between GET_NEW_WORD and RETURN_WORD_PAIR.
I think that would be enough to have get_token() alternate between
returning a new word and returning a word pair. Changing the tag
(head:, rcvd:, to:, etc.) would do an appropriate reset so a word pair
wouldn't be built from (say) the From: and To: lines. I've thought
about this, but haven't implemented it.
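The two-state idea above can be sketched roughly as follows. This is not bogofilter code and it is untested against the real lexer. The state names come from the message; everything else is an assumption for illustration: pair_ctx_t, pair_init(), pair_reset(), get_token_or_pair(), the '*' separator, the MAX_WORD limit, and the NULL-terminated word array that stands in for the lexer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD 64   /* arbitrary limit for this sketch */

typedef enum { GET_NEW_WORD, RETURN_WORD_PAIR } pair_state_t;

typedef struct {
    const char **words;    /* NULL-terminated list standing in for the lexer */
    size_t pos;            /* index of the next unread word */
    pair_state_t state;
    char prev[MAX_WORD];   /* previous word, "" if none since the last reset */
    char curr[MAX_WORD];   /* most recently returned word */
} pair_ctx_t;

/* Forget cached words; call this whenever the tag (head:, to:, ...) changes. */
void pair_reset(pair_ctx_t *ctx)
{
    assert(ctx != NULL);
    ctx->state = GET_NEW_WORD;
    ctx->prev[0] = '\0';
    ctx->curr[0] = '\0';
}

void pair_init(pair_ctx_t *ctx, const char **words)
{
    assert(ctx != NULL && words != NULL);
    ctx->words = words;
    ctx->pos = 0;
    pair_reset(ctx);
}

/* Writes the next token into out. Returns 1 on success, 0 at end of input. */
int get_token_or_pair(pair_ctx_t *ctx, char *out, size_t len)
{
    assert(ctx != NULL && out != NULL && len > 0);

    if (ctx->state == RETURN_WORD_PAIR) {
        /* Emit "prev*curr", then go back to fetching single words. */
        snprintf(out, len, "%s*%s", ctx->prev, ctx->curr);
        ctx->state = GET_NEW_WORD;
        return 1;
    }

    if (ctx->words[ctx->pos] == NULL)
        return 0;

    /* The current word becomes the previous one; read a new current word. */
    memcpy(ctx->prev, ctx->curr, sizeof ctx->prev);
    snprintf(ctx->curr, sizeof ctx->curr, "%s", ctx->words[ctx->pos++]);

    /* Only schedule a pair if a previous word exists since the last reset. */
    if (ctx->prev[0] != '\0')
        ctx->state = RETURN_WORD_PAIR;

    snprintf(out, len, "%s", ctx->curr);
    return 1;
}
```

For the words "from", "alice", "bob" this yields from, alice, from*alice, bob, alice*bob. Calling pair_reset() after from*alice yields just bob next, so no pair spans the reset.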