singletons

David Relson relson at osagesoftware.com
Sun Dec 28 03:00:08 CET 2003


On 28 Dec 2003 11:16:29 +1100
michael at optusnet.com.au wrote:

> David Relson <relson at osagesoftware.com> writes:

...[snip]...

> My current version has a patch similar to this, but I used
> a formula like 
> 
> #define MLOG(a) ((int) (5.0*log( 1.0 + (a) ) ))    // 'a' is the raw count.
> 
> instead. This provided better granularity at the small end
> of the range. (In my testing, the significance
> of '2' is very different from that of '3')

I like the MLOG idea.  'Tis more elegant.  I'll have to give it a try.

> > Token-pairs are pretty easy.  A flag in the get_token() routine and
> > remembering the previous token will allow the routine to alternate
> > between returning single tokens and tokens pairs.
> 
> Something like the below maybe?? (no option flag in
> this patch though).
> 
> (PS: I hand-edited this patch; it may not apply cleanly, but
> you get the idea...  )

I haven't fully grokked your code yet.  Having a stack seems like
overkill, but I need to examine it more thoroughly.

My thoughts involved caching the current and previous words and having a
state variable that toggles between GET_NEW_WORD and RETURN_WORD_PAIR. 
I think that would be enough to have get_token() alternate between
returning a new word and returning a word pair.  Changing the tag
(head:, rcvd:, to:, etc) would do an appropriate reset so a word pair
wouldn't be built from (say) the From: and To: lines.  I've thought
about this, but haven't implemented it.
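
A rough sketch of that toggle, to make the idea concrete.  This is not
bogofilter code: the struct names, sketch_get_token(), the in-memory word
array standing in for the real lexer, and the '*' pair separator are all
assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

enum pair_state { GET_NEW_WORD, RETURN_WORD_PAIR };

/* Hypothetical input: a word plus the tag (head:, rcvd:, to:, ...) it came from. */
struct word { const char *tag; const char *text; };

struct pair_lexer {
    const struct word *words;   /* stand-in for the real lexer's input */
    size_t count, next;
    enum pair_state state;
    const char *tag;            /* tag of the cached words, NULL if none yet */
    const char *prev;           /* NULL when no previous word under this tag */
    const char *curr;
};

static void pair_lexer_init(struct pair_lexer *lx,
                            const struct word *words, size_t count)
{
    lx->words = words;
    lx->count = count;
    lx->next = 0;
    lx->state = GET_NEW_WORD;
    lx->tag = lx->prev = lx->curr = NULL;
}

/* Returns 1 and fills 'out' with the next token, or 0 at end of input. */
static int sketch_get_token(struct pair_lexer *lx, char *out, size_t outsz)
{
    if (lx->state == RETURN_WORD_PAIR) {
        /* Emit the pair built from the two cached words, then toggle back. */
        snprintf(out, outsz, "%s*%s", lx->prev, lx->curr);
        lx->state = GET_NEW_WORD;
        return 1;
    }
    if (lx->next >= lx->count)
        return 0;

    const struct word *w = &lx->words[lx->next++];
    if (lx->tag == NULL || strcmp(lx->tag, w->tag) != 0) {
        /* Tag changed: reset so no pair spans two header fields. */
        lx->tag = w->tag;
        lx->curr = NULL;
    }
    lx->prev = lx->curr;
    lx->curr = w->text;
    snprintf(out, outsz, "%s", lx->curr);
    if (lx->prev != NULL)
        lx->state = RETURN_WORD_PAIR;   /* next call returns the pair */
    return 1;
}
```

Fed the words alice and bob under a from: tag, followed by carol and dave
under a to: tag, successive calls return alice, bob, alice*bob, carol,
dave, carol*dave.  No bob*carol pair is produced, because the tag change
clears the cached word.  Two cached pointers and one state variable are
enough here; no stack is involved.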



