Unregistering Mail

Nick Simicich njs at scifi.squawk.com
Fri Feb 7 11:14:43 CET 2003


At 03:47 PM 2003-02-06 -0500, David Relson wrote:
>Do people want the ability to unregister mail?  If so what would be the 
>preferred way to do it?  One of the above suggestions or something different?

I do not think that unregistering is that important, although there is a 
circumstance that I would have used it in.  However, at this point, (since 
I am using -u) I have many thousands of messages in the corpus. I see that 
you have begun dating the entries so that old entries can be expired.  I am 
"reorganizing my database" - at this point, I decided to do it by renaming 
the old corpus and doing a

bogoutil -d goodlist.old.db | bogoutil -l goodlist.db

I first tried dumping it to a file, but I ran out of disk space at about 
100 MB, even though the db file itself fits in under 15 MB.  The load is 
still running, and it has used over 95 minutes of CPU at this point.  I 
believe it is making progress rather than sitting in a loop, because if the 
load were blocked the dump would not be running either, and the dump has 
consumed about 23 minutes of CPU time itself.  CPU is split roughly 75% 
load, 20% dump.  (That does not add up to 100%: the CPU is idle about 0.3% 
of the time, and the rest goes to other processes.)

This seems like a lot, I will wait for it for a while longer.  I am not 
running any e-mail on the system.  I wonder if this is just the way it is, 
or if there is any other way?
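
For reference, the rename-and-reload step can be sketched as a small shell 
function.  This is only a sketch: the function name reorganize_wordlist and 
the .old.db naming are illustrative, and the only bogoutil usage assumed is 
the -d (dump) and -l (load) pair shown above.  Piping the dump directly 
into the load avoids writing the large text dump to disk.

```shell
# Sketch: rebuild a wordlist by renaming it and piping a dump into a load.
# reorganize_wordlist is a hypothetical helper; only `bogoutil -d` and
# `bogoutil -l` are taken from the command shown in the message.
reorganize_wordlist() {
    db=$1                    # e.g. goodlist.db
    old=${db%.db}.old.db     # e.g. goodlist.old.db

    if [ ! -f "$db" ]; then
        echo "no such file: $db" >&2
        return 1
    fi
    if [ -e "$old" ]; then
        echo "refusing to overwrite $old" >&2
        return 1
    fi

    mv "$db" "$old" || return 1
    # The text dump is never stored: it goes straight into the load.
    bogoutil -d "$old" | bogoutil -l "$db"
}
```

It would be called as: reorganize_wordlist goodlist.db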

However, I would rather not see any change to the meaning of the existing 
options.  Changing established options is a bad idea unless they are 
broken.  If unregistering is that important, add -U to be used in 
combination with -N or -S.  Normally, -N and -S are used to register mail.  
I think that the cleanest syntax is to make -U in conjunction with -N or -S 
do an unregister.

I read that someone thought that it was "compellingly simple" to change -N 
and -S to unregister only.

I think that the complexity of -Sn (having multiple options to do a single 
action) outweighs the simplicity.

There are two types of options we have:  "Action" options, and "Modifier" 
options.

Actions are options like -h, -n, -s, -N, -S and -u, plus the absence of any 
action option (which implies "look at the message"; there should be an 
explicit option for that, and it should be the default).  Modifiers are 
options like -2, -3, -d, -l, or -o, which change the way an action operates.

Many commands have this sort of setup.  What we are doing, in some ways, is 
to make an option both an action type, and, when used in combination with 
another option, a modifier.
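
To make the action/modifier split concrete, here is a minimal sketch using 
the shell's getopts.  The option letters are the ones listed above; the 
function name classify_options and the "default" label are made up for 
illustration, and every option is treated as a plain flag, which is a 
simplification and not a description of how bogofilter parses its command 
line.

```shell
# Sketch: sort option letters into actions and modifiers, and allow at
# most one action per invocation. classify_options is hypothetical.
classify_options() {
    action=""
    modifiers=""
    OPTIND=1
    while getopts "hnsNSu23dlo" opt "$@"; do
        case $opt in
            h|n|s|N|S|u)
                if [ -n "$action" ]; then
                    echo "error: -$action and -$opt are both actions" >&2
                    return 1
                fi
                action=$opt ;;
            2|3|d|l|o)
                modifiers="$modifiers$opt" ;;
            *)
                return 1 ;;
        esac
    done
    # No action option at all means the default: look at the message.
    echo "action=${action:-default} modifiers=${modifiers:-none}"
}
```

Under this rule a combination such as -Ns is rejected, because it names two 
actions in one invocation.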

I think that the command is cleaner if an option is one or the other.  Part 
of the issue is that if you allow

My choices would be, in order (Think Australian Preference Ballot):

1. -N -U or -S -U.  -U is a modifier that is only valid to modify the 
action of -N or -S, and it says: after removing the message's tokens from 
the database it is in, do not register those words in what would be the 
target database.  -U is not an action - it has no meaning by itself.  This 
has the advantage of not adding another action to the list of possible 
actions in the explanation.

2.  New options (-y for unregister from spam, -z for unregister from ham, 
for example - pick the letters you like).  That way, a single option always 
means the same thing every time.  We are adding two new "action options", 
but you can still only have one action option per command invocation.

3.  Do not add the function.  We have gone this long without it.

4.  -Ns - Allowing two actions - and changing the actions to mix/match.  I 
believe that this is confusing, and I am trying to remember another command 
that does similar processing that allows combination of major actions like 
this.

5.  -UNs.  -U becomes the action, and -Ns are then the modifiers.  I might 
actually put this ahead of four if we had no history.  If you feel that 
the history is unimportant, then you should rerank this as 4 for me.

6.  A configuration file option that allows options to mean different 
things at different sites.  Imagine answering questions here if we did not 
know what the options meant.  Imagine answering questions 
anywhere!  Someone says, "I entered bogofilter -X" and that tells you 
nothing, because you do not know what -X means at that site.  -h would 
have to report the configured options.
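
As an illustration of the first choice (-U as a pure modifier), here is a 
minimal getopts sketch.  It is not bogofilter code: plan_registration is a 
hypothetical helper that only prints what would happen, and the wording of 
its messages is an assumption based on the description in choice 1.

```shell
# Sketch of choice 1: -U is a modifier that is only valid with -N or -S.
# plan_registration is hypothetical and touches no wordlist.
plan_registration() {
    action=""
    unregister_only=no
    OPTIND=1
    while getopts "NSU" opt "$@"; do
        case $opt in
            N|S)
                if [ -n "$action" ]; then
                    echo "error: only one action option is allowed" >&2
                    return 1
                fi
                action=$opt ;;
            U)  unregister_only=yes ;;
            *)  return 1 ;;
        esac
    done
    if [ -z "$action" ]; then
        if [ "$unregister_only" = yes ]; then
            echo "error: -U has no meaning without -N or -S" >&2
        else
            echo "error: expected -N or -S" >&2
        fi
        return 1
    fi
    if [ "$unregister_only" = yes ]; then
        echo "-$action -U: remove tokens from current list, skip the target list"
    else
        echo "-$action: remove tokens from current list, add them to the target list"
    fi
}
```

With this shape, -N and -S keep their existing meaning, the order of -U 
relative to the action does not matter, and -U on its own is an error.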

--
SPAM: Trademark for spiced, chopped ham manufactured by Hormel.
spam: Unsolicited, Bulk E-mail, where e-mail can be interpreted generally 
to mean electronic messages designed to be read by an individual, and it 
can include Usenet, SMS, AIM, etc.  But if it is not all three of 
Unsolicited, Bulk, and E-mail, it simply is not spam. Misusing the term 
plays into the hands of the spammers, since it causes confusion, and 
spammers thrive on  confusion. Spam is not speech, it is an action, like 
theft, or vandalism. If you were not confused, would you patronize a spammer?
Nick Simicich - njs at scifi.squawk.com - http://scifi.squawk.com/njs.html
Stop by and light up the world!


