patch for DB byte order problem
twitham at quiknet.com
Sat Nov 16 23:05:22 EST 2002
I just got on your list to check out the conversation about byte order
of the database. Looks like you already have it under control, but I
have a couple of comments just to clarify my need for this. You said:
Gyepi> This is only a problem if you transfer a database, in binary
Gyepi> format, from one endian platform to another.
Gyepi> I had thought of this, but assumed that people would be using
Gyepi> bogoutil for such transfers.
Actually, the reason I needed it is quite different. I alluded to
it in the submission:
If you use bogofilter from both a little endian machine
(perhaps where you read mail) and a big endian one (perhaps
your mail server), it will not function properly.
At work my mail spool and home directory are on NFS. The mail server
is currently a Solaris machine, delivering mail to many users on the
huge unix network. So Sun hardware delivers my mail, invoking
procmail and bogofilter along the way. Big endian.
However, I read my mail on an Intel Linux machine. This is where I
teach or correct bogofilter by manually piping messages through it.
It's not practical to convert the database before correcting or
teaching. It's also not convenient to remember to log into a Solaris
machine for this. So I need my database to be readable by both platforms
at all times, so that it can be used from any type of machine on the network.
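To make the failure concrete, here is a minimal sketch, assuming the counts are stored as 32-bit integers in the writing machine's native byte order (an assumption for illustration; the actual value layout may differ). swap32() is a hypothetical helper, not a bogofilter function. It shows what the other platform sees when it reads the same four bytes: a count of 1 written on one machine reads as 16777216 on the other.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit value.
 * This is what effectively happens to a natively stored count when a
 * machine of the opposite endianness reads the same bytes. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

So swap32(1) is 0x01000000, which is why a shared database is misread unless both platforms agree on one byte order.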
Gyepi> Nonetheless, the data should be stored in a platform and
Gyepi> database version independent manner and the conversion should
Gyepi> be made whenever we set/get values so one does not *have* to
Gyepi> use bogoutil for transfers.
Exactly! I can build bogofilter such that each platform links in the
same version of DB, in case some versions are not compatible. But
bogofilter must handle the format of the data values.
Gyepi> I agree with the patch conceptually, but not the implementation
Gyepi> so I'll take responsibility for it and provide an update.
Gyepi> I vote for storing the data in little-endian format.
For me, big endian happens to be more efficient on my mail server
today. But I don't really care which way it goes; I chose that
because it is standard network format. Also, I looked at the DB
source and it seems to use big endian for the DB keys, so that would
make it consistent.
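For reference, a minimal sketch of what converting on every set/get could look like if network (big-endian) order were chosen. It assumes 32-bit counts and a POSIX system providing htonl()/ntohl(); pack_count() and unpack_count() are hypothetical names, not part of bogofilter or DB. A little-endian on-disk format would follow the same pattern with the opposite conversion.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htonl(), ntohl() (POSIX) */

/* Hypothetical helper: serialize a count into 4 bytes in network
 * (big-endian) order, regardless of the host's native byte order. */
static void pack_count(uint32_t count, unsigned char out[4])
{
    uint32_t net = htonl(count);
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: read a count back from 4 network-order bytes. */
static uint32_t unpack_count(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

On a big-endian host both conversions are no-ops, which is the efficiency point above; on x86 they cost one byte swap per value.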
But in the future, all mail servers will be on x86 hardware so
little-endian is better! :-)
Thanks for all your excellent work on bogofilter.
-- Tim Witham <twitham at quiknet.com>, http://www.quiknet.com/~twitham/