"#define" vs "const int" for algorithm

Matthias Andree matthias.andree at gmx.de
Tue Nov 12 01:18:21 CET 2002


On Mon, 11 Nov 2002, David Relson wrote:

> Using "#define algorithm" allows the compiler's optimizer to exclude 
> unnecessary code.  The attached asm.tgz has files gra.c and rob.c which use 
> #define and "const int" declarations of algorithm.  Script 
> show.source.and.assembler will compile them and generate combined 
> source/assembler listings.  Look in the listings for "printf" related 
> code.  gra.c.txt only has code for one call to printf.  rob.c.txt has two 
> calls.

Do you recall your exact compiler command line?

Do you have a brown paper bag handy?  Did you compile with -O or -Os?
I'm using gcc 3.2 here, 6 bytes difference with -Os, 52 without -O... 2
are from the text length, another 4 store the enum code.

$ cat show
OPT="-Os -g -march=i486"
for i in rob gra ; do
  gcc $OPT -o $i $i.c
done
size rob gra
$ sh -x show
+ OPT=-Os -g -march=i486
+ gcc -Os -g -march=i486 -o rob rob.c
+ gcc -Os -g -march=i486 -o gra gra.c
+ size rob gra
   text    data     bss     dec     hex filename
    822     260      28    1110     456 rob
    816     260      28    1104     450 gra

Without -Os:

+ OPT=-g -march=i486
+ gcc -g -march=i486 -o rob rob.c
+ gcc -g -march=i486 -o gra gra.c
+ size rob gra
   text    data     bss     dec     hex filename
    903     260      28    1191     4a7 rob
    851     260      28    1139     473 gra

It becomes more striking to view this code when linked against dietlibc:

     1	
     2	rob:     file format elf32-i386
     3	
     4	Disassembly of section .text:
     5	
...
    20	08048090 <main>:
    21	 8048090:	55                   	push   %ebp
    22	 8048091:	89 e5                	mov    %esp,%ebp
    23	 8048093:	c7 45 08 48 81 04 08 	movl   $0x8048148,0x8(%ebp)
    24	 804809a:	5d                   	pop    %ebp
    25	 804809b:	e9 5a 00 00 00       	jmp    80480fa <puts>
    26	
    27	080480a0 <_exit>:
    28	 80480a0:	b0 01                	mov    $0x1,%al
...

(BTW, the "tail recursion elimination" global optimizer is also working
nicely as you can see here.)

Oh, wait a minute. Gcc 3.2 saw printf has an argument without format
options and optimized it to puts. Gcc 2.95.3, dietlibc, statically
linked again:

    24	080480a0 <main>:
    25	 80480a0:	55                   	push   %ebp
    26	 80480a1:	89 e5                	mov    %esp,%ebp
    27	 80480a3:	83 ec 08             	sub    $0x8,%esp
    28	 80480a6:	83 c4 f4             	add    $0xfffffff4,%esp
    29	 80480a9:	68 32 93 04 08       	push   $0x8049332
    30	 80480ae:	e8 49 00 00 00       	call   80480fc <printf>
    31	 80480b3:	89 ec                	mov    %ebp,%esp
    32	 80480b5:	5d                   	pop    %ebp
    33	 80480b6:	c3                   	ret    
    34	 80480b7:	89 f6                	mov    %esi,%esi
    35	 80480b9:	8d bc 27 00 00 00 00 	lea    0x0(%edi,1),%edi
    36	
    37	080480c0 <_exit>:
    38	 80480c0:	b0 01                	mov    $0x1,%al

> A single enum could be used.  Ideally with both AL_GRAHAM and AL_ROBINSON 
> defined and with a good optimizer, it isn't necessary to surround the "if 
> (GRAHAM) {}" conditional with #ifdef/#endif

We have good optimizers in gcc 2.95 and gcc 3.2. The relevant code is
not in a tight loop, so optimizing even 1k away is not going to buy us
anything. Remember:

   text    data     bss     dec     hex filename
 104684      32      80  104796   1995c lexer.o

Let's not fall prey to the premature optimization that Donald E. Knuth
warns us of.

> >BTW, can we hope to abstract the whole interface to a common API?
> 
> What interface/API are you thinking of?

Define a common set of registering/unregistering/evaluation functions
(similar to what you'd call a pure virtual class in C++) and then define
each function once for Graham and once for Robinson. Then compile Graham
and Robinson code from separate source files.

The next step would be to document registration/query/"spamicity
calculating" interfaces and expose that code in a library (along with
the lexer.)

-- 
Matthias Andree



More information about the bogofilter-dev mailing list