switching between different databases - in 1.3.0.rc1

Matthias Andree matthias.andree at gmx.de
Sat Jun 7 16:50:54 CEST 2025


On 29.05.25 at 20:22, Rob McEwen via bogofilter wrote:
> From "Matthias Andree" <matthias.andree at gmx.de>
>> Should you decide to do anything of profiling/performance metrics and 
>> you identify hot spots or I/O slowdowns somewhere, please share your 
>> findings.
>
> Matthias,
>
> So as I was starting to do some performance testing, I noticed a few 
> interesting things - and hopefully you'll deem some of my resulting 
> suggestions worthy of being acted upon ...perhaps making it into 
> RC2?
>
> One of the things that I've noticed is that RC1 extracts MANY MORE 
> types of tokens than 1.2.5 did - especially with certain 
> types of emails - and that is overall EXCELLENT - but it does come 
> with caveats/concerns. So I think this can potentially cause 
> performance issues? But I'm STILL very glad for this additional and 
> helpful data - so please don't remove or reverse this additional data 
> - I just think there are some helpful workarounds that might in 
> SOME cases help mitigate that if/when such issues occur (or this also 
> might lead to some performance optimizations too?)
>
Rob,

could you provide me with a sample message (feel free to mail me directly, 
possibly as a compressed attachment if you deem it sensitive) where you 
see that 1.3.0.rc1 creates massively more tokens than 1.2.5 did?


I don't see a massive change in the number of tokens (maybe 1%) or in 
run-time (maybe 15% less time needed), but it seems that one of the fixes 
I made may have an undesirable side effect, namely 
https://gitlab.com/bogofilter/bogofilter/-/commit/71bf5854db4379f48adc0920ae93fced8af87f17 
- that will need fixing because it misinterprets body tokens as headers.
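
For anyone who wants to quantify such a difference on their own messages, 
here is a minimal sketch. It assumes the tokens each version extracts from 
the same message have already been dumped to two plain-text files, one 
token per line. The file names and the helper names (read_tokens, 
token_diff) are hypothetical and not part of bogofilter.

```python
from collections import Counter


def read_tokens(path):
    """Read a token dump (assumed format: one token per line), skipping blank lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.strip() for line in fh if line.strip()]


def token_diff(old_tokens, new_tokens):
    """Return (percent change in total token count,
    tokens seen only in the new dump, tokens seen only in the old dump)."""
    if not old_tokens:
        raise ValueError("old token list is empty, percent change is undefined")
    old, new = Counter(old_tokens), Counter(new_tokens)
    old_total, new_total = sum(old.values()), sum(new.values())
    pct = 100.0 * (new_total - old_total) / old_total
    only_new = sorted(set(new) - set(old))
    only_old = sorted(set(old) - set(new))
    return pct, only_new, only_old


# Hypothetical usage with two dumps of the same message:
# pct, only_new, only_old = token_diff(read_tokens("tokens-1.2.5.txt"),
#                                      read_tokens("tokens-1.3.0.rc1.txt"))
# print(f"{pct:+.1f}% tokens; new-only: {len(only_new)}; old-only: {len(only_old)}")
```

The list of tokens that appear in only one dump is usually more telling 
than the raw percentage, since it shows which kinds of tokens a version 
adds or drops.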

Regards,
Matthias


