From: Tim Rentsch
Newsgroups: comp.lang.c++
Subject: Re: constexpr is really very smart!
Date: Thu, 19 Dec 2024 00:57:20 -0800
Message-ID: <86v7vgf3tb.fsf@linuxsc.com>
References: <20241216112808.00003f74@yahoo.com> <86jzbyghdw.fsf@linuxsc.com> <20241218013342.0000518a@yahoo.com> <867c7whol9.fsf@linuxsc.com> <20241218220006.00003f8e@yahoo.com>

Michael S writes:

> On Wed, 18 Dec 2024 09:45:38 -0800
> Tim Rentsch wrote:
>
>> Michael S writes:
>>
>>> On Tue, 17 Dec 2024 12:54:19 -0800
>>> Tim Rentsch wrote:
>>>
>>>> Michael S writes:
>>
>> [...]
>>
>>>>> #include
>>>>> #include
>>>>> #include
>>>>>
>>>>> static long long fib(long n)
>>>>> {
>>>>>   if (n <= 0)
>>>>>     return 0;
>>>>>   long long f0 = 0, f1 = 1;
>>>>>   for (long i = 1; i < n; ++i) {
>>>>>     long long f2 = f0 + f1;
>>>>>     f0 = f1;
>>>>>     f1 = f2;
>>>>>   }
>>>>>   return f1;
>>>>> }
>>>>
>>>> Here is my second fastest fibonacci calculation code (for
>>>> relatively small inputs):
>>>>
>>>>   typedef long unsigned long ULL;
>>>>
>>>>   ULL
>>>>   fibonacci( unsigned n ){
>>>>     ULL b = n&1, a = b^1;
>>>>
>>>>     if( n & 2 ) a += b, b += a;
>>>>     if( n & 4 ) a += b, b += a, a += b, b += a;
>>>>     if( n & 8 ){
>>>>       ULL na = 13*a+21*b, nb = 21*a+34*b;
>>>>       a = na, b = nb;
>>>>     }
>>>>
>>>>     n >>= 4;
>>>>     while( n-- ){
>>>>       ULL na = 610*a + 987*b, nb = 987*a + 1597*b;
>>>>       a = na, b = nb;
>>>>     }
>>>>
>>>>     return b;
>>>>   }
>>>
>>> From BigO perspective this code looks like it's still O(n) so no
>>> better than simple loop.
>>
>> It is O(n) but it has a smaller constant.
>>
>>> According to my measurement gear, in range 0 to 92 there are few
>>> points where it is faster than simple loop, but in majority of
>>> cases it is slower.
>>
>> I'm at a loss to understand how this could happen.  In my own
>> measurements, the code shown above runs faster than a simple loop
>> in all cases above n > 12, more than twice as fast when n > 17,
>> more than three times as fast when n > 42, and going up from
>> there.  What might account for these radically different results?
>
> May be, number of repetitions?
> I run the test only once.  That gives relative advantage to smaller
> code which is less sensitive to cold ICache and to cold branch
> predictors.

That's an interesting idea.  Can you also run a measurement where
the code is run inside loops?  I think it would be instructive to
compare the results under the two approaches.  (The server I use
seems not to have the necessary support to measure very fast code
segments that are run only once.)

> Also, because I am running it only once, my test is not so good in
> seeing differences between various "good" algorithms.
> But I feel that running test once is more realistic.

I think an argument could be made that repeatedly running the code
in a loop gives a more relevant result.
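As a rough illustration of that kind of looped measurement, a minimal
sketch might look like the following.  Only fibonacci() is taken from
the post above; the repetition count, the 0..92 input range, and the
volatile sink are my own choices, not anything from the thread.

  // Sketch only: times fibonacci() over many repetitions so the
  // code runs "hot" (warm I-cache and warm branch predictors).
  #include <chrono>
  #include <cstdio>

  typedef long unsigned long ULL;
  ULL fibonacci( unsigned n );          // the function shown above

  int main( void ){
    volatile ULL sink = 0;              // keeps calls from being optimized away
    const int reps = 1000000;

    auto t0 = std::chrono::steady_clock::now();
    for( int i = 0;  i < reps;  i++ )
      for( unsigned n = 0;  n < 93;  n++ )
        sink = fibonacci( n );
    auto t1 = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::nano> dt = t1 - t0;
    std::printf( "%.1f ns per call (hot)\n", dt.count() / (reps * 93.0) );
    return 0;
  }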
Any short code segment that is run only a handful of times will never have a significant impact on overall program performance. It's better to optimize under the assumption that the code will be "hot": if it is hot then the optimization will be worthwhile, and if it isn't hot then the loss will be way down in the noise and not worth worrying about.
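For reference, the constants 13, 21, 34, 610, 987, and 1597 in the
unrolled fibonacci above are themselves Fibonacci numbers: each pass
of the while loop applies the identity
F(j+16) = F(15)*F(j) + F(16)*F(j+1) to advance the pair (a, b) by
sixteen positions at once, which is where the smaller constant
factor comes from.  Here is a quick consistency check of that
stride-16 step against sixteen iterations of the simple recurrence
(my own sketch, not code from the thread):

  // Sketch only: verifies that the 610/987/1597 update advances a
  // Fibonacci pair (a, b) = (F(j), F(j+1)) by sixteen positions.
  #include <cassert>
  #include <cstdio>

  typedef long unsigned long ULL;

  int main( void ){
    ULL a = 0, b = 1;                      // (F(0), F(1))
    for( int j = 0;  j + 17 < 93;  j++ ){  // stay within 64 bits
      ULL na = 610*a + 987*b;              // should be F(j+16)
      ULL nb = 987*a + 1597*b;             // should be F(j+17)

      ULL x = a, y = b;                    // advance (a, b) sixteen
      for( int k = 0;  k < 16;  k++ ){     // steps the slow way
        ULL t = x + y;  x = y;  y = t;
      }
      assert( na == x && nb == y );

      ULL t = a + b;  a = b;  b = t;       // next j
    }
    std::printf( "stride-16 step agrees with the simple recurrence\n" );
    return 0;
  }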