From: Tim Rentsch
Newsgroups: comp.lang.c++
Subject: Re: constexpr is really very smart!
Date: Thu, 19 Dec 2024 00:57:20 -0800
Message-ID: <86v7vgf3tb.fsf@linuxsc.com>
References: <20241216112808.00003f74@yahoo.com> <86jzbyghdw.fsf@linuxsc.com> <20241218013342.0000518a@yahoo.com> <867c7whol9.fsf@linuxsc.com> <20241218220006.00003f8e@yahoo.com>

Michael S writes:

> On Wed, 18 Dec 2024 09:45:38 -0800
> Tim Rentsch wrote:
>
>> Michael S writes:
>>
>>> On Tue, 17 Dec 2024 12:54:19 -0800
>>> Tim Rentsch wrote:
>>>
>>>> Michael S writes:
>>
>> [...]
>>
>>>>> #include
>>>>> #include
>>>>> #include
>>>>>
>>>>> static long long fib(long n)
>>>>> {
>>>>>   if (n <= 0)
>>>>>     return 0;
>>>>>   long long f0 = 0, f1 = 1;
>>>>>   for (long i = 1; i < n; ++i) {
>>>>>     long long f2 = f0 + f1;
>>>>>     f0 = f1;
>>>>>     f1 = f2;
>>>>>   }
>>>>>   return f1;
>>>>> }
>>>>
>>>> Here is my second fastest fibonacci calculation code (for
>>>> relatively small inputs):
>>>>
>>>>   typedef long unsigned long ULL;
>>>>
>>>>   ULL
>>>>   fibonacci( unsigned n ){
>>>>     ULL b = n&1, a = b^1;
>>>>
>>>>     if( n & 2 ) a += b, b += a;
>>>>     if( n & 4 ) a += b, b += a, a += b, b += a;
>>>>     if( n & 8 ){
>>>>       ULL na = 13*a+21*b, nb = 21*a+34*b;
>>>>       a = na, b = nb;
>>>>     }
>>>>
>>>>     n >>= 4;
>>>>     while( n-- ){
>>>>       ULL na = 610*a + 987*b, nb = 987*a + 1597*b;
>>>>       a = na, b = nb;
>>>>     }
>>>>
>>>>     return b;
>>>>   }
>>>
>>> From BigO perspective this code looks like it's still O(n) so no
>>> better than simple loop.
>>
>> It is O(n) but it has a smaller constant.
>>
>>> According to my measurement gear, in range 0 to 92 there are few
>>> points where it is faster than simple loop, but in majority of
>>> cases it is slower.
>>
>> I'm at a loss to understand how this could happen.  In my own
>> measurements, the code shown above runs faster than a simple loop
>> in all cases above n > 12, more than twice as fast when n > 17,
>> more than three times as fast when n > 42, and going up from
>> there.  What might account for these radically different results?
>
> May be, number of repetitions?
> I run the test only once.  That gives relative advantage to smaller
> code which is less sensitive to cold ICache and to cold branch
> predictors.

That's an interesting idea.  Can you also run a measurement where
the code is run inside loops?  I think it would be instructive to
compare the results under the two approaches.  (The server I use
seems not to have the necessary support to measure very fast code
segments that are run only once.)

> Also, because I am running it only once, my test is not so good in
> seeing differences between various "good" algorithms.
> But I feel that running test once is more realistic.

I think an argument could be made that repeatedly running the code
in a loop gives a more relevant result.
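As a rough illustration of that kind of looped measurement, a minimal
sketch might look like the following.  Only fibonacci() is taken from
the post above; the repetition count, the 0..92 input range, and the
volatile sink are my own choices, not anything from the thread.

  // Sketch only: times fibonacci() over many repetitions so the
  // code runs "hot" (warm I-cache and warm branch predictors).
  #include <chrono>
  #include <cstdio>

  typedef long unsigned long ULL;
  ULL fibonacci( unsigned n );          // the function shown above

  int main( void ){
    volatile ULL sink = 0;              // keeps calls from being optimized away
    const int reps = 1000000;

    auto t0 = std::chrono::steady_clock::now();
    for( int i = 0;  i < reps;  i++ )
      for( unsigned n = 0;  n < 93;  n++ )
        sink = fibonacci( n );
    auto t1 = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::nano> dt = t1 - t0;
    std::printf( "%.1f ns per call (hot)\n", dt.count() / (reps * 93.0) );
    return 0;
  }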
Any short code segment that is run only a handful of times will never have a significant impact on overall program performance. It's better to optimize under the assumption that the code will be "hot": if it is hot then the optimization will be worthwhile, and if it isn't hot then the loss will be way down in the noise and not worth worrying about.
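For reference, the constants 13, 21, 34, 610, 987, and 1597 in the
unrolled fibonacci above are themselves Fibonacci numbers: each pass
of the while loop applies the identity
F(j+16) = F(15)*F(j) + F(16)*F(j+1) to advance the pair (a, b) by
sixteen positions at once, which is where the smaller constant
factor comes from.  Here is a quick consistency check of that
stride-16 step against sixteen iterations of the simple recurrence
(my own sketch, not code from the thread):

  // Sketch only: verifies that the 610/987/1597 update advances a
  // Fibonacci pair (a, b) = (F(j), F(j+1)) by sixteen positions.
  #include <cassert>
  #include <cstdio>

  typedef long unsigned long ULL;

  int main( void ){
    ULL a = 0, b = 1;                      // (F(0), F(1))
    for( int j = 0;  j + 17 < 93;  j++ ){  // stay within 64 bits
      ULL na = 610*a + 987*b;              // should be F(j+16)
      ULL nb = 987*a + 1597*b;             // should be F(j+17)

      ULL x = a, y = b;                    // advance (a, b) sixteen
      for( int k = 0;  k < 16;  k++ ){     // steps the slow way
        ULL t = x + y;  x = y;  y = t;
      }
      assert( na == x && nb == y );

      ULL t = a + b;  a = b;  b = t;       // next j
    }
    std::printf( "stride-16 step agrees with the simple recurrence\n" );
    return 0;
  }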