Deutsch English Français Italiano |
<20241219234549.00001902@yahoo.com> View for Bookmarking (what is this?) Look up another Usenet article |
Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Michael S <already5chosen@yahoo.com> Newsgroups: comp.lang.c++ Subject: Re: constexpr is really very smart! Date: Thu, 19 Dec 2024 23:45:49 +0200 Organization: A noiseless patient Spider Lines: 228 Message-ID: <20241219234549.00001902@yahoo.com> References: <vjndub$2glcu$1@paganini.bofh.team> <20241216112808.00003f74@yahoo.com> <86jzbyghdw.fsf@linuxsc.com> <20241218013342.0000518a@yahoo.com> <867c7whol9.fsf@linuxsc.com> <20241218220006.00003f8e@yahoo.com> <86v7vgf3tb.fsf@linuxsc.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Injection-Date: Thu, 19 Dec 2024 22:45:53 +0100 (CET) Injection-Info: dont-email.me; posting-host="a37a7afdc6a200c65b6cb4dd23255f26"; logging-data="3163558"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/mIJ4D+K52mox0lKlIPHEghqvFZlfZhBo=" Cancel-Lock: sha1:XZkGuNERLE0lYSCK/6YPgFvRFZQ= X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32) Bytes: 7563 On Thu, 19 Dec 2024 00:57:20 -0800 Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: > Michael S <already5chosen@yahoo.com> writes: > > > On Wed, 18 Dec 2024 09:45:38 -0800 > > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: > > > >> Michael S <already5chosen@yahoo.com> writes: > >> > >>> On Tue, 17 Dec 2024 12:54:19 -0800 > >>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: > >>> > >>>> Michael S <already5chosen@yahoo.com> writes: > >> > >> [...] > >> > >>>>> #include <stdio.h> > >>>>> #include <stdlib.h> > >>>>> #include <intrin.h> > >>>>> > >>>>> static long long fib(long n) > >>>>> { > >>>>> if (fib <= 0) > >>>>> return 0; > >>>>> long long f0 = 0, f1 = 1; > >>>>> for (long i = 1; i < n; ++i) { > >>>>> long long f2 = f0 + f1; > >>>>> f0 = f1; > >>>>> f1 = f2; > >>>>> } > >>>>> return f1; > >>>>> } > >>>> > >>>> Here is my second fastest fibonacci calculation code (for > >>>> relatively small inputs): > >>>> > >>>> typedef long unsigned long ULL; > >>>> > >>>> ULL > >>>> fibonacci( unsigned n ){ > >>>> ULL b = n&1, a = b^1; > >>>> > >>>> if( n & 2 ) a += b, b += a; > >>>> if( n & 4 ) a += b, b += a, a += b, b += a; > >>>> if( n & 8 ){ > >>>> ULL na = 13*a+21*b, nb = 21*a+34*b; > >>>> a = na, b = nb; > >>>> } > >>>> > >>>> n >>= 4; > >>>> while( n-- ){ > >>>> ULL na = 610*a + 987*b, nb = 987*a + 1597*b; > >>>> a = na, b = nb; > >>>> } > >>>> > >>>> return b; > >>>> } > >>> > >>> From BigO perspective this code looks like it's still O(n) so no > >>> better than simple loop. > >> > >> It is O(n) but it has a smaller constant. > >> > >>> According to my measurement gear, in range 0 to 92 there are few > >>> points where it is faster than simple loop, but in majority of > >>> cases it is slower. > >> > >> I'm at a loss to understand how this could happen. In my own > >> measurements, the code shown above runs faster than a simple loop > >> in all cases above n > 12, more than twice as fast when n > 17, > >> more than three times as fast when n > 42, and going up from > >> there. What might account for these radically different results? > > > > May be, number of repetitions? > > I run the test only once. That gives relative advantage to smaller > > code which is less sensitive to cold ICache and to cold branch > > predictors. > > That's an interesting idea. Can you also run a measurement where > the code is run inside loops? I think it would be instructive > to compare the results under the two approaches. > > (The server I use seems not to have the necessary support to > measure very fast code segments that are run only once.) > > > Also, because I am running it only once, my test is not so good in > > seeing differences between various "good" algorithms. > > But I feel that running test once is more realistic. > > I think an argument could be made that repeatedly running the code > in a loop gives a more relevant result. Any short code segment that > is run only a handful of times will never have a significant impact > on overall program performance. It's better to optimize under the > assumption that the code will be "hot": if it is hot then the > optimization will be worthwhile, and if it isn't hot then the loss > will be way down in the noise and not worth worrying about. I feel that running fib(n) with the same n in loop too unrealistic. So I decided to run, at least, with different values of n. Ended up spending about a hour just to build a test bench. The answer is - in a loop of more than dozen iterations your code is indeed faster. Esp. so for hundred or more iterations. Here is my test bench. #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <intrin.h> long long fib(long n); volatile long long dummy; int main(int argz, char** argv) { if (argz < 2) { printf("Go away!\n"); return 1; } char* endp; long n = strtol(argv[1], &endp, 0); if (endp == argv[1]) { printf("%s is not a number. Go away!\n", argv[1]); return 1; } if (argz == 2) { unsigned long long t0 = _rdtsc(); long long f = fib(n); unsigned long long t1 = _rdtsc(); printf("fib(%ld)=%lld. %lld clock cycles.\n", n, f, t1 - t0); return 0; } // argz > 2 if (argv[2][0]=='t') { long long f0 = 0, f1 = 1; for (long i = 0; i <= n; ++i) { long long res = fib(i); if (res != f0) { printf("Mistake at n = %ld. %lld instead of %lld\n" , i, res, f0); return 1; } long long f2 = f0 + f1; f0 = f1; f1 = f2; } printf("o.k.\n"); return 0; } long m = strtol(argv[2], &endp, 0); if (endp == argv[2]) { printf("%s is not a number. Go away!\n", argv[2]); return 1; } if (n > m) { long tmp = n; n = m; m = tmp; } unsigned long dn = m - n + 1; if (dn > 1000000) { printf( "Range [%ld..%ld] is wider than 1M. Unsupported. I am sorry.\n" , n, m); return 1; } size_t n_iter = 10; ========== REMAINDER OF ARTICLE TRUNCATED ==========