Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Michael S Newsgroups: comp.lang.c Subject: Re: filling area by color atack safety - worst memory size Date: Wed, 17 Apr 2024 00:47:22 +0300 Organization: A noiseless patient Spider Lines: 1149 Message-ID: <20240417004609.000010aa@yahoo.com> References: <86h6h3nvyz.fsf@linuxsc.com> <865xxiok09.fsf@linuxsc.com> <20240319131842.00002138@yahoo.com> <86o7b9ms7d.fsf@linuxsc.com> <20240320115416.00001ab5@yahoo.com> <86zfusltwp.fsf@linuxsc.com> <20240324193352.000062e9@yahoo.com> <86jzlrk0f6.fsf@linuxsc.com> <20240405173033.00006145@yahoo.com> <868r1k1uq8.fsf@linuxsc.com> <20240411152033.00007173@yahoo.com> <86bk6ez9te.fsf@linuxsc.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Injection-Date: Tue, 16 Apr 2024 23:47:29 +0200 (CEST) Injection-Info: dont-email.me; posting-host="b6e7ddab6ceaee868731fda70ad0496e"; logging-data="1235687"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+fR8cCNJyR1hw75DkpGwbgWpChdR4vrSI=" Cancel-Lock: sha1:E/oEajEJ5CT4fADpVDd+6O7JCPk= X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32) Bytes: 67647 On Fri, 12 Apr 2024 11:59:25 -0700 Tim Rentsch wrote: > Michael S writes: > > > On Wed, 10 Apr 2024 19:47:11 -0700 > > Tim Rentsch wrote: > > > >> Stack-based methods tend to do well on long skinny patterns and > >> tend to do not as well on fatter patterns such as circles or > >> squares. The fractal pattern is ideal for a stack-based method. > >> Conversely, patterns that are mostly solid shapes don't fare as > >> well under stack-based methods, at least not the ones that have > >> been posted in this thread, and also they tend to use more memory > >> in those cases. > > > > Indeed, with solid shapes it uses more memory. But at least in my > > tests on my hardware with this sort of shapes it is easily faster > > than anything else. The difference vs the best of the rest is > > especially big at 4K images on AMD Zen3 based hardware, but even > > on Intel Skylake which generally serves as equalizer between > > different algorithms, the speed advantage of 2x2 stack is > > significant. > > I'm curious to know how your 2x2 algorithm compares to my > second (longer) stack-based algorithm when run on the Zen3. > On my test hardware they are roughly comparable, depending > on size and pattern. My curiosity includes the fatter > patterns as well as the long skinny ones. Finally found the time for speed measurements. I tested four algorithms: 1. stack_2x2 - stack-like processing where each element is 2x2 rectangle with Lot's wife amendment. 2. stack_timr1 - first variant of stack by Tim Rentsch 3. stack_timr2 - second variant of stack by Tim Rentsch 4. queue_timr - "take no prisoners" queue by Tim Rentsch, the one with power-of-two circular buffer, (x,y) packed to 32 bits and inner loop optimized for solid shapes. Tests were run on four CPUs 1. IVB - Intel Core i7-3570 at 3700 MHz. As far as CPUs are going, rather old thing. 2. HSW - Intel Xeon E3-1271 v3 at 4000 MHz. Only couple of years younger than above. 3. SKC - Intel Xeon E-2176G at 4250 MHz. Significantly younger, but microarchitecture exists since 2015. 4. ZN3 - AMD EPYC 7543P at 3700 MHz. The only one on my roaster whose microarchitecture can be considered relatively modern. As you can see, with exception of the oldest CPU, your 2d stack variant is not an improvement over the first. What surprised me after I put all results together, was a poor showing of SKC. I can't remember any other of my microbenchmarks (and I do plenty) were this CPU was so decisively beaten by its older cousin. The columns are as following: 1. Shape name 2. Starting point (x,y) 3. Number of points to recolor 4. total test duration, seconds 5. time per pixel - normalized to image area, nsec 6. time per pixel - normalized to number of points to recolor, nsec IVB,stack_2x2: [25 x 19] * 421054 Solid square ( 12, 9) 475 0.547 2.73 2.73 Solid square ( 0, 0) 475 0.537 2.68 2.68 standing snake-like shape ( 0, 0) 259 0.522 2.61 4.79 prostrate snake-like shape ( 0, 0) 259 0.528 2.64 4.84 slalom shape ( 0, 0) 233 0.459 2.29 4.68 slalom shape(rotated) ( 0, 0) 223 0.455 2.27 4.85 cross-in-cross ( 0, 0) 403 0.515 2.57 3.04 fractal tree ( 12, 0) 247 0.469 2.34 4.51 greed(2) ( 0, 0) 367 0.558 2.79 3.61 greed(3) ( 0, 0) 283 0.463 2.31 3.89 greed(4) ( 0, 0) 223 0.399 1.99 4.25 donut ( 23, 9) 238 0.305 1.52 3.04 [200 x 200] * 5002 Solid square ( 100, 100) 40000 0.461 2.30 2.30 Solid square ( 0, 0) 40000 0.460 2.30 2.30 standing snake-like shape ( 0, 0) 20100 0.382 1.91 3.80 prostrate snake-like shape ( 0, 0) 20100 0.474 2.37 4.71 slalom shape ( 0, 0) 19802 0.435 2.17 4.39 slalom shape(rotated) ( 0, 0) 19802 0.450 2.25 4.54 cross-in-cross ( 0, 0) 39216 0.470 2.35 2.40 fractal tree ( 99, 0) 18674 0.432 2.16 4.62 greed(2) ( 0, 0) 30000 0.458 2.29 3.05 greed(3) ( 0, 0) 22311 0.413 2.06 3.70 greed(4) ( 0, 0) 17500 0.348 1.74 3.98 donut ( 199, 100) 25830 0.315 1.57 2.44 [1280 x 720] * 218 Solid square ( 640, 360) 921600 0.450 2.24 2.24 Solid square ( 0, 0) 921600 0.450 2.24 2.24 standing snake-like shape ( 0, 0) 461160 0.371 1.85 3.69 prostrate snake-like shape ( 0, 0) 461440 0.469 2.33 4.66 slalom shape ( 0, 0) 460082 0.437 2.18 4.36 slalom shape(rotated) ( 0, 0) 460800 0.448 2.23 4.46 cross-in-cross ( 0, 0) 917616 0.452 2.25 2.26 fractal tree ( 639, 0) 445860 0.460 2.29 4.73 greed(2) ( 0, 0) 691200 0.468 2.33 3.11 greed(3) ( 0, 0) 512160 0.406 2.02 3.64 greed(4) ( 0, 0) 403200 0.344 1.71 3.91 donut (1279, 360) 655856 0.326 1.62 2.28 [1920 x 1080] * 98 Solid square ( 960, 540) 2073600 0.453 2.23 2.23 Solid square ( 0, 0) 2073600 0.457 2.25 2.25 standing snake-like shape ( 0, 0) 1037340 0.374 1.84 3.68 prostrate snake-like shape ( 0, 0) 1037760 0.474 2.33 4.66 slalom shape ( 0, 0) 1036800 0.443 2.18 4.36 slalom shape(rotated) ( 0, 0) 1036800 0.452 2.22 4.45 cross-in-cross ( 0, 0) 2067616 0.457 2.25 2.26 fractal tree ( 959, 0) 1034612 0.453 2.23 4.47 greed(2) ( 0, 0) 1555200 0.450 2.21 2.95 greed(3) ( 0, 0) 1152000 0.407 2.00 3.61 greed(4) ( 0, 0) 907200 0.346 1.70 3.89 donut (1919, 540) 1477788 0.326 1.60 2.25 [3840 x 2160] * 26 Solid square (1920,1080) 8294400 0.500 2.32 2.32 Solid square ( 0, 0) 8294400 0.539 2.50 2.50 standing snake-like shape ( 0, 0) 4148280 0.449 2.08 4.16 prostrate snake-like shape ( 0, 0) 4149120 0.746 3.46 6.92 slalom shape ( 0, 0) 4147200 0.703 3.26 6.52 slalom shape(rotated) ( 0, 0) 4147200 0.537 2.49 4.98 cross-in-cross ( 0, 0) 8282416 0.518 2.40 2.41 fractal tree (1919, 0) 4135652 0.514 2.38 4.78 greed(2) ( 0, 0) 6220800 0.533 2.47 3.30 greed(3) ( 0, 0) 4608000 0.468 2.17 3.91 greed(4) ( 0, 0) 3628800 0.386 1.79 4.09 donut (3839,1080) 5919706 0.356 1.65 2.31 IVB,stack_timr1: [25 x 19] * 421054 Solid square ( 12, 9) 475 1.132 5.66 5.66 Solid square ( 0, 0) 475 1.171 5.85 5.85 standing snake-like shape ( 0, 0) 259 0.724 3.62 6.64 prostrate snake-like shape ( 0, 0) 259 0.712 3.56 6.53 slalom shape ( 0, 0) 233 0.632 3.16 6.44 slalom shape(rotated) ( 0, 0) 223 0.632 3.16 6.73 cross-in-cross ( 0, 0) 403 0.931 4.65 5.49 fractal tree ( 12, 0) 247 0.537 2.68 5.16 greed(2) ( 0, 0) 367 0.866 4.33 5.60 greed(3) ( 0, 0) 283 0.724 3.62 6.08 greed(4) ( 0, 0) 223 0.618 3.09 6.58 donut ( 23, 9) 238 0.632 3.16 6.31 [200 x 200] * 5002 Solid square ( 100, 100) 40000 0.764 3.82 3.82 Solid square ( 0, 0) 40000 0.759 3.79 3.79 standing snake-like shape ( 0, 0) 20100 0.389 1.94 3.87 prostrate snake-like shape ( 0, 0) 20100 0.400 2.00 3.98 slalom shape ( 0, 0) 19802 0.388 1.94 3.92 slalom shape(rotated) ( 0, 0) 19802 0.388 1.94 3.92 cross-in-cross ( 0, 0) 39216 0.763 3.81 3.89 fractal tree ( 99, 0) 18674 0.372 1.86 3.98 greed(2) ( 0, 0) 30000 0.591 2.95 3.94 greed(3) ( 0, 0) 22311 0.445 2.22 3.99 greed(4) ( 0, 0) 17500 0.345 1.72 3.94 donut ( 199, 100) 25830 0.517 2.58 4.00 [1280 x 720] * 218 Solid square ( 640, 360) 921600 0.793 3.95 3.95 Solid square ( 0, 0) 921600 0.868 4.32 4.32 standing snake-like shape ( 0, 0) 461160 0.420 2.09 4.18 prostrate snake-like shape ( 0, 0) 461440 0.441 2.20 4.38 slalom shape ( 0, 0) 460082 0.434 2.16 4.33 slalom shape(rotated) ( 0, 0) 460800 0.429 2.14 4.27 cross-in-cross ( 0, 0) 917616 0.801 3.99 4.00 ========== REMAINDER OF ARTICLE TRUNCATED ==========