Article <86r0ab3w2b.fsf@linuxsc.com>

Deutsch English Français Italiano
<86r0ab3w2b.fsf@linuxsc.com>

View for Bookmarking (what is this?)
Look up another Usenet article
Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Newsgroups: comp.lang.c++
Subject: Re: OT: Re: Sieve of Erastosthenes optimized to the max
Date: Mon, 26 Aug 2024 12:47:56 -0700
Organization: A noiseless patient Spider
Lines: 53
Message-ID: <86r0ab3w2b.fsf@linuxsc.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <utq11p$icmm$1@dont-email.me> <v25c87$1ld9m$1@dont-email.me> <86r0duwqgg.fsf@linuxsc.com> <v39o3l$1lvju$1@dont-email.me> <86o78mpnlf.fsf@linuxsc.com> <v3fv2u$2ursd$1@dont-email.me> <86ttibod7n.fsf@linuxsc.com> <86h6eaoi2r.fsf@linuxsc.com> <v4sool$1grge$1@dont-email.me> <867celixcw.fsf@linuxsc.com> <v5sg9s$mat4$1@dont-email.me> <86zfr0b9hw.fsf@linuxsc.com> <v638ud$25623$1@dont-email.me> <861q3o5do1.fsf@linuxsc.com> <v7tdts$29195$1@dont-email.me> <868qx45v5g.fsf@linuxsc.com> <v9lbns$11alj$1@dont-email.me> <86r0aojx1m.fsf@linuxsc.com> <v9o2kf$1gqv1$1@raubtier-asyl.eternal-september.org> <va88m0$ikl3$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Mon, 26 Aug 2024 21:47:59 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="d6baed66fa18aad174400126e3461865";
	logging-data="2731868"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18AkUmkTahQf/RJIsKIqEwMg0gWORBDtzI="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:xv+j2dcOmW7U/aAZuXyJQdaD65o=
	sha1:2r1NJuYgMaJ4X3MB2mP01Q1bB6Y=
Bytes: 3621

Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 16/08/2024 18:35, Bonita Montero wrote:
>
>> But basically I don't think it is a good idea to skip numbers
>> exept multiples of two. [...]
>
> I've been running some experiments.
>
> Skipping evens only is nice and simple on computation;  that's
> good.
>
> Skipping something else requires table lookups.  [...]

It doesn't have to.  One point of the code I've been posting is
that table lookups, including mask productions, are eliminated;
everything gets optimized away so what's left is all compile-time
constants.

> I'm happy with my algorithm:
> - Work out a bitmask for some small primes, only as far as is
> needed to get repeats (the repeat length is the product of the
> primes)
> - Work out bitmasks for the next half a dozen.  Each of those will
> have a repeat the length of the prime.
> - Combine those together into the output bitmap.  That requires
> fewer operations than doing each one into the output bitmap
> - Process the larger primes.  Do the output bitmap in blocks for
> cache friendliness

My experience:

* Special casing primes between 7 and 29 helps (and because it's
  the first step the results can initialize the map);

* Special purpose code where primes are taken in pairs, for the
  set of primes more than 29 and less than 360, with each pair
  being folded into the initialized map, helps (incidentally,
  that is 31 pairs, or 62 primes total);

* The combination of those two steps gives a saving of about 35%;

* Doing striping (the term I use for working in cache-sized
  blocks) seems not to help much, so I don't use striping except
  incidentally in the prime-pairs step;

* Forcing the function for the main second-level loop to be
  inlined saves another 15% or so;

* I haven't tried to measure how the speed-up tricks scale with
  overall number of primes to calculate.  My guess is that they
  are close to linear but I don't have any actual measurements.