Deutsch English Français Italiano |
<86bk1f5mi4.fsf@linuxsc.com> View for Bookmarking (what is this?) Look up another Usenet article |
Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Tim Rentsch <tr.17687@z991.linuxsc.com> Newsgroups: comp.lang.c++ Subject: Re: OT: Re: Sieve of Erastosthenes optimized to the max Date: Mon, 26 Aug 2024 08:31:31 -0700 Organization: A noiseless patient Spider Lines: 114 Message-ID: <86bk1f5mi4.fsf@linuxsc.com> References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <v25c87$1ld9m$1@dont-email.me> <86r0duwqgg.fsf@linuxsc.com> <v39o3l$1lvju$1@dont-email.me> <86o78mpnlf.fsf@linuxsc.com> <v3fv2u$2ursd$1@dont-email.me> <86ttibod7n.fsf@linuxsc.com> <86h6eaoi2r.fsf@linuxsc.com> <v4sool$1grge$1@dont-email.me> <867celixcw.fsf@linuxsc.com> <v5sg9s$mat4$1@dont-email.me> <86zfr0b9hw.fsf@linuxsc.com> <v638ud$25623$1@dont-email.me> <861q3o5do1.fsf@linuxsc.com> <v7tdts$29195$1@dont-email.me> <868qx45v5g.fsf@linuxsc.com> <v9lbns$11alj$1@dont-email.me> <86r0aojx1m.fsf@linuxsc.com> <v9o2kf$1gqv1$1@raubtier-asyl.eternal-september.org> <va0a8b$30cvv$2@dont-email.me> <va14rc$387ss$1@redfloyd.dont-email.me> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Date: Mon, 26 Aug 2024 17:31:35 +0200 (CEST) Injection-Info: dont-email.me; posting-host="d6baed66fa18aad174400126e3461865"; logging-data="2654084"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/6mycJO2W6/sQHk18Q60Ysu4W2Ntsx4H4=" User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux) Cancel-Lock: sha1:MvrM4uMPiz1APAuxKfaQ6NGqgPY= sha1:4f7xCYGQpB4SQt+MBMweNHkFSV0= Bytes: 4575 red floyd <no.spam.here@its.invalid> writes: > So I'm a little late, but here's my effort to use the modulo 30 > trick. > > Using g++ 12.4.0, Cygwin under Windows 11 22631, Ryzen 5 5600x, > 64GB RAM > > g++ -O3 -std=c++17 > 5761455 primess less than 100 million in 0.182269s > 50847534 primes less than 1 billion in 2.841167s > 455052511 primes less than 10billion in 53.009133s I appreciate both the effort and your posting of results. > [..program..] There are several ways that the program is doing noticeably more work than it needs to. I haven't done any measurements but I believe this extra work accounts for the biggest part of its relative slowness. Here is a short simple program to illustrate some ways that your program could be sped up. (The short simple program is the rest of this posting.) #include <stdio.h> #include <stdlib.h> typedef unsigned long long Index, Count, Bits, NN, N235; static const unsigned char mods[8] = { 1, 7, 11, 13, 17, 19, 23, 29, }; static unsigned char bytes[ 1ULL * 50 * 1000 * 1000 * 1000 ]; static void mod30_sieve( NN ); static NN product_NN( N235, N235 ); static _Bool is_marked_composite( N235 ); static void mark_composite( NN ); static Count unmarked_less_than( NN ); int main( int argc, char *argv[] ){ NN ceiling = argc > 1 ? strtoull( argv[1], 0, 10 ) : 1000000000; setbuf( stdout, 0 ); if( ceiling >= 30 * sizeof bytes ){ printf( " urk.. limit must be less than %zu\n", 30 * sizeof bytes ); exit( 0 ); } printf( " determining primes less than %llu ...\n", ceiling ); mod30_sieve( ceiling ); printf( " ... found %llu primes.\n", unmarked_less_than( ceiling ) ); return 0; } void mod30_sieve( NN ceiling ){ N235 i, j; NN ijNN; for( bytes[0] = 1, i = 0; product_NN( i, i ) < ceiling; i++ ){ if( is_marked_composite( i ) ) continue; for( j = i; ijNN = product_NN( i, j ), ijNN < ceiling; j++ ){ mark_composite( ijNN ); } } } NN product_NN( N235 x, N235 y ){ NN xnn = (x>>3)*30 + mods[x&7]; NN ynn = (y>>3)*30 + mods[y&7]; return xnn * ynn; } _Bool is_marked_composite( N235 t ){ return bytes[t>>3] & 1<<(t&7); } void mark_composite( NN nn ){ static const unsigned char masks[] = { [1]= 1, [7]= 2, [11]= 4, [13]= 8, [17]= 16, [19]= 32, [23]=64, [29]=128, }; bytes[ nn/30 ] |= masks[ nn%30 ]; } Count unmarked_less_than( NN ceiling ){ Index i = ceiling/30; NN extra = ceiling - i*30; Count r = (ceiling > 2) + (ceiling > 3) + (ceiling > 5); for( Index j = 0; j < 8 && mods[j] < extra; j++ ){ r += bytes[i]>>j &1 ^1; } while( i && i-- ){ for( Bits b = bytes[i] ^0xFF; b; b >>= 1 ) r += b&1; } return r; }