Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Thomas Koenig <tkoenig@netcologne.de>
Newsgroups: comp.arch
Subject: Re: Continuations
Date: Thu, 18 Jul 2024 07:54:23 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <v7ahnf$2an0d$1@dont-email.me>
References: <v6tbki$3g9rg$1@dont-email.me> <47689j5gbdg2runh3t7oq2thodmfkalno6@4ax.com>
 <v71vqu$gomv$9@dont-email.me> <116d9j5651mtjmq4bkjaheuf0pgpu6p0m8@4ax.com>
 <f8c6c5b5863ecfc1ad45bb415f0d2b49@www.novabbs.org> <7u7e9j5dthm94vb2vdsugngjf1cafhu2i4@4ax.com>
 <0f7b4deb1761f4c485d1dc3b21eb7cb3@www.novabbs.org> <v78soj$1tn73$1@dont-email.me>
 <4bbc6af7baab612635eef0de4847ba5b@www.novabbs.org> <v792kn$1v70t$1@dont-email.me>
 <ef12aa647464a3ebe3bd208c13a3c40c@www.novabbs.org> <v79b56$20oq8$1@dont-email.me>
Injection-Date: Thu, 18 Jul 2024 09:54:23 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="dcaf4e807253839291fe21870f3c64fa";
 logging-data="2448397"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX18Urell+vGAYO8rETdyRhhHDDHzjxqyNJw="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:0pW9OtaMmshxhaX3H9/LEdEflfg=
Bytes: 2788

Stephen Fuld <SFuld@alumni.cmu.edu.invalid> wrote:

[Arrhenius]

> Good, I get that. But Thomas' original discussion of the problem
> indicated that it was very parallel, so the question is, in your
> design, how many of those calculations can go in in parallel?

I ran a little Arrhenius benchmark on an i7-11700.
Main program was

    program main
      implicit none
      integer, parameter :: n = 1024
      double precision, dimension(n) :: k, a, ea, t
      integer :: i

      call random_number(a)
      call random_number(ea)
      ea = 10000+ea*30000
      call random_number(t)
      t = 400 + 200*t
      do i=1,1024*1024
         call arrhenius(k,a,ea,t,n)
      end do
    end program main

and the called routine was (in a separate file, so the compiler
could not notice that the results were actually never used)

    subroutine arrhenius(k, a, ea, t, n)
      implicit none
      integer, intent(in) :: n
      double precision, dimension(n), intent(out) :: k
      double precision, dimension(n), intent(in) :: a, ea, t
      double precision, parameter :: r = 8.314
      k = a * exp(-ea/(r*t))
    end subroutine arrhenius

Timing result (wall-clock time only):

-O0: 5.343s
-O2: 4.560s
-Ofast: 2.237s
-Ofast -march=native -mtune=native: 2.154s

Of course, you never know what speed your CPU is actually running
at these days, but if I assume 5GHz, that would give around 10
cycles per Arrhenius evaluation, which is quite fast (IMHO).

It uses an AVX2 version of exp, or so I gather from the function
name, _ZGVdN4v_exp_avx2 .
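
For anyone who wants to check the cycle estimate, here is a minimal sketch of the arithmetic. The array length, the call count and the 2.154 s wall-clock time are taken from the benchmark above. The 5 GHz clock is only the assumption stated in the post, not a measured frequency, and the names cycle_estimate, cycles_per_eval, calls, wall_time and clock_hz are made up for this illustration.

```fortran
program cycle_estimate
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Values taken from the benchmark in the post
  integer, parameter :: n = 1024                 ! array elements per call
  integer, parameter :: calls = 1024*1024        ! iterations of the do loop
  real(dp), parameter :: wall_time = 2.154_dp    ! seconds, fastest run

  ! Assumption: the CPU ran at 5 GHz for the whole run (not measured)
  real(dp), parameter :: clock_hz = 5.0e9_dp

  real(dp) :: evals

  ! Total number of Arrhenius evaluations, computed in double precision
  evals = real(n, dp) * real(calls, dp)

  print '(a, f6.2)', 'cycles per evaluation: ', &
       cycles_per_eval(wall_time, clock_hz, evals)

contains

  ! Elapsed cycles (seconds * Hz) divided by the number of evaluations
  pure function cycles_per_eval(seconds, hz, nevals) result(cycles)
    real(dp), intent(in) :: seconds, hz, nevals
    real(dp) :: cycles
    cycles = seconds * hz / nevals
  end function cycles_per_eval

end program cycle_estimate
```

With 2^30 evaluations in 2.154 s, this comes out at roughly 10 cycles per evaluation. The result scales linearly with the assumed clock, so a lower sustained frequency would give a proportionally smaller cycle count.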