Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Thomas Koenig <tkoenig@netcologne.de>
Newsgroups: comp.arch
Subject: Re: Continuations
Date: Thu, 18 Jul 2024 07:54:23 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <v7ahnf$2an0d$1@dont-email.me>
References: <v6tbki$3g9rg$1@dont-email.me>
<47689j5gbdg2runh3t7oq2thodmfkalno6@4ax.com> <v71vqu$gomv$9@dont-email.me>
<116d9j5651mtjmq4bkjaheuf0pgpu6p0m8@4ax.com>
<f8c6c5b5863ecfc1ad45bb415f0d2b49@www.novabbs.org>
<7u7e9j5dthm94vb2vdsugngjf1cafhu2i4@4ax.com>
<0f7b4deb1761f4c485d1dc3b21eb7cb3@www.novabbs.org>
<v78soj$1tn73$1@dont-email.me>
<4bbc6af7baab612635eef0de4847ba5b@www.novabbs.org>
<v792kn$1v70t$1@dont-email.me>
<ef12aa647464a3ebe3bd208c13a3c40c@www.novabbs.org>
<v79b56$20oq8$1@dont-email.me>
Injection-Date: Thu, 18 Jul 2024 09:54:23 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="dcaf4e807253839291fe21870f3c64fa";
logging-data="2448397"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Urell+vGAYO8rETdyRhhHDDHzjxqyNJw="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:0pW9OtaMmshxhaX3H9/LEdEflfg=
Bytes: 2788
Stephen Fuld <SFuld@alumni.cmu.edu.invalid> wrote:
[Arrhenius]
> Good, I get that. But Thomas' original discussion of the problem
> indicated that it was very parallel, so the question is, in your
> design, how many of those calculations can go in in parallel?
I ran a little Arrhenius benchmark on an i7-11700. Main program was
  program main
    implicit none
    integer, parameter :: n = 1024
    double precision, dimension(n) :: k, a, ea, t
    integer :: i
    call random_number(a)
    call random_number(ea)
    ea = 10000 + ea*30000
    call random_number(t)
    t = 400 + 200*t
    do i = 1, 1024*1024
      call arrhenius(k, a, ea, t, n)
    end do
  end program main
and the called routine was (in a separate file, so the compiler
could not notice that the results were actually never used)
  subroutine arrhenius(k, a, ea, t, n)
    implicit none
    integer, intent(in) :: n
    double precision, dimension(n), intent(out) :: k
    double precision, dimension(n), intent(in) :: a, ea, t
    double precision, parameter :: r = 8.314
    k = a * exp(-ea/(r*t))
  end subroutine arrhenius
Timing result (wall-clock time only):
-O0: 5.343s
-O2: 4.560s
-Ofast: 2.237s
-Ofast -march=native -mtune=native: 2.154s
Of course, you never know what speed your CPU is actually running
at these days, but if I assume 5GHz, that would give around 10
cycles per Arrhenius evaluation, which is quite fast (IMHO).
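For reference, the arithmetic behind that estimate can be written out as below. This is only a sketch: it uses the fastest timing above (2.154 s) and the assumed 5 GHz clock, which was not measured, and the symbol N is introduced here just for the total number of evaluations (1024 array elements times 1024*1024 calls).

```latex
N = 1024 \times (1024 \times 1024) = 2^{30} \approx 1.07 \times 10^{9}
\qquad
\frac{2.154\,\mathrm{s} \times 5 \times 10^{9}\,\mathrm{cycles/s}}{2^{30}}
  \approx \frac{1.08 \times 10^{10}}{1.07 \times 10^{9}}
  \approx 10\ \text{cycles per evaluation}
```

If the core actually ran slower than 5 GHz, the cycle count per evaluation would be proportionally lower.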
It uses an AVX2 version of exp, or so I gather from the function
name, _ZGVdN4v_exp_avx2 .