Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.arch
Subject: Re: Continuations
Date: Tue, 23 Jul 2024 16:26:21 +0300
Organization: A noiseless patient Spider
Lines: 61
Message-ID: <20240723162621.00005d95@yahoo.com>
References: <v6tbki$3g9rg$1@dont-email.me> <47689j5gbdg2runh3t7oq2thodmfkalno6@4ax.com> <v71vqu$gomv$9@dont-email.me> <116d9j5651mtjmq4bkjaheuf0pgpu6p0m8@4ax.com> <f8c6c5b5863ecfc1ad45bb415f0d2b49@www.novabbs.org> <7u7e9j5dthm94vb2vdsugngjf1cafhu2i4@4ax.com> <0f7b4deb1761f4c485d1dc3b21eb7cb3@www.novabbs.org> <v78soj$1tn73$1@dont-email.me> <v7dsf2$3139m$1@dont-email.me> <277c774f1eb48be79cd148dfc25c4367@www.novabbs.org> <v7ecrj$33vqv$1@dont-email.me> <20240722140115.000058cf@yahoo.com> <v7o828$16uk6$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Jul 2024 15:25:52 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="13ef830bf2b3f0c9f0d6691a9de09cf2"; logging-data="1250607"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1928Fo5hvk9eWh89hmCb8v1qex2RDtYU+4="
Cancel-Lock: sha1:6cYcG4aUvu7eJXY7LhLPJa/W92M=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
Bytes: 3790

On Tue, 23 Jul 2024 14:35:20 +0200
Terje Mathisen <terje.mathisen@tmsw.no> wrote:

> Michael S wrote:
> > On Fri, 19 Jul 2024 20:55:47 +0200
> > Terje Mathisen <terje.mathisen@tmsw.no> wrote:
> >
> >> MitchAlsup1 wrote:
> >>> On Fri, 19 Jul 2024 14:16:01 +0000, Terje Mathisen wrote:
> >>>> Back when I first looked at Invsqrt(), I did so because a
> >>>> Computational Fluid Chemistry researcher from Sweden asked for help
> >>>> speeding up his reciprocal calculations
> >>>> (sqrt(1/(dx^2+dy^2+dz^2))), I found that by combining the 1/x and
> >>>> the sqrt and doing three of them pipelined together (all the water
> >>>> molecules having three atoms), his weeklong simulation runs ran
> >>>> in half the time, on both PentiumPro and Alpha hardware.
> >>>
> >>> I, personally, have found many Newton-Raphson iterators that
> >>> converge faster using 1/SQRT(x) than using the SQRT(x)
> >>> equivalent.
> >>
> >> Yeah, that was eye-opening to me as well, to the level where I
> >> consider the invsqrt() NR iteration as a mainstay, it can be useful
> >> for both sqrt and 1/x as well. :-)
> >>
> >> Terje
> >>
> >
> > What is this "SQRT(x) equivalent" all of you are talking about?
> > I am not aware of any "direct" (i.e. not via RSQRT) NR-like method
> > for SQRT that consists only of multiplications and additions.
> > If it exists, I will be very interested to know.
>
> sqrt(x) <= x/sqrt(x) <= x*rsqrt(x)
>
> I.e. calculate rsqrt(x) to the precision you need and then do a
> single fmul?
>
> Terje
>

That much I know.
When precise rounding is not required, I even know a slightly better
"combined" method:
1. Do N-1 iterations for RSQRT, delivering r0 with 2 or 3 more
   significant bits than n/2
2. Calculate the SQRT estimate as y0 = x*r0
3. Do the last iteration using both y0 and r0 as
   y = y0 + (x-y0*y0)*0.5*r0.
That would give a max. error of something like 0.51 ULP.

A similar combined method could be useful for sw calculation of
correctly rounded quad-precision sqrt as well. In this case it serves as
a 'conditionally last step' rather than an 'absolutely last step'.

I was hoping that you would tell me about an NR formula for SQRT itself
that can be applied recursively. The only formula that I know is
y = (y*y + x)/(y*2). Obviously, when speed matters this formula is not
useful, since every iteration needs a division.
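
For readers who want to try the "combined" method, here is a minimal Python sketch of the three steps and of the division-based formula mentioned at the end. It is only an illustration under assumptions that are not in the post: the function names, the constant seed 0.7, the input range [1, 4] and the iteration count are all made up for the example. The RSQRT refinement uses the usual division-free Newton-Raphson step r = r*(1.5 - 0.5*x*r*r). Plain Python floats are used without FMA, so this does not attempt to reproduce the 0.51 ULP figure.

```python
import math


def rsqrt_nr(x, r, iterations):
    """Refine r ~ 1/sqrt(x) with division-free Newton-Raphson steps.

    Each step uses only multiplications and one subtraction.
    """
    for _ in range(iterations):
        r = r * (1.5 - 0.5 * x * r * r)
    return r


def sqrt_combined(x, r_seed, iterations):
    """Combined method: RSQRT iterations, then one final SQRT step.

    r_seed is a hypothetical starting guess for 1/sqrt(x); it must be
    close enough for the RSQRT iteration to converge.
    """
    r0 = rsqrt_nr(x, r_seed, iterations)  # step 1: iterate on RSQRT
    y0 = x * r0                           # step 2: SQRT estimate
    return y0 + (x - y0 * y0) * 0.5 * r0  # step 3: last iteration


def sqrt_heron_step(x, y):
    """One step of y = (y*y + x)/(y*2); note the division per step."""
    return (y * y + x) / (y * 2)


if __name__ == "__main__":
    # Assumed setup: x in [1, 4] with a crude constant seed of 0.7.
    for x in (1.0, 2.0, 3.0, 4.0):
        y = sqrt_combined(x, 0.7, 6)
        print(x, y, math.sqrt(x), abs(y - math.sqrt(x)))
```

The final step reuses r0 in place of 1/(2*y0), which is what keeps it free of division; the Heron-style step needs a divide every time, which is the cost the post objects to.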