Article <2024Jul21.151156@mips.complang.tuwien.ac.at>

Path: ...!2.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Faster div or 1/sqrt approximations
Date: Sun, 21 Jul 2024 13:11:56 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 34
Message-ID: <2024Jul21.151156@mips.complang.tuwien.ac.at>
References: <v6tbki$3g9rg$1@dont-email.me> <116d9j5651mtjmq4bkjaheuf0pgpu6p0m8@4ax.com> <f8c6c5b5863ecfc1ad45bb415f0d2b49@www.novabbs.org> <7u7e9j5dthm94vb2vdsugngjf1cafhu2i4@4ax.com> <0f7b4deb1761f4c485d1dc3b21eb7cb3@www.novabbs.org> <v78soj$1tn73$1@dont-email.me> <v7dsf2$3139m$1@dont-email.me> <277c774f1eb48be79cd148dfc25c4367@www.novabbs.org> <v7ei4f$34uc2$1@dont-email.me> <v7gqgr$3kclj$1@dont-email.me> <v7grg1$3kf77$1@dont-email.me> <v7guln$3l7kj$1@dont-email.me> <v7h94s$3n21n$1@dont-email.me>
Injection-Date: Sun, 21 Jul 2024 15:17:50 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="8cf45a02fd465630b2f75371802cc5d5";
	logging-data="124337"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/8JP1g9MvNlaMnFw4kIE20"
Cancel-Lock: sha1:fuM1ngACs2DW8E09NDfOyGjen8M=
X-newsreader: xrn 10.11
Bytes: 2754

Thomas Koenig <tkoenig@netcologne.de> writes:
>Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
>> However, if you are willing to take that first NR iteration
>>
>>   halfnumber = 0.5f*x
>>   ...
>>   i = R-(i>>1);
>>   ...
>>   x = x*(1.5f-halfnumber*x*x);
>>
>> and then make both the 0.5f and 1.5f constants free variables, you can 
>> in fact get 1.5 more bits than what they show in this paper.
....
>Looks like https://web.archive.org/web/20180709021629/http://rrrola.wz.cz/inv_sqrt.html
>who reports 6.50196699E−4 as the maximum error (also from the
>Wikipedia article).
>
>That's 10.5 bits of accuracy, not bad at all.
>
>However... assume you want to do another NR step.  In that case,
>you might be better off not loading different constants from memory,
>so having the same constants might actually be an advantage
>(which does not mean that they have to be the original Newton steps).

The number of accurate bits roughly doubles with each NR step, so
starting from the better first "NR" iteration (about 1.5 bits more)
would gain about 3 additional bits after a second, standard step.
And if you optimize new constants for the second iteration, you may
get even more.  Or you could optimize for two iterations using the
same constants ...
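
For anyone who wants to try this, here is a rough C sketch of a test
harness, not a tuned implementation.  A plain Newton step turns a
relative error e into about -1.5*e*e, which is where the doubling of
accurate bits comes from.  The function names are made up for this
example, and the magic constant R is left as a parameter (the widely
quoted 0x5f3759df is an assumption, it does not appear in this thread).
The two multipliers c0 and c1 are the "free variables" Terje mentions;
c0 = 0.5f and c1 = 1.5f give the ordinary Newton step.

```c
#include <stdint.h>
#include <string.h>

/* Initial guess from the integer trick i = R - (i >> 1).
   memcpy is used for the float/integer reinterpretation. */
static float rsqrt_guess(float x, uint32_t R)
{
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = R - (i >> 1);
    memcpy(&x, &i, sizeof x);
    return x;
}

/* One step y = y*(c1 - (c0*x)*y*y).
   c0 = 0.5f, c1 = 1.5f is plain Newton-Raphson for 1/sqrt(x). */
static float rsqrt_step(float x, float y, float c0, float c1)
{
    float scaled = c0 * x;          /* "halfnumber" when c0 == 0.5f */
    return y * (c1 - scaled * y * y);
}

/* Maximum relative error over x in [1,4) after `steps` steps,
   using the same (c0, c1) in every step.
   x is sampled as s*s with s = 1 + k/2048, so x is exact in float
   and the true result 1/s is known without calling sqrt(). */
static double rsqrt_max_rel_err(uint32_t R, float c0, float c1, int steps)
{
    double worst = 0.0;
    for (int k = 0; k < 2048; k++) {
        float s = 1.0f + (float)k / 2048.0f;  /* s in [1,2) */
        float x = s * s;                      /* x in [1,4), exact */
        float y = rsqrt_guess(x, R);
        for (int n = 0; n < steps; n++)
            y = rsqrt_step(x, y, c0, c1);
        double err = (double)y * (double)s - 1.0;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Calling rsqrt_max_rel_err(0x5f3759df, 0.5f, 1.5f, 1) and again with
steps = 2 shows the effect of the second step with unchanged constants.
To compare against tuned constants, pass a different R, c0 and c1.  As
far as I recall, the page Thomas links uses R = 0x5F1FFFF9 with
0.703952253f * y * (2.38924456f - x*y*y), which in this form is
c0 = 0.703952253f and c1 = 0.703952253f * 2.38924456f; treat those
values as unverified here.  The range [1,4) covers one even and one
odd exponent, which should be enough because the error pattern repeats
for every factor of 4 in x.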

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>