From: Thomas Koenig <tkoenig@netcologne.de>
Newsgroups: comp.arch
Subject: Re: Faster div or 1/sqrt approximations
Date: Sat, 20 Jul 2024 21:10:52 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <v7h94s$3n21n$1@dont-email.me>

Terje Mathisen <terje.mathisen@tmsw.no> wrote:

>> Wikipedia has something on that, also literature; Moroz et al.
>> https://arxiv.org/abs/1603.04483 seem to give the optimum quasi-NR
>> method to do this, with one or two steps.
>
> Impressive amounts of math here, unfortunately they completely miss the
> point!
>
> What they derive is the exact optimal value for the magic constant,
> using zero, one or two text-book NR iterations.
>
> However, if you are willing to take that first NR iteration
>
>    halfnumber = 0.5f*x
>    ...
>    i = R-(i>>1);
>    ...
>    x = x*(1.5f-halfnumber*x*x);
>
> and then make both the 0.5f and 1.5f constants free variables, you can
> in fact get 1.5 more bits than what they show in this paper.

Wikipedia also has something about that...

> In fact, using slightly different values for R (the magic constant)
> results in very marginal changes in max rel error, while optimizing all
> three constants improves that max error from the stated 1.75e-3 to about
> 0.6e-3!

Looks like
https://web.archive.org/web/20180709021629/http://rrrola.wz.cz/inv_sqrt.html
who reports 6.50196699E−4 as the maximum error (also from the
Wikipedia article).

That's 10.5 bits of accuracy, not bad at all.

However... assume you want to do another NR step.  In that case,
you might be better off not loading different constants from
memory, so having the same constants might actually be an advantage
(which does not mean that they have to be the original Newton steps).

>
>>
>> I've also looked a little bit at simplifying exp(-1/x) directly, and
>> it seems an unpleasant function to implement; at least there is no
>> argument reduction which comes to mind.  With exp2(x), you can split
>> off the integer part and do an approximation on the rest over a finite
>> interval, but if there is a fast approximation of the integer part
>> of 1/x without dividing, I am not aware of one :-)
>
>:-(
>
> Fast is probably what you get from a pure lookup table, but with no
> obvious NR iteration to improve it.

I strongly suspect you're right, but it is too late in the evening
to calculate this :-)