From: Terje Mathisen <terje.mathisen@tmsw.no>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Tue, 7 May 2024 07:42:06 +0200
Organization: A noiseless patient Spider
Message-ID: <v1ceve$339t4$2@dont-email.me>
References: <v0s17o$2okf4$2@dont-email.me>
 <2024May3.171330@mips.complang.tuwien.ac.at>
 <v13olm$p9ih$9@dont-email.me>
 <2024May4.111127@mips.complang.tuwien.ac.at>
 <AnsZN.60734$gF_b.49289@fx17.iad>
 <v17dav$1o21q$2@dont-email.me>
 <YCNZN.77960$yf_8.61501@fx14.iad>
 <v18m0s$213qm$1@dont-email.me>
 <642cb7511c41a3931b06f747ea2161e4@www.novabbs.org>
 <v19d1t$2a6f8$1@dont-email.me>
 <v19gl2$2b44k$1@dont-email.me>
 <e10bcaa4cbad7113a2e630a236039c59@www.novabbs.org>
 <v1bigv$2q19c$1@dont-email.me>
In-Reply-To: <v1bigv$2q19c$1@dont-email.me>

BGB wrote:
> On 5/6/2024 2:11 PM, MitchAlsup1 wrote:
>> Lawrence D'Oliveiro wrote:
>>
>>> On Sun, 5 May 2024 20:50:51 -0500, BGB wrote:
>>
>>>> Say, RISC-V:
>>>>    Says yes to DIV and MOD;
>>>>    Says yes to 4-register floating-point multiply-accumulate;
>>>>    Says no to register-indexed Load/Store.
>>>> Me: This is not a good balance...
>>
>>> Multiply-accumulate is at least as much about reducing rounding error
>>> as about speed.
>>
>> It is also an IEEE 754-2008+ requirement.
>
> And... I have a version that just sort of works well enough to make
> RV64G work, but is sort of a fail on the other fronts:
>   Using it is slower than separate ops;
>   It produces a double-rounded result.
>   Also, well, the FMUL isn't super accurate either.
>
>
> FMUL is implemented in a way where it only generates the high-half of
> the multiply, which makes the FPU cheaper, but:
>   Does not give strict 0.5ULP rounding.
>
> Some combination of factors leads to the inability of Newton-Raphson to
> fully converge, possibly either due to omitting the low-order multiplier
> results, or the carry-propagation limitation for rounding (if the
> rounding would result in more than 8 bits of carry, it is skipped).
>
>
> Not likely to do proper FMA, as this would make a Binary64 FPU too
> expensive (and, doing Binary64 poorly is still preferable for most uses
> to not doing it at all).
>
> Granted, not entirely sure how the 8087 managed to do all the stuff that
> it did. Since, it seems like an 80s ASIC would be more cramped than a
> modern Artix-7.

Relatively easy to explain: It was _very_ slow, but still much faster
than emulating it with an 8088 that needed 4 clock cycles for every
single code or data byte touched.

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"