Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Instruction Tracing
Date: Sat, 10 Aug 2024 19:54:50 +0000
Organization: Rocksolid Light
Message-ID: <89f0ed31d4d152e748aa0a457f7eaf06@www.novabbs.org>
References: <2024Aug10.121802@mips.complang.tuwien.ac.at> <memo.20240810204133.20940T@jgd.cix.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; logging-data="2127022"; mail-complaints-to="usenet@i2pn2.org"; posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$nMxkK81b3L8tBeuAXvwyTesUFI7GkOvbWyAkruhzO3E2Rj/W08oRm
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Bytes: 3298
Lines: 50

On Sat, 10 Aug 2024 19:41:00 +0000, John Dallman wrote:

> In article <2024Aug10.121802@mips.complang.tuwien.ac.at>,
> anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
>
>> IIRC IA-64 has no FP division.
>
> You recall correctly. It has an "approximation to reciprocal" instruction,
> which gives you about 8 bits of precision, and then requires the compiler
> to generate Newton-Raphson sequences. Intel's manual, 2010 edition, says
> this is advantageous because users can generate only the precision they
> need. Writing Itanium assembler for customised precision? Not many people
> would have wanted to do that in 2001, let alone 2010.
>
> In, I think, 1996, my employers had visitors from Intel trying to
> persuade us to adopt their C/C++ compiler for IA-32. They had been able
> to speed up one of our competitors' code by a factor of two, and hoped to
> do the same for us.
>
> They failed. We already had that factor of two, which was "ordinary
> compiler optimisation."
> That competitor had some rather odd coding
> standards at the time, which meant most compilers failed if asked to
> optimise their code. Someone from Intel had stayed at their site for most
> of a year, reporting the bugs and getting them fixed until Intel's
> compiler could optimise the code.
>
> While visiting us, Intel asked what may have been a significant question
> about the mixture of floating-point arithmetic instructions we used. We
> didn't have precise figures, but were sure that we used at least as many
> square roots as divides. IA-64 does square roots like divides, with a
> starter approximation and Newton-Raphson sequences. Slowly, because the
> N-R instructions all depend on the previous instruction, and can't be
> run in parallel.

Newton-Raphson has 2 dependent multiplies in a dependent loop.
Goldschmidt is a rearrangement of N-R such that the multiplies are
independent with loop-to-loop dependencies.

The way IA-64 did them it was 8 cycles per loop. Had they been done
in function unit sequencing, instead, the loop would have only been
4 cycles. Converting to Goldschmidt it would have only been 2.

Goldschmidt does not correct for arithmetic anomalies, whereas N-R
does. Thus IEEE accurate Goldschmidt iterators use N-R as their
last iteration.

>
> John
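
The difference between the two iterations described in the reply can be sketched in Python. This is only an illustration of the dependency structure, not IA-64 code: `seed_reciprocal` is a hypothetical stand-in for a hardware reciprocal estimate good to about 8 bits, and the function names and the iteration count of 3 are assumptions chosen so that an 8-bit seed reaches double precision.

```python
import math


def seed_reciprocal(d, bits=8):
    """Hypothetical stand-in for a hardware estimate: 1/d kept to about `bits` bits."""
    m, e = math.frexp(1.0 / d)          # 1/d == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)


def divide_newton_raphson(n, d, iterations=3):
    """n/d via Newton-Raphson on the reciprocal: x <- x * (2 - d*x)."""
    x = seed_reciprocal(d)
    for _ in range(iterations):
        t = d * x             # first multiply
        x = x * (2.0 - t)     # second multiply needs t, so the two are serialized
    return n * x


def divide_goldschmidt(n, d, iterations=3):
    """n/d via Goldschmidt: scale numerator and denominator until den -> 1."""
    x = seed_reciprocal(d)
    num, den = n * x, d * x   # pre-scale both by the seed
    for _ in range(iterations):
        f = 2.0 - den
        num = num * f         # these two multiplies do not depend on each other,
        den = den * f         # only on f, which comes from the previous iteration
    return num


if __name__ == "__main__":
    print(divide_newton_raphson(355.0, 113.0))
    print(divide_goldschmidt(355.0, 113.0))
    print(355.0 / 113.0)
```

Both loops roughly double the number of correct bits per pass. The Newton-Raphson loop recomputes from `d` every time, whereas the Goldschmidt loop only ever sees `den`, so a rounding error that creeps into `num` or `den` is never revisited. The cycle counts quoted in the post depend on hardware scheduling and are not modelled here.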