From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Making Lemonade (Floating-point format changes)
Date: Tue, 14 May 2024 05:35:53 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024May14.073553@mips.complang.tuwien.ac.at>
References: <abe04jhkngt2uun1e7ict8vmf1fq8p7rnm@4ax.com> <memo.20240512203459.16164W@jgd.cix.co.uk> <v1rab7$2vt3u$1@dont-email.me> <20240513151647.0000403f@yahoo.com> <v1tre1$3leqn$1@dont-email.me> <9c79fb24a0cf92c5fac633a409712691@www.novabbs.org>

mitchalsup@aol.com (MitchAlsup1) writes:
>I recall that MIPS could emulate a TLB table walk in something like
>19 cycles. That is:: a few cycles to get there, a hash table access,
>a check, a TLB install, and a few cycles to get back.

Which MIPS?  R2000?  R10000?  Something else?  Was this an inverted
page table?

>On an x86 this would be at least 200 cycles just getting there and back.

Which x86?  8086?  80186?  80286?  These (maybe the 8088 and V20, too)
are the only implementations that deserve to be called x86.  If you
mean some IA-32 or AMD64 implementations, which ones?

Anyway, let's see how this works for the U74 (a RISC-V implementation
which apparently uses trapping for unaligned loads); here we have a
10M-iteration loop with a payload that performs one load per
iteration:

[fedora-starfive:~/nfstmp/gforth-riscv:104544] perf stat -e instructions -e cycles gforth-fast -e ': foo 10000000 0 do @ loop ; 0 value x here aligned to x x x ! x foo drop bye'

 Performance counter stats for 'gforth-fast -e : foo 10000000 0 do @ loop ; 0 value x here aligned to x x x ! x foo drop bye':

         223805151      instructions:u            #    0.70  insn per cycle
         318131306      cycles:u

       0.352533487 seconds time elapsed

       0.257061000 seconds user
       0.064265000 seconds sys

[fedora-starfive:~/nfstmp/gforth-riscv:104545] perf stat -e instructions -e cycles gforth-fast -e ': foo 10000000 0 do @ loop ; 0 value x here aligned 1+ to x x x ! x foo drop bye'

 Performance counter stats for 'gforth-fast -e : foo 10000000 0 do @ loop ; 0 value x here aligned 1+ to x x x ! x foo drop bye':

        5329494415      instructions:u            #    0.75  insn per cycle
        7149481783      cycles:u

       7.183239751 seconds time elapsed

       7.082298000 seconds user
       0.070121000 seconds sys

So the unaligned-access handling results in 511 additional
instructions per load compared to an aligned access (so it obviously
does the handling using some kind of trapping).  Each unaligned access
results in 683 additional cycles.

So better use the unspecified MIPS, right?  However, if the
unspecified MIPS is an R2000, 19 cycles on a 12.5MHz R2000 cost
1.52us, whereas 683 cycles on a 1000MHz U74 cost 0.683us (and I have
heard that in the Visionfive V2 the U74 runs at 1500MHz).

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
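
The per-load figures in the post follow from the two perf stat runs by subtracting the aligned counts from the unaligned ones and dividing by the 10M iterations. A minimal sketch of that arithmetic in Python; the names `extra_per_load` and `cycles_to_us` are made up for this illustration, and the 1500MHz clock is only the hearsay figure mentioned in the post, not a measured one:

```python
# Counter values copied from the two perf stat runs in the post.
ITERATIONS = 10_000_000  # loop count in ': foo 10000000 0 do @ loop ;'

aligned = {"instructions": 223_805_151, "cycles": 318_131_306}
unaligned = {"instructions": 5_329_494_415, "cycles": 7_149_481_783}


def extra_per_load(event):
    """Additional events per load attributable to the unaligned access.

    Startup overhead of gforth-fast is the same in both runs, so it
    cancels out in the subtraction.
    """
    return (unaligned[event] - aligned[event]) / ITERATIONS


def cycles_to_us(cycles, mhz):
    """Convert a cycle count to microseconds at a clock rate given in MHz."""
    return cycles / mhz


if __name__ == "__main__":
    print(f"extra instructions per load: {extra_per_load('instructions'):.0f}")
    print(f"extra cycles per load:       {extra_per_load('cycles'):.0f}")
    # The R2000 at 12.5MHz is the post's hypothetical; the 19 cycles are the quoted claim.
    print(f"19 cycles @ 12.5MHz:  {cycles_to_us(19, 12.5):.3f}us")
    print(f"683 cycles @ 1000MHz: {cycles_to_us(683, 1000):.3f}us")
    # Assumption: 1500MHz is the unconfirmed clock rate mentioned in the post.
    print(f"683 cycles @ 1500MHz: {cycles_to_us(683, 1500):.3f}us")
```

The comparison in wall-clock time rather than cycles is the point of the last paragraph of the post: a trap that costs many more cycles can still finish sooner on a much faster clock.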