Article <ddfe16ae5b6b2fd1339602826246b849@www.novabbs.org>



Path: ...!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Making Lemonade (Floating-point format changes)
Date: Tue, 14 May 2024 15:19:34 +0000
Organization: Rocksolid Light
Message-ID: <ddfe16ae5b6b2fd1339602826246b849@www.novabbs.org>
References: <abe04jhkngt2uun1e7ict8vmf1fq8p7rnm@4ax.com> <memo.20240512203459.16164W@jgd.cix.co.uk> <v1rab7$2vt3u$1@dont-email.me> <20240513151647.0000403f@yahoo.com> <v1tre1$3leqn$1@dont-email.me> <9c79fb24a0cf92c5fac633a409712691@www.novabbs.org> <2024May14.073553@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="1085138"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$KUGRnMX1DPlK0bKoiD7WGukcgMDeNzpMk1XbYRWGi.oUhsFrUEVuu
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Bytes: 4058
Lines: 66

Anton Ertl wrote:

> mitchalsup@aol.com (MitchAlsup1) writes:
>>I recall that MIPS could emulate a TLB table walk in something like
>>19 cycles. That is:: a few cycles to get there, a hash table access,
>>a check, a TLB install, and a few cycles to get back.

> Which MIPS?  R2000? R10000? Something else? Was this an inverted page
> table?

R3000, and it was a hash table ~1MB in size.
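
To make the refill sequence concrete (hash table access, check, TLB install), here is a toy model in Python. It is only a sketch: the slot count, the entry layout, the hash function and every name in it are assumptions for illustration, not the actual R3000 handler or its table format. The 4KB page size and 64-entry, randomly replaced TLB are the usual R3000 figures.

```python
# Toy model of a software TLB refill through a hashed page table.
# Slot count, entry layout, hash function and names are all assumptions.
import random

PAGE_SHIFT = 12            # 4KB pages
TLB_ENTRIES = 64           # size of the modelled TLB
HASH_SLOTS = 1 << 17       # 128K slots; at an assumed 8 bytes each, about 1MB

hash_table = [None] * HASH_SLOTS   # each slot holds (vpn, pfn) or None
tlb = {}                           # vpn -> pfn, stands in for the hardware TLB


def slot_for(vpn):
    """Hypothetical hash: fold the upper VPN bits into the slot index."""
    return (vpn ^ (vpn >> 17)) & (HASH_SLOTS - 1)


def map_page(vpn, pfn):
    """Enter a mapping; a colliding mapping is simply overwritten here."""
    hash_table[slot_for(vpn)] = (vpn, pfn)


def tlb_refill(vaddr):
    """The miss handler: one hash table access, a check, a TLB install."""
    vpn = vaddr >> PAGE_SHIFT
    entry = hash_table[slot_for(vpn)]        # hash table access
    if entry is None or entry[0] != vpn:     # the check
        raise LookupError("no entry; a real handler takes a slow path")
    if len(tlb) >= TLB_ENTRIES:              # random replacement
        del tlb[random.choice(list(tlb))]
    tlb[vpn] = entry[1]                      # TLB install
    return entry[1]


def translate(vaddr):
    """Translate a virtual address, refilling the TLB on a miss."""
    vpn = vaddr >> PAGE_SHIFT
    pfn = tlb.get(vpn)
    if pfn is None:
        pfn = tlb_refill(vaddr)
    return (pfn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))


map_page(0x12345, 0x00ABC)
print(hex(translate(0x12345678)))
```

The point of the model is that the common case is a single indexed load plus a compare, which is why the handler can be so short as long as it and its table stay in the cache(s).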

>>On an x86 this would be at least 200 cycles just getting there and back.

> Which x86?  8086?  80186?  80286?  These (maybe the 8088 and V20, too)
> are the only implementations that deserve to be called x86.  If you
> mean some IA-32 or AMD64 implementations, which ones?

> Anyway, let's see how this works for the U74 (a RISC-V implementation
> which apparently uses trapping for unaligned loads); here we have a
> 10M iteration loop with a payload that performs one load per
> iteration:

> [fedora-starfive:~/nfstmp/gforth-riscv:104544] perf stat -e instructions -e cycles gforth-fast -e ': foo 10000000 0 do @ loop ; 0 value x here aligned to x x x ! x foo drop bye'

>  Performance counter stats for 'gforth-fast -e : foo 10000000 0 do @ loop ; 0 value x here aligned to x x x ! x foo drop bye':

>          223805151      instructions:u            #    0.70  insn per cycle
>          318131306      cycles:u

>        0.352533487 seconds time elapsed

>        0.257061000 seconds user
>        0.064265000 seconds sys


> [fedora-starfive:~/nfstmp/gforth-riscv:104545] perf stat -e instructions -e cycles gforth-fast -e ': foo 10000000 0 do @ loop ; 0 value x here aligned 1+ to x x x ! x foo drop bye'

>  Performance counter stats for 'gforth-fast -e : foo 10000000 0 do @ loop ; 0 value x here aligned 1+ to x x x ! x foo drop bye':

>         5329494415      instructions:u            #    0.75  insn per cycle
>         7149481783      cycles:u

>        7.183239751 seconds time elapsed

>        7.082298000 seconds user
>        0.070121000 seconds sys

> So the unaligned access handling results in 511 additional instructions
> per load compared to an aligned access (so it obviously does the
> handling using some kind of trapping).  Each unaligned access results
> in 683 additional cycles.

Yes, but notice that sys time hardly changes, so RISC-V is performing the
misaligned LD in user mode (2 context switches -- likely fairly
lightweight).
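
The per-load figures and the sys-time observation can be re-derived from the two perf runs quoted above. A small Python check, using only the counter values from those runs (the variable and function names are mine):

```python
# Re-derive the per-load overhead from the two quoted perf stat runs.
ITERATIONS = 10_000_000  # loop count in the gforth test

aligned = {"instructions": 223_805_151, "cycles": 318_131_306,
           "user_s": 0.257061, "sys_s": 0.064265}
unaligned = {"instructions": 5_329_494_415, "cycles": 7_149_481_783,
             "user_s": 7.082298, "sys_s": 0.070121}


def delta(key):
    """Difference between the unaligned and the aligned run."""
    return unaligned[key] - aligned[key]


extra_insns = delta("instructions") / ITERATIONS   # about 511 per load
extra_cycles = delta("cycles") / ITERATIONS        # about 683 per load
extra_user_s = delta("user_s")                     # about 6.825 s
extra_sys_s = delta("sys_s")                       # about 0.006 s

print(f"extra instructions per load: {extra_insns:.1f}")
print(f"extra cycles per load:       {extra_cycles:.1f}")
print(f"extra user time: {extra_user_s:.3f} s, extra sys time: {extra_sys_s:.3f} s")
```

Nearly all of the added time shows up as user time; the sys time moves by only a few milliseconds.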

> So better use the unspecified MIPS, right?  However, if the
> unspecified MIPS is an R2000, 19 cycles on a 12.5MHz R2000 cost
> 1.52us, whereas 683 cycles on a 1000MHz U74 cost 0.683us (and I have
> heard that in the Visionfive V2 the U74 runs at 1500MHz).

Given at least the same cache footprint, a 2GHz R3000 would still be
in the 20-cycle range. {{That 19-cycle TLB reload depends on the handler
and its table having a footprint in the cache(s).}}
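
For comparing these on wall-clock time rather than cycle counts, the conversion is just cycles divided by clock rate. A short Python sketch using the figures from this thread; the 2GHz R3000 is the hypothetical part mentioned above, not a real part, and the function name is mine:

```python
# Convert handler cost in cycles to microseconds at a given clock rate.
def latency_us(cycles, clock_mhz):
    """cycles / MHz gives microseconds."""
    return cycles / clock_mhz


cases = [
    ("R2000 at 12.5MHz, 19 cycles", 19, 12.5),
    ("U74 at 1000MHz, 683 cycles", 683, 1000.0),
    ("U74 at 1500MHz, 683 cycles", 683, 1500.0),
    ("hypothetical R3000 at 2000MHz, 20 cycles", 20, 2000.0),
]

for label, cycles, mhz in cases:
    print(f"{label}: {latency_us(cycles, mhz):.3f} us")
```

The first two cases reproduce the 1.52us and 0.683us quoted above; the last one is only meaningful under the stated assumption that the handler and its table stay cache-resident.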

> - anton