Article <va529m$1uo39$1@dont-email.me>

Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Stephen Fuld <sfuld@alumni.cmu.edu.invalid>
Newsgroups: comp.arch
Subject: Re: number of registers
Date: Wed, 21 Aug 2024 08:49:10 -0700
Organization: A noiseless patient Spider
Lines: 53
Message-ID: <va529m$1uo39$1@dont-email.me>
References: <v98asi$rulo$1@dont-email.me>
 <38055f09c5d32ab77b9e3f1c7b979fb4@www.novabbs.org>
 <v991kh$vu8g$1@dont-email.me>
 <e4352bad7240a6276e453226136ea0b3@www.novabbs.org>
 <va049n$2vnr7$1@dont-email.me>
 <a566ca0c8b5c41f402b60e8bac445e24@www.novabbs.org>
 <2024Aug20.090149@mips.complang.tuwien.ac.at>
 <a3a57791722f7c21c4218f5be6226e97@www.novabbs.org>
 <20240820204050.00003d56@yahoo.com>
 <48438024ccdbcc373e4cfa51d18066f5@www.novabbs.org>
 <2024Aug21.121312@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 21 Aug 2024 17:49:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="3e6a1e730eaf8f568c20cfe9a6f00305";
	logging-data="2056297"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+KbSJoqzfj35rcdnoS2gOeZ/0vCa6kyjA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:RGzCvBISD4vuA6H5/rRzpdKlIRw=
In-Reply-To: <2024Aug21.121312@mips.complang.tuwien.ac.at>
Content-Language: en-US
Bytes: 3878

On 8/21/2024 3:13 AM, Anton Ertl wrote:
> mitchalsup@aol.com (MitchAlsup1) writes:
>> The point is that the cost of not getting allocated into a register
>> is vastly lower--the count of instructions remains 1 while the
>> latency increases. That increase in latency does not hurt those
>> use once/seldom variables.
> 
> Latency is not the issue in modern high-performance AMD64 cores, which
> have zero-cycle store-to-load forwarding
> <http://www.complang.tuwien.ac.at/anton/memdep/>.
> 
> And yet, putting variables in registers gives a significant speedup:
> On a Rocket Lake, numbers are times in seconds:
> 
>   sieve bubble matrix   fib   fft
>   0.075  0.070  0.036 0.049 0.017 TOS in reg, RP in reg, IP in reg
>   0.100  0.149  0.054 0.106 0.037 TOS in mem, RP in mem, IP write-through to mem
> 
> In the first line, I used gforth-fast and tried to disable all
> optimizations except those that keep certain variables in registers:
> 
> gforth-fast --ss-states=1 --ss-number=31 --opt-ip-updates=0 onebench.fs
> 
> I could not reduce the static superinstructions below 31 and still get
> a result; I will have to investigate why, but that probably does not
> make that much of a difference for several of these benchmarks.
> 
> In the second line I used gforth, an engine that keeps the top of
> stack in memory, the return-stack pointer in memory, stores IP to
> memory after every change, and does not use static superinstructions,
> all for better identifying where an error happened.
> 
>> In the examples cited, the lack of register allocation triples
>> the instruction count due to lack of LD-OP and LD-OP-ST. The
>> register count I stated is how many registers would a
>> non-LD-OP machine need to break even on the instruction count.
> 
> What makes you think that instruction count is particularly relevant?
> Yes, you may save some decoding resources if you use LD-OP-ST on an
> architecture that supports it, but you first had to invest into a more
> complex decoder.  And in the OoO engine the difference may be gone (at
> least on Intel CPUs).

There are also some savings from reduced I-cache usage (possibly leading
to a higher I-cache hit rate), reduced instruction-fetch memory
bandwidth, etc., though these may be modest at best.
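
For reference, the per-benchmark speedups implied by the timing table
quoted above can be worked out with a short Python sketch. The times are
copied from the table; the variable and function names are purely
illustrative. Note that the two runs differ in more than register
allocation (the gforth run also uses no static superinstructions), so
the ratios should be read as rough indications only.

```python
# Times in seconds, copied from the quoted Rocket Lake table.
benchmarks = ["sieve", "bubble", "matrix", "fib", "fft"]
in_regs = [0.075, 0.070, 0.036, 0.049, 0.017]  # TOS, RP, IP in registers
in_mem = [0.100, 0.149, 0.054, 0.106, 0.037]   # TOS, RP in memory, IP write-through

def speedups(slow, fast):
    """Return slow/fast for each pair of timings (illustrative helper)."""
    return [s / f for s, f in zip(slow, fast)]

# Map each benchmark name to its memory-time / register-time ratio.
ratios = dict(zip(benchmarks, speedups(in_mem, in_regs)))

for name, ratio in ratios.items():
    print(f"{name:7s} {ratio:.2f}x")
```

On these numbers, sieve and matrix gain the least and bubble, fib and
fft roughly double in speed.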




-- 
  - Stephen Fuld
(e-mail address disguised to prevent spam)