Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!news.quux.org!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Why I've Dropped In
Date: Thu, 12 Jun 2025 19:13:10 +0000
Organization: Rocksolid Light
Message-ID: <46beb69a526eea8db9d741c6acca4482@www.novabbs.org>
References: <0c857b8347f07f3a0ca61c403d0a8711@www.novabbs.com> <dd6e28b90190e249289add75780b204a@www.novabbs.com> <ec821d1d64555055271e3b72f241d39b@www.novabbs.com> <8addb3f96901904511fc9350c43917ef@www.novabbs.com> <102b5qh$1q55a$2@dont-email.me> <102bj2r$1tbgq$1@dont-email.me> <9af9fbbafd075777746866a5ff9165b1@www.novabbs.org> <102cmgv$25n56$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="233478"; mail-complaints-to="usenet@i2pn2.org";
posting-account="o5SwNDfMfYu6Mv4wwLiW6e/jbA93UAdzFodw5PEa6eU";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: cb29269328a20fe5719ed6a1c397e21f651bda71
X-Rslight-Site: $2y$10$meY4SZo4hW9cT665.elhWevDkz/uCNMUYCuq2f0hny0NYKD3Ad5mK
X-Spam-Checker-Version: SpamAssassin 4.0.0
On Wed, 11 Jun 2025 19:47:42 +0000, BGB wrote:
> On 6/11/2025 11:37 AM, MitchAlsup1 wrote:
>> On Wed, 11 Jun 2025 9:42:47 +0000, BGB wrote:
>>------------------
>
> LR: Functionally, in most ways the same as a GPR, but is assigned a
> special role and is assumed to have that role. Pretty much no one uses
> it as a base register though, with the partial exception of potential
> JALR wonk.
One can use JALR to call special subroutines that store multiple
registers on the stack (or restore them later), wrapping prologue and
epilogue into little subroutine calls that use a separate LR and thus
have lower overhead than a full-blown call. Other than this use and some
PDP-11-style co-routines, the explicit specification of LR is completely
unnecessary.
> JALR X0, X1, 16 //not technically disallowed...
>
> If one uses the 'C' extension, assumptions about LR and SP are pretty
> solidly baked in to the ISA design.
>
>
> ZR: Always reads as 0, assignments are ignored; this behavior is very
> un-GPR-like.
>
> GP: Similar situation to LR, as it mostly looks like a GPR.
> In my CPU core and JX2VM, the high bits of GP were aliased to FPSR, so
> saving/restoring GP will also implicitly save/restore the dynamic
> rounding mode and similar (as opposed to proper RISC-V which has this
> stuff in a CSR).
With universal constants, you get this register back.
>
>
> Though, this isn't practically too much different from using the HOB's
> of captured LR values to hold the CPU ISA mode and similar (which my
> newer X3VM retains, though I am still on the fence about the "put FPSR
> bits into HOBs of GP" thing).
>
> Does mean that either dynamic rounding mode is lost every time a GP
> reload is done (though, only for the callee), or that setting the
> rounding mode also needs to update the corresponding PBO GP pointer
> (which would effectively make it semi-global but tied to each PE image).
>
> The traditional assumption though was that dynamic rounding mode is
> fully global, and I had been trying to make it dynamically scoped.
The modern interpretation is that the dynamic rounding mode can be set
prior to any FP instruction. So, you better be able to set it rapidly
and without pipeline drain, and you need to mark the downstream FP
instructions as dependent on this.
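
The GP-reload hazard in the quoted text can be sketched in C. This is
only an illustration: the 48/16 bit split and the names gp_pack,
gp_addr, gp_fp_bits, gp_reload_naive and gp_reload_keep_fp are
assumptions made for the example, not the actual JX2VM or X3VM layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical): low 48 bits hold the pointer,
   high 16 bits hold FP status / rounding-mode bits. */
#define GP_ADDR_BITS 48
#define GP_ADDR_MASK ((UINT64_C(1) << GP_ADDR_BITS) - 1)

/* Combine an address and FP bits into one register-sized value. */
uint64_t gp_pack(uint64_t addr, uint16_t fp_bits)
{
    assert((addr & ~GP_ADDR_MASK) == 0);  /* address must fit in the low bits */
    return addr | ((uint64_t)fp_bits << GP_ADDR_BITS);
}

uint64_t gp_addr(uint64_t gp)    { return gp & GP_ADDR_MASK; }
uint16_t gp_fp_bits(uint64_t gp) { return (uint16_t)(gp >> GP_ADDR_BITS); }

/* Naive reload: the value fetched for the new image overwrites the
   whole register, so the caller's FP bits are lost. */
uint64_t gp_reload_naive(uint64_t old_gp, uint64_t new_gp)
{
    (void)old_gp;
    return new_gp;
}

/* Preserving reload: take the address from the new value but carry
   the FP bits over from the old one. */
uint64_t gp_reload_keep_fp(uint64_t old_gp, uint64_t new_gp)
{
    return gp_pack(gp_addr(new_gp), gp_fp_bits(old_gp));
}
```

With the naive reload the callee sees whatever FP bits were stored with
the image's GP value; the preserving variant costs an extra mask and OR
on every reload, which is the trade-off the quoted text is weighing.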
>
> So, it may be that having FPSR as its own thing, and then explicitly
> saving/restoring FPSR in functions that modify the rounding mode, may be
> a better option.
RM is separate from FPSR in My 66000, and uniquely accessible.
-----------------------
> Though, OTOH, Quake has stuff like:
> typedef float vec3_t[3];
> vec3_t v0, v1, v2;
> ...
> VectorCopy(v0, v1);
> Where VectorCopy is a macro that expands it out to something like, IIRC,
> do { v1[0]=v0[0]; v1[1]=v0[1]; v1[2]=v0[2]; } while(0);
>
> Where BGBCC will naively load each value, widen it to double, narrow it
> back to float, and store the result.
Sounds like you should be working on the compiler instead of
microarchitectures.
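
A small C sketch of why the widen/narrow pair is wasted work: every
float value is exactly representable as a double, so the round trip
returns the same value and a plain 32-bit load/store would do. The
macro follows the quoted expansion (source first, destination second);
copy_via_double and same_bits are hypothetical helpers for the example.

```c
#include <assert.h>
#include <string.h>

typedef float vec3_t[3];

/* Element-wise copy as in the quoted expansion: a is the source,
   b is the destination. */
#define VectorCopy(a, b) \
    do { (b)[0] = (a)[0]; (b)[1] = (a)[1]; (b)[2] = (a)[2]; } while (0)

/* What a naive lowering does per element: widen to double, narrow back. */
float copy_via_double(float f)
{
    double wide = (double)f;
    return (float)wide;
}

/* Bit-exact comparison of two floats. */
int same_bits(float a, float b)
{
    assert(sizeof a == sizeof b);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

For ordinary values, same_bits(copy_via_double(x), x) holds, so a
compiler can drop both conversions when the value is only being moved.
The one caveat is signaling NaNs, which a conversion may quiet.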