Article <c7cd7a9e4a8c14dab63f0b9394af4677@www.novabbs.org>

Path: ...!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Misc: Ongoing status...
Date: Fri, 31 Jan 2025 19:30:44 +0000
Organization: Rocksolid Light
Message-ID: <c7cd7a9e4a8c14dab63f0b9394af4677@www.novabbs.org>
References: <vnglop$33lk0$1@dont-email.me> <cda6055929f89df81fb056509038afed@www.novabbs.org> <vnhrrj$3d7i0$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="2068907"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="o5SwNDfMfYu6Mv4wwLiW6e/jbA93UAdzFodw5PEa6eU";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$c6dnmnAkKZNFM0Z8a2n9M.yfjiNcr9Rtd7qCqoNc7ROP0i6C.KXy2
X-Rslight-Posting-User: cb29269328a20fe5719ed6a1c397e21f651bda71
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 10532
Lines: 268

On Fri, 31 Jan 2025 6:50:24 +0000, BGB wrote:

> On 1/30/2025 5:48 PM, MitchAlsup1 wrote:
>> On Thu, 30 Jan 2025 20:00:22 +0000, BGB wrote:
>>
>>> So, recent features added to my core ISA: None.
>>> Reason: Not a whole lot that brings much benefit.
>>>
>>>
>>> Have ended up recently more working on the RISC-V side of things,
>>> because there are still gains to be made there (stuff is still more
>>> buggy, less complete, and slower than XG2).
>>>
>>>
>>> On the RISC-V side, did experiment with Branch-compare-Immediate
>>> instructions, but unclear if I will carry them over:
>>>    Adds a non-zero cost to the decoder;
>>>      Cost primarily associated with dealing with a second immed.
>>>    Effect on performance is very small (< 1%).
>>
>> I find this a little odd--My 66000 has a lot of CMP #immed-BC
>> a) so I am sensitive as this is break even wrt RISC-V
>> b) But perhaps the small gain is due to something about
>> .. how the pair runs down the pipe as opposed to how the
>> .. single runs down the pipe.
>>
>
> Issue I had seen is mostly, "How often does it come up?":
> Seemingly, around 100-150 or so instructions between each occurrence on
> average (excluding cases where the constant is zero; comparing with zero
> being more common).
>
> What does it save:
> Typically 1 cycle that might otherwise be spent loading the value into a
> register (if this instruction doesn't end up getting run in parallel
> with another prior instruction).
>
>
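
A back-of-envelope check on that estimate, as a sketch only: if one such
branch occurs every `interval` instructions and each one saves at most one
cycle, the fraction of cycles saved is bounded by 1/(interval*CPI). The
function name and the CPI parameter are illustrative assumptions, not
measurements.

```c
/* Upper bound on the fraction of cycles saved when one compare-with-immediate
   branch occurs every 'interval' instructions and saves at most one cycle.
   'cpi' is the assumed baseline cycles per instruction. */
static double max_saved_fraction(double interval, double cpi)
{
    return 1.0 / (interval * cpi);
}
```

With an interval of 100 to 150 and a CPI of 1 or more, the bound runs from
1% down to about 0.67%, which is consistent with the "< 1%" figure quoted
earlier.
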
> In the BGBCC output, the main case it comes up is primarily in "for()"
> loops (followed by the occasional if-statement), so one might expect
> this would increase its probability of having more of an effect.
>
> But, seemingly, not enough tight "for()" loops and similar in use for it
> to have a more significant effect.
>
> So, in the great "if()" ranking:
>    if(x COND 0) ...   //first place
>    if(x COND y) ...   //second place
>    if(x COND imm) ... //third place
>
> However, a construct like:
>    for(i=0; i<10; i++)
>      { ... }
> Will emit two of them, so they are not *that* rare either.

Since the compiler can see that the loop is always executed, the
first/top checking CMP-BC should not be emitted, leaving only 1.
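
A rough sketch of that point, in hand-written C standing in for what a
compiler does internally (the function names are made up, and this is not
BGBCC or My 66000 output): a for() loop is commonly lowered to a guard test
at the top plus a bottom test on the back edge. With constant bounds 0 and
10 the guard is known true at compile time, so only the bottom
compare-and-branch against the immediate remains.

```c
/* Naive lowering of for(i=0; i<10; i++): one compare against 10 as a guard
   before the loop, and a second one at the bottom for the back edge. */
static int sum_guarded(void)
{
    int i = 0, sum = 0;
    if (i < 10) {            /* top check: compare with 10, then branch */
        do {
            sum += i;
            i++;
        } while (i < 10);    /* bottom check: compare with 10, then branch */
    }
    return sum;
}

/* Since 0 < 10 is known at compile time, the guard folds away. */
static int sum_rotated(void)
{
    int i = 0, sum = 0;
    do {
        sum += i;
        i++;
    } while (i < 10);        /* the only remaining compare-and-branch */
    return sum;
}
```

Both functions compute the same sum; the second simply has one fewer
compare against the constant.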

> Still, a lot rarer in use than:
>    val=ptr[idx];
> Though...
>
> Have noted though that simple constant for-loops are a minority, far
> more often they are something like:
>    for(i=0; i<n; i++)
>      { ... }
> Which doesn't use any.
>
> Or:
>    while(i--)
>      { ... }
> Which uses a compare with zero (in RV, can be encoded with the zero

I should note:: I have a whole class of conditional branches that
include comparison to 0, {(signed), (unsigned), (float), (double)}
and all 6 arithmetic comparisons and auxiliary comparisons for NaNs
and Infinities.

> register; in BJX2 it has its own dedicated instruction due to the lack
> of zero register; some of these were formally dropped in XG3 which does
> have access to a zero register, and encoding an op using a ZR instead is
> considered as preferable).

I choose not to waste a register to hold zero. Once you have universal
constants it is unnecessary.
------------------
> Huawei had a "less bad" encoding, but they burnt basically the entire
> User-1 block on it, so that isn't going to fly.
>
> Generally, around 95% of the function-local branches can hit in a Disp9,
> vs 98% for Disp12. So, better to drop to Disp9.

DISP16 reaches farther...
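
For a sense of scale, here is a small sketch of how far each field width
reaches. It assumes a signed displacement counted in 4-byte units, which is
an assumption for illustration; the actual scaling and signedness depend on
the ISA in question. The helper names are made up.

```c
#include <stdint.h>

/* Most negative byte offset reachable by a signed 'bits'-wide displacement
   scaled by 'scale' bytes. */
static int64_t disp_min_bytes(unsigned bits, unsigned scale)
{
    return -((int64_t)1 << (bits - 1)) * (int64_t)scale;
}

/* Most positive byte offset; one unit short of the backward reach. */
static int64_t disp_max_bytes(unsigned bits, unsigned scale)
{
    return (((int64_t)1 << (bits - 1)) - 1) * (int64_t)scale;
}

/* Nonzero if a branch 'offset' bytes away is aligned and encodable. */
static int disp_fits(int64_t offset, unsigned bits, unsigned scale)
{
    if (offset % (int64_t)scale != 0)
        return 0;
    return offset >= disp_min_bytes(bits, scale) &&
           offset <= disp_max_bytes(bits, scale);
}
```

Under these assumptions a 9-bit field reaches roughly +/-1 KB, a 12-bit
field roughly +/-8 KB, and a 16-bit field roughly +/-128 KB.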

------------------

>>
>> I suggest a psychiatrist.
>>
>
> People are pointing to charts gathered by mining binaries and being
> like: "X10 and X11 are the two most commonly used registers".
>
> But, this is like pointing at x86 and being like:
> "EAX and ECX are the top two registers, who needs such obscure registers
> as ESI and EDI"?...
>
Quit listening to them; use your own judgement.
>
>
>>> When I defined my own version of BccI (with a 64-bit encoding), how many
>>> new instructions did I need to define in the 32-bit base ISA: Zero.
>>
>> How many 64-bit encodings did My 66000 need:: zero.
>> {Hint: the words following the instruction specifier have no internal
>> format}
>>
>
> I consider the combination of Jumbo-Prefix and Suffix instruction to be
> a 64-bit instruction.

I consider a multi-word instruction to have an instruction-specifier
as the first 32 bits, and everything that follows is an attached
constant.
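
To illustrate the idea only, here is a sketch with a TOY encoding invented
for this example; it is not My 66000's actual encoding. In the toy, the low
2 bits of the 32-bit specifier give the number of 32-bit constant words
that follow. The point is that only the specifier is ever decoded, and the
trailing words are skipped as raw bits.

```c
#include <stddef.h>
#include <stdint.h>

/* TOY ENCODING (hypothetical): low 2 bits of the specifier hold the number
   of 32-bit constant words attached to the instruction. */
static size_t toy_const_words(uint32_t specifier)
{
    return specifier & 3u;
}

/* Count the instructions in a stream of 'n' words. Constant words are never
   interpreted as specifiers. Returns (size_t)-1 if the final instruction's
   constants run past the end of the stream. */
static size_t toy_count_insns(const uint32_t *words, size_t n)
{
    size_t pc = 0, count = 0;
    while (pc < n) {
        size_t len = 1 + toy_const_words(words[pc]);
        if (len > n - pc)
            return (size_t)-1;
        pc += len;
        count++;
    }
    return count;
}
```

In this toy, instruction length is known from the first word alone, so no
second "instruction" has to be defined to carry the constant.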

The only "prefixes" I have are CARRY and PREDication.

-----------------------

> However, have noted that XG3 does appear to be faster than the original
> Baseline/XG1 ISA.
>
>
> Where, to recap:
>    XG1 (Baseline):
>      16/32/64/96 bit encodings;
>        16-bit ops can access R0..R15 with 4b registers;
>          Only 2R or 2RI forms for 16-bit ops;
>          16-bit ISA still fairly similar to SuperH.
>      5-bit register fields by default;
>        6-bit available for an ISA subset.
>      Disp9u and Imm9u/n for most immediate form instructions;
>      32 or 64 GPRs, Default 32.
>      8 argument registers.
>    XG2:
>      32/64/96 bit encodings;
>        All 16-bit encodings dropped.
>      6-bit register fields (via a wonky encoding);
>      Same basic instruction format as XG1,
>        But, 3 new bits stored inverted in the HOB of instr words;
>      Mostly Disp10s and Imm10u/n;
>      64 GPRs native;
>      16 argument registers.
>    XG3:
>      Basically repacked XG2;
>        Can exist in same encoding space as RISC-V ops;
>        Aims for ease of compatibility with RV64G.
>      Encoding was made "aesthetically nicer"
>        All the register bits are contiguous and non-inverted;
>        Most immediate fields are also once again contiguous;
>        ...
>      Partly reworks branch instructions;
>        Scale=4, usually relative to BasePC (like RV);
>      Uses RV's register numbering space (and ABI);
>        Eg: SP at R2 vs R15, ...
>        (Partly carried over from XG2RV, which is now defunct).
>      64 GPRs, but fudged into RV ABI rules;
>        Can't rebalance ABI without breaking RV compatibility;
>          Breaking RV compatibility defeating its point for existing.
>      8 argument registers (because of RV ABI).
>        Could in theory expand to 16, but would make issues.
>      Despite being based on XG2,
>        BGBCC treats XG3 as an extension to RISC-V.
>
>
> Then, RV:
>    16/32; 48/64/96 (Ext)
>    Has 16-bit ops:
========== REMAINDER OF ARTICLE TRUNCATED ==========