Path: ...!news.nobody.at!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Arguments for a sane ISA 6-years later
Date: Fri, 26 Jul 2024 17:11:55 -0500
Organization: A noiseless patient Spider
Lines: 566
Message-ID: <v816vg$317td$1@dont-email.me>
References: <b5d4a172469485e9799de44f5f120c73@www.novabbs.org>
 <v7ubd4$2e8dr$1@dont-email.me>
 <034bc00e088a2cb40307e73ce30dcb2f@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 27 Jul 2024 00:12:02 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="9eaca427c10feecee898430925c61879";
	logging-data="3186605"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX183wjpx3+vYTfDUEI7aBfRHjyywg0TrM/A="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:UWUlwwi3T0K3Ck6Y4EarNAQcfUA=
In-Reply-To: <034bc00e088a2cb40307e73ce30dcb2f@www.novabbs.org>
Content-Language: en-US
Bytes: 20560

On 7/25/2024 5:07 PM, MitchAlsup1 wrote:
> On Thu, 25 Jul 2024 20:09:06 +0000, BGB wrote:
> 
>> On 7/24/2024 3:37 PM, MitchAlsup1 wrote:
>>> Just before Google Groups got spammed to death; I wrote::
>>> --------------------------------------------------------
>>> MitchAlsup
>>> Nov 1, 2022, 5:53:02 PM
>>>
>>> In a thread called "Arguments for a Sane Instruction Set Architecture"
>>> Aug 7, 2017, 6:53:09 PM I wrote::
>>> -----------------------------------------------------------------------
>>> Looking back over my 40-odd year career in computer architecture,
>>> I thought I would list out the typical errors I and others have
>>> made with respect to architecting computers. This is going to be
>>> a bit long, so bear with me:
>>>
>>> When the Instruction Set architecture is Sane, there is support
>>> for:
>>> A) negating operands prior to an arithmetic calculation.
>>
>> Not seen many doing this, and might not be workable in the general case.
>>    Might make sense for FPU ops like FADD/FMUL.
>>
>> Maybe 'ADD'. Though, "-(A+B)" is the only case that can't be expressed
>> with traditional ADD/SUB/RSUB.
> 
> a) one does not need a SUB or NEG instruction as one has:
>      ADD    Rd,R1,R2
>      ADD    Rd,R1,-R2
>      ADD    Rd,-R1,R2
>      ADD    Rd,-R1,-R2
> Which basically gets rid of the unary NEG instruction.

Possibly, but "RSUB Rm, 0, Rn" could also indirectly be used to encode 
"NEG Rm, Rn".


In terms of clock-cycles, as-is "NEG" is less than 0.01% of the total 
cycle budget, so eliminating it from being used likely wouldn't have any 
real noticeable effect on performance.
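
For reference, the equivalences between sign-controlled ADD and the traditional SUB/RSUB/NEG forms can be checked with a small model. This is only a sketch: the 64-bit register width, the function names, and the RSUB operand order (taken here as "second minus first", matching "RSUB Rm, 0, Rn" acting as NEG) are assumptions for illustration, not a statement of either ISA's encoding.

```python
MASK = (1 << 64) - 1  # assumed 64-bit registers

def add_signed(a, b, neg_a=False, neg_b=False):
    """ADD with optional negation of each operand, wrapped to 64 bits."""
    if neg_a:
        a = -a
    if neg_b:
        b = -b
    return (a + b) & MASK

def sub(a, b):
    """SUB: a - b."""
    return (a - b) & MASK

def rsub(a, b):
    """RSUB (assumed operand order): b - a."""
    return (b - a) & MASK

def neg(a):
    """NEG: two's complement negation."""
    return (-a) & MASK

a, b = 1234, 99
print(add_signed(a, b, neg_b=True) == sub(a, b))    # ADD Rd,R1,-R2 acts as SUB
print(add_signed(a, b, neg_a=True) == rsub(a, b))   # ADD Rd,-R1,R2 acts as RSUB
print(rsub(a, 0) == neg(a))                         # RSUB with 0 acts as NEG
print(add_signed(a, b, True, True) == neg(add_signed(a, b)))  # -(A+B)
```

The last line is the "-(A+B)" case, which needs two instructions with plain ADD/SUB/RSUB but one with per-operand negation.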


The main place NEG was being used in the past was for encoding "x>>y" 
via the "SHAD" instruction, but then I added a "SHAR" instruction which 
implicitly reverses the direction of the shift (and NEG is now only 
rarely used).
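
A rough model of why SHAR removes the NEG, assuming SHAD shifts left for a non-negative count and arithmetic-right for a negative count, and that SHAR is the same operation with the direction reversed. Both semantics, the 64-bit width, and the count range of 0..63 are assumptions for this sketch.

```python
MASK = (1 << 64) - 1  # assumed 64-bit registers

def to_signed(x):
    """Reinterpret a 64-bit pattern as a signed integer."""
    x &= MASK
    return x - (1 << 64) if x >> 63 else x

def shad(value, count):
    """Assumed SHAD: count >= 0 shifts left, count < 0 shifts arithmetic right."""
    v = to_signed(value)
    if count >= 0:
        return (v << count) & MASK
    return (v >> -count) & MASK

def shar(value, count):
    """Assumed SHAR: SHAD with the shift direction reversed."""
    return shad(value, -count)

def shift_right_with_neg(x, y):
    """Old sequence for x >> y: NEG the count, then SHAD."""
    t = -y              # NEG
    return shad(x, t)   # SHAD

def shift_right_with_shar(x, y):
    """New sequence for x >> y: a single SHAR."""
    return shar(x, y)

print(hex(shift_right_with_neg(0x100, 4)))   # 0x10
print(hex(shift_right_with_shar(0x100, 4)))  # 0x10
```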

It does end up also used for encoding:
   "ptr-=size;"
As, say:
   NEG Rs, Rt
   LEA.Q (Rb, Rt), Rd

But, this is also relatively infrequent.
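
The NEG + LEA.Q sequence above can be modelled as follows, assuming LEA.Q computes base + index*8 (a quadword-scaled index) with 64-bit wraparound. The scale factor and the helper names are assumptions for this sketch.

```python
MASK = (1 << 64) - 1  # assumed 64-bit registers

def neg(x):
    """NEG: two's complement negation."""
    return (-x) & MASK

def lea_q(base, index):
    """Assumed LEA.Q: base + index * 8, wrapped to 64 bits."""
    return (base + index * 8) & MASK

def ptr_sub(ptr, size):
    """ptr -= size for a pointer to 8-byte elements."""
    rt = neg(size)          # NEG   Rs, Rt
    return lea_q(ptr, rt)   # LEA.Q (Rb, Rt), Rd

print(hex(ptr_sub(0x1000, 3)))  # 0xfe8, i.e. 0x1000 - 3*8
```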


>>
>>
>>> B) providing constants from the instruction stream;
>>> ..where constant can be an immediate a displacement, or both.
>>
>> Probably true.
>>
>> My ISA allows for Immediate or Displacement to be extended, but doesn't
>> currently allow (in the base ISA) any instructions that can encode both
>> an immediate and displacement.
> 
> 
>      ST     #3.14159265358927,[IP,R3<<3,#0x123456789abcd]
> 
> Here we have 5 instruction words storing 2 words anywhere in memory in
> one instruction and one decode cycle; we waste no registers with the
> constants. Looks to be 7 instructions in RISC-V including 2 LDDs...
> 


Assuming the displacement were limited to 33 bits:
   MOV    3.14159265358927, R5  //Assuming this was valid ASM...
   LEA.Q   (PC, R3), R4
   MOV.Q   R5, (R4, 0x12345678)


Could in theory be reduced to 2 instructions via RiDisp, but only if the 
displacement is within 11 bits.
   MOV    3.14159265358927, R5
   MOV.Q  R5, (PC, R3, 0x123)

The RiDisp extension is not generally enabled though; failing to cross a 
"makes enough of a difference to be worth the cost" metric in my testing.


Assuming one really needs the full 64-bit displacement:
   MOV    3.14159265358927, R5
   LEA.Q  (PC, R3), R4
   ADD    0x123456789ABCD, R4
   MOV.Q  R5, (R4)
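
The three cases above amount to picking a sequence based on how many bits the displacement needs. A minimal sketch of that selection, counting the constant-loading MOV as one instruction as in the listings; treating the 11-bit RiDisp field as signed is an assumption, and the function names are hypothetical.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's complement 'bits'-bit field."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def store_sequence_length(disp, ridisp_enabled=False):
    """Instruction count for storing a constant at (PC + R3*8 + disp)."""
    if ridisp_enabled and fits_signed(disp, 11):
        return 2   # MOV imm; MOV.Q with (PC, R3, disp)
    if fits_signed(disp, 33):
        return 3   # MOV imm; LEA.Q; MOV.Q with Disp33s
    return 4       # MOV imm; LEA.Q; ADD imm; MOV.Q

print(store_sequence_length(0x123, ridisp_enabled=True))  # 2
print(store_sequence_length(0x12345678))                  # 3
print(store_sequence_length(0x123456789ABCD))             # 4
```

Note that 0x123456789ABCD needs 50 bits as a signed value, so it does not fit a 33-bit field and takes the 4-instruction path.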



A full 64-bit displacement could be encoded in XG2 Mode as-is, but at 
present isn't valid in the ISA rules.

An Imm57s encoding could be allowed to carry a 48-bit displacement, 
assuming the extension for 48-bit load/store displacements was enabled. 
But, as-is, this doesn't make much difference and is bad for timing 
(there is not much point in 48-bit displacements when 0% of the 
displacements exceed 33 bits).



>>
>> At present:
>> Baseline allows Imm33s/Disp33s via a 64-bit encoding;
>> There is optional support for Imm57s, which in XG2 is now extended to
>> Imm64.
>>
>> There are special cases that allow immediate encodings for many
>> instructions that would otherwise lack an immediate encoding.
>>
>>
>>> C) exact floating point arithmetics that get the Inexact flag
>>> ..correctly unmolested.
>>
>> Dunno. I suspect the whole global FPU status/control register thing
>> should probably be rethought somehow.
>>
>> But, off-hand, don't know of a clearly better alternative.
>>
>>
>>> D) exception and interrupt control transfer should take no more
>>> ..than 1 cache line read followed by 4 cache line reads to the
>>> ..same page in DRAM/L3/L2 that are dependent on the first cache
>>> ..line read. Control transfer back to the suspended thread should
>>> ..be no longer than the control transfer to the exception handler.
>>
>> Likely expensive...
> 
> Treat "thread state" and its register file as a write back cache.


But, how to pull this off?...

If it requires building the whole register file out of FF's, this would 
be worse than building it out of Block-RAM, at least on FPGA.



As-is, I need to build the CRs out of FF's, and this is already rather 
expensive.

The only real other option is to have something that loads or stores the 
registers at 2 or so registers per clock-cycle, but this is what is 
already being done in software.


A hardware state-machine whose sole purpose is to bulk copy registers 
to/from a block of BRAM or similar upon interrupt entry/return is 
possible, but kinda lame.
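
As a back-of-the-envelope sketch of what the copy approach costs: at a fixed number of registers moved per clock, the save (or restore) time is just the register count divided by the copy width, rounded up. The register counts below are hypothetical example values, and the model ignores any memory latency or handler overhead.

```python
import math

def copy_cycles(num_regs, regs_per_cycle=2):
    """Cycles to save (or restore) num_regs registers at a fixed copy width."""
    return math.ceil(num_regs / regs_per_cycle)

def interrupt_round_trip(num_regs, regs_per_cycle=2):
    """Cycles for a full save on entry plus a full restore on return."""
    return 2 * copy_cycles(num_regs, regs_per_cycle)

for regs in (32, 64):  # hypothetical register-file sizes
    print(regs, copy_cycles(regs), interrupt_round_trip(regs))
```

The cycle count is the same whether the copy loop runs in software or in a state machine at the same width; a bank-switched register file avoids the copy entirely, at the cost of the extra storage.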


>>
>>
>> Granted, "glorified branch with some twiddling" is probably a little too
>> far in the other direction. Interrupt and syscall overhead is fairly
>> high when the handler needs to manually save and restore all the
>> registers each time.
>>
>>
>> A fast, but more expensive, option would be to have multiple copies of
>> the register file which is then bank-switched on an interrupt.
> 
> Under My 66000 a low end implementation can choose the write back cache
> version, while the GBOoO implementation can choose the bank switcher.
> In both cases, the same model is presented to executing SW.

OK.
========== REMAINDER OF ARTICLE TRUNCATED ==========