Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Cost of handling misaligned access
Date: Fri, 21 Feb 2025 03:20:53 -0600
Organization: A noiseless patient Spider
Lines: 84
Message-ID: <vp9gho$3b2j8$1@dont-email.me>
References: <5lNnP.1313925$2xE6.991023@fx18.iad> <vnosj6$t5o0$1@dont-email.me>
 <2025Feb3.075550@mips.complang.tuwien.ac.at> <volg1m$31ca1$1@dont-email.me>
 <voobnc$3l2dl$1@dont-email.me>
 <0fc4cc997441e25330ff5c8735247b54@www.novabbs.org>
 <vp0m3f$1cth6$1@dont-email.me>
 <74142fbdc017bc560d75541f3f3c5118@www.novabbs.org>
 <20250218150739.0000192a@yahoo.com>
 <0357b097bbbf6b87de9bc91dd16757e3@www.novabbs.org>
 <vp2sv2$1skve$1@dont-email.me>
 <a34ce3b43fab761d13b2432f9e255fab@www.novabbs.org>
 <vp518t$2bhib$1@dont-email.me>
 <a56e446b2e2df9f01eb558aa68279d35@www.novabbs.org>
 <vp5mnu$2fjhi$1@dont-email.me>
 <2dc33514bc664e667173b132601e6ce0@www.novabbs.org>
 <vp8kp3$334fu$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Feb 2025 10:20:57 +0100 (CET)
Injection-Info: dont-email.me; posting-host="98b20ea62ba459119821c932ae14e520";
	logging-data="3508840"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18NxEwLxObaAez/wVEycwodKLiN4awFCYU="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:K8IIs57i8PHrsbgLZiFcAkiqAiY=
Content-Language: en-US
In-Reply-To: <vp8kp3$334fu$1@dont-email.me>
Bytes: 4962

On 2/20/2025 7:26 PM, Robert Finch wrote:
>>> Admittedly part of why I have such mixed feelings on full
>>> compare-and-branch:
>>>    Pro: It can offer a performance advantage (in terms of per-clock);
>>>    Con: Branch is now beholden to the latency of a Subtract.
>>      Con: it can't compare to a constant
>>      Con: it can't compare floating point
>>
> compare to constant, and floating point compares are supported in Q+ so 
> the cons are gone
> 
> constant compare is supported with the constant postfix. Because bits 
> are available in the instruction there is also a precision field. One 
> can compare bytes and branch for instance.
> 

Similar, just with a prefix encoding in my case, and support is optional.

I have recently also added it to BJX2, where it is now also possible to 
encode:
   JCMPxx  Rm, Imm, Disp8s  (XG1/XG2)
   Bxx     Rm, Imm, Disp10s (XG3)

Note that both use the same encoding space; XG2 just kept the XG1 
encodings, and I can't change this detail without breaking binary 
compatibility (I was able to change it for XG3 though, as XG3 is sort of 
its own thing encoding-wise, so I was free to try to fix up some of the 
worse hair here).


And:
   MOV.x   Imm17s, (Rm, Disp10s)

I was initially going to have Rm get turned into the immediate, but 
while this would have worked for JCMPxx/Bxx, it would not have worked 
for MOV.x, and it makes sense to have decoding consistency when possible.


So:
  BTSTT	Rm, Imm17s, Disp10s	//if((Imm&Rm)==0)
  BTSTF	Rm, Imm17s, Disp10s	//if((Imm&Rm)!=0)
  BGT	Rm, Imm17s, Disp10s	//if(Imm> Rm) || if(Rm< Imm)
  BLE	Rm, Imm17s, Disp10s	//if(Imm<=Rm) || if(Rm>=Imm)
  BGTU	Rm, Imm17s, Disp10s	//if(Imm> Rm) || if(Rm< Imm), unsigned
  BLEU	Rm, Imm17s, Disp10s	//if(Imm<=Rm) || if(Rm>=Imm), unsigned
  BEQ	Rm, Imm17s, Disp10s	//if(Imm==Rm)
  BNE	Rm, Imm17s, Disp10s	//if(Imm!=Rm)

Never mind if the operand ordering is convoluted...
And, arguably, GT and LE might have been better off had they been called 
LT and GE in this case, but...
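
As a rough illustration of the conditions listed above, here is a minimal 
C sketch of how an emulator might evaluate them. The names (BranchOp, 
sext17, branch_taken) are made up for this example and are not from the 
BJX2 sources. It also assumes 64-bit registers, and that the unsigned 
forms compare the sign-extended immediate as an unsigned 64-bit value; 
neither is stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not from the BJX2 sources. */
typedef enum {
    BR_BTSTT, BR_BTSTF, BR_BGT, BR_BLE,
    BR_BGTU, BR_BLEU, BR_BEQ, BR_BNE
} BranchOp;

/* Sign-extend a 17-bit immediate field (Imm17s) to 64 bits. */
static int64_t sext17(uint32_t field)
{
    uint32_t v = field & 0x1FFFFu;           /* keep the low 17 bits   */
    return (int64_t)(v ^ 0x10000u) - 0x10000; /* flip and subtract sign */
}

/* Evaluate the branch condition following the comments in the list:
 * the immediate is on the left-hand side of each comparison.
 * Assumption: 64-bit registers; unsigned forms reinterpret both
 * operands as uint64_t. */
static bool branch_taken(BranchOp op, int64_t rm, int64_t imm)
{
    switch (op) {
    case BR_BTSTT: return (imm & rm) == 0;
    case BR_BTSTF: return (imm & rm) != 0;
    case BR_BGT:   return imm >  rm;   /* same as rm <  imm */
    case BR_BLE:   return imm <= rm;   /* same as rm >= imm */
    case BR_BGTU:  return (uint64_t)imm >  (uint64_t)rm;
    case BR_BLEU:  return (uint64_t)imm <= (uint64_t)rm;
    case BR_BEQ:   return imm == rm;
    case BR_BNE:   return imm != rm;
    }
    return false;
}
```

With Rm=3 and Imm=10, branch_taken(BR_BGT, 3, 10) is true, since the test 
is Imm > Rm; this is the "would have been better off called LT" case when 
read in operand order.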


It is basically encoded in a similar way to the 3RI Imm17s special case, 
just with the Rn register field rather than Ro/Rt (and operating on F1 
block instructions rather than F0 block). Not defined for the F2 block.

Not actually tested yet, still need to implement compiler support and 
similar.



BTW: I have now confirmed that my Verilog bugfix yesterday for XG3 has 
worked, so now Doom starts up and runs the demo loop. However, demos 
desync in a different way than usual, showing that there are still some 
issues.

Still no progress on the remaining RV+Jx bug(s); I poked at something to 
see if it changes anything. At the last cycle, it is crashing on an 
invalid memory access (causing a breakpoint in the TLB miss handler), 
which doesn't tell me much in the Verilog sim as to what caused said 
memory access.


> Q+ supports all kinds of compares including those generating bit vectors 
> and SETxx, ZSETxx type compares. That is maybe its drawback, supporting 
> too many things.
> 

Yeah, it is a tradeoff. Too many edge cases lead to cost and debugging 
effort. Sometimes it is better to try not to get too complicated.

My stuff is already getting annoyingly complicated and difficult to debug.