From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Fri, 26 Apr 2024 12:30:24 -0500
Organization: A noiseless patient Spider
Lines: 132
Message-ID: <v0gobh$3qnis$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
 <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
 <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
 <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
 <v017mg$3rcg9$1@dont-email.me>
 <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
 <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
 <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
 <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
 <ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
 <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
 <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 26 Apr 2024 19:30:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="0ea82f9b9e39b8d196087c6b4e96eff4";
	logging-data="4021852"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/XZkCzJxPqyggoI97xCpLz5e66PeriR0s="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:K1BmjVQSzbJ4kJnuAORkPcb0KpQ=
Content-Language: en-US
In-Reply-To: <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
Bytes: 6449

On 4/26/2024 8:25 AM, MitchAlsup1 wrote:
> BGB wrote:
> 
>> On 4/25/2024 4:01 PM, George Neuner wrote:
>>> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
>>> wrote:
>>>
> 
>> Agreed in the sense that negative displacements exist.
> 
>> However, one can note that positive displacements tend to be
>> significantly more common than negative ones. Whether or not it makes
>> sense to have a negative displacement depends mostly on whether more
>> than half of the missed displacements are negative.
> 
>> From what I can tell, this seems to be:
>>    ~ 10 bits, scaled.
>>    ~ 13 bits, unscaled.
> 
> 
>> So, say, an ISA like RISC-V might have had a slightly better hit rate
>> with unsigned displacements than with signed displacements, but if one
>> added 1 or 2 bits, signed would have still been a clear winner (or,
>> with 1 or 2 fewer bits, unsigned a clear winner).
> 
>> I ended up going with signed displacements for XG2, but it was pretty 
>> close to break-even in this case (when expanding from the 9-bit 
>> unsigned displacements in Baseline).
> 
> 
>> Granted, all signed or all-unsigned might be better from an ISA design 
>> consistency POV.
> 
> 
>> If one had 16-bit displacements, then unscaled displacements would 
>> make sense; otherwise scaled displacements seem like a win (misaligned 
>> displacements being much less common than aligned displacements).
> 
> What we need is ~16-bit displacements where 82½%-91¼% are positive.
> 

I was seeing stats more like 99.8% positive, 0.2% negative.


The bias was strong enough that, below 10 bits, zero extension always
wins when one looks at the remaining (missed) cases. Only on reaching
10 bits do about 50% of the missed displacements turn out to be
negative (the rest being positive displacements of 512 or more).
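
As a rough illustration of this trade-off, here is a minimal sketch in
Python. The helper names (fits, hit_rate, better_extension) are
hypothetical, the displacements are assumed to already be in scaled
units, and the sample list at the end is made up rather than measured.

```python
def fits(disp, bits, signed):
    """Return True if disp (in scaled units) is encodable in the field."""
    if signed:
        return -(1 << (bits - 1)) <= disp < (1 << (bits - 1))
    return 0 <= disp < (1 << bits)


def hit_rate(disps, bits, signed):
    """Fraction of displacements encodable in a bits-wide field."""
    return sum(fits(d, bits, signed) for d in disps) / len(disps)


def better_extension(disps, old_bits):
    """Decide how to spend one extra bit on top of an unsigned field.

    Only displacements missed by the old unsigned field matter: signed
    wins if more of the newly reachable ones are negative than are
    large positive.
    """
    new_bits = old_bits + 1
    missed = [d for d in disps if not fits(d, old_bits, False)]
    gained_signed = sum(fits(d, new_bits, True) for d in missed)
    gained_unsigned = sum(fits(d, new_bits, False) for d in missed)
    if gained_signed > gained_unsigned:
        return "signed"
    if gained_unsigned > gained_signed:
        return "unsigned"
    return "tie"


# Made-up sample, for illustration only (not measured data).
sample = [0, 8, 16, 500, 600, -4, -8, 2000]
print(better_extension(sample, 9))
```

With real data, one would feed in the displacements collected from
compiler output and compare the result for each candidate field width.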

So, one can make a choice: -512..511, or 0..1023, ...

In XG2, I ended up with -512..511, for better or worse (for some
programs this choice is optimal, for others it is not).

When scaled for QWORD, this is +/- 4K.


If one had a 16-bit displacement, it would be a choice between +/- 32K, 
or (scaled) +/- 256K, or 0..512K, ...
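
The ranges above are just powers of two times the scale. A small
Python sketch of the arithmetic (disp_range is a hypothetical helper,
and a QWORD scale of 8 bytes is assumed):

```python
QWORD = 8  # assumed scale factor in bytes


def disp_range(bits, signed, scale=1):
    """Return (lowest, highest) byte offset reachable by the field."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return lo * scale, hi * scale


print(disp_range(10, True, QWORD))   # (-4096, 4088)     ~ +/- 4K
print(disp_range(16, True))          # (-32768, 32767)   ~ +/- 32K
print(disp_range(16, True, QWORD))   # (-262144, 262136) ~ +/- 256K
print(disp_range(16, False, QWORD))  # (0, 524280)       ~ 0..512K
```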

For the special-purpose "LEA.Q (GBR, Disp16), Rn" instruction, I ended
up going unsigned. For a lot of the programs I am dealing with, this is
big enough to cover ".data" and part of ".bss". It is generally used
for arrays, which need the larger displacements (the compiler lays
things out so that most of the commonly used variables are closer to
the start of ".data", and so can use smaller displacements).

This does implicitly require that all non-trivial global arrays have at
least 64-bit alignment.
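
The alignment requirement falls out of the scaling: if the Disp16
field counts QWORDs, only offsets that are a multiple of 8 bytes from
GBR can be encoded. A sketch in Python, assuming an unsigned 16-bit
field scaled by 8 (encode_lea_q_disp is a hypothetical name, not part
of any real toolchain):

```python
QWORD = 8                    # assumed scale of the Disp16 field, in bytes
DISP16_MAX = (1 << 16) - 1   # largest unsigned 16-bit field value


def encode_lea_q_disp(offset):
    """Return the Disp16 field for a GBR-relative byte offset.

    Returns None if the offset is negative, not QWORD-aligned, or out
    of range, in which case some other instruction sequence is needed.
    """
    if offset < 0 or offset % QWORD != 0:
        return None
    field = offset // QWORD
    return field if field <= DISP16_MAX else None


print(encode_lea_q_disp(0x1000))  # aligned and in range
print(encode_lea_q_disp(0x1004))  # only 32-bit aligned: not encodable
```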


Note that seemingly both BGBCC and GCC have variations on this 
optimization, though in GCC's case it requires special command-line 
options ("-fdata-sections", etc).



> How does one use a frame pointer without negative displacements ??
> 
> [FP+disp] accesses callee save registers
> [FP-disp] accesses local stack variables and descriptors
> 
> [SP+disp] accesses argument and result values
> 

In my case, all of these are [SP+Disp]; granted, there is no frame
pointer, and stack frames are fixed-size in BGBCC.

This is typically with a frame layout like:
   Argument/Spill space
   -- Frame Top
   Register Save
   (Stack Canary)
   Local arrays/structs
   Local variables
   Argument/Spill Space
   -- Frame Bottom
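
To show how such a layout gives only non-negative SP-relative
displacements, here is a sketch in Python. It assumes SP points at the
frame bottom; frame_offsets and the region sizes in the example are
made up for illustration and are not taken from BGBCC.

```python
def frame_offsets(arg_spill, local_vars, local_arrays, canary, reg_save):
    """Return ({region: SP-relative offset}, frame_size).

    Regions are stacked upward from the frame bottom (SP+0), following
    the layout listed above read from bottom to top.
    """
    offsets = {}
    pos = 0
    for name, size in (
        ("arg_spill", arg_spill),        # frame bottom, at SP+0
        ("local_vars", local_vars),
        ("local_arrays", local_arrays),
        ("canary", canary),
        ("reg_save", reg_save),          # just below the frame top
    ):
        offsets[name] = pos
        pos += size
    return offsets, pos


# Made-up sizes in bytes, for illustration only.
offsets, frame_size = frame_offsets(64, 96, 256, 8, 80)
print(offsets, frame_size)
```

Under these assumptions every region is reached as [SP+Disp] with
Disp >= 0, and the caller's own argument/spill space would begin at
SP+frame_size.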

Contrast with traditional x86 layout, which puts saved registers and 
local variables near the frame-pointer, which points near the top of the 
stack frame.

Though, in a majority of functions, the MOV.L and MOV.Q instructions
have a big enough displacement to cover the whole frame. The exceptions
are functions which have a lot of local arrays or similar. Overly large
local arrays are auto-folded to using heap allocation, but at present
this logic is based on the size of individual arrays rather than on the
total combined size of the stack frame.


Adding a frame pointer (with negative displacements) wouldn't make a big
difference in XG2 Mode, but would be more of an issue for (pure)
Baseline, where the options are either to load the displacement into a
register or to use a jumbo prefix.


>> But, admittedly, main reason I went with unscaled for GBR-rel and 
>> PC-rel Load/Store, was because using scaled displacements here would 
>> have required more relocation types (nevermind if the hit rate for 
>> unscaled 9-bit displacements is "pretty weak").
> 
>> Though, I did end up later adding specialized Scaled GBR-Rel
>> Load/Store ops (to improve code density), so it might have been better
>> in retrospect had I instead just gone with the "keep it scaled and add
>> more reloc types to compensate" option.
> 
> 
>> ....
> 
> 
>>> YMMV.