Path: ...!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Misc: Ongoing status...
Date: Fri, 31 Jan 2025 19:30:44 +0000
Organization: Rocksolid Light
Message-ID: <c7cd7a9e4a8c14dab63f0b9394af4677@www.novabbs.org>
References: <vnglop$33lk0$1@dont-email.me>
 <cda6055929f89df81fb056509038afed@www.novabbs.org>
 <vnhrrj$3d7i0$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; logging-data="2068907";
 mail-complaints-to="usenet@i2pn2.org";
 posting-account="o5SwNDfMfYu6Mv4wwLiW6e/jbA93UAdzFodw5PEa6eU";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$c6dnmnAkKZNFM0Z8a2n9M.yfjiNcr9Rtd7qCqoNc7ROP0i6C.KXy2
X-Rslight-Posting-User: cb29269328a20fe5719ed6a1c397e21f651bda71
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 10532
Lines: 268

On Fri, 31 Jan 2025 6:50:24 +0000, BGB wrote:

> On 1/30/2025 5:48 PM, MitchAlsup1 wrote:
>> On Thu, 30 Jan 2025 20:00:22 +0000, BGB wrote:
>>
>>> So, recent features added to my core ISA: None.
>>> Reason: Not a whole lot that brings much benefit.
>>>
>>>
>>> Have ended up recently more working on the RISC-V side of things,
>>> because there are still gains to be made there (stuff is still more
>>> buggy, less complete, and slower than XG2).
>>>
>>>
>>> On the RISC-V side, did experiment with Branch-compare-Immediate
>>> instructions, but unclear if I will carry them over:
>>>   Adds a non-zero cost to the decoder;
>>>     Cost primarily associated with dealing with a second immed.
>>>   Effect on performance is very small (< 1%).
>>
>> I find this a little odd--My 66000 has a lot of CMP #immed-BC
>> a) so I am sensitive as this is break even wrt RISC-V
>> b) But perhaps the small gain is due to something about
>> .. how the pair runs down the pipe as opposed to how the
>> .. single runs down the pipe.
>>
>
> Issue I had seen is mostly, "How often does it come up?":
> Seemingly, around 100-150 or so instructions between each occurrence on
> average (excluding cases where the constant is zero; comparing with zero
> being more common).
>
> What does it save:
> Typically 1 cycle that might otherwise be spent loading the value into a
> register (if this instruction doesn't end up getting run in parallel
> with another prior instruction).
>
>
> In the BGBCC output, the main case it comes up is primarily in "for()"
> loops (followed by the occasional if-statement), so one might expect
> this would increase its probability of having more of an effect.
>
> But, seemingly, not enough tight "for()" loops and similar in use for it
> to have a more significant effect.
>
> So, in the great "if()" ranking:
>   if(x COND 0) ...    //first place
>   if(x COND y) ...    //second place
>   if(x COND imm) ...  //third place
>
> However, a construct like:
>   for(i=0; i<10; i++)
>     { ... }
> Will emit two of them, so they are not *that* rare either.

Since the compiler can see that the loop is always executed, the
first/top checking CMP-BC should not be emitted, leaving only 1.

> Still, a lot rarer in use than:
>   val=ptr[idx];
> Though...
>
> Have noted though that simple constant for-loops are a minority, far
> more often they are something like:
>   for(i=0; i<n; i++)
>     { ... }
> Which doesn't use any.
>
> Or:
>   while(i--)
>     { ... }
> Which uses a compare with zero (in RV, can be encoded with the zero

I should note:: I have a whole class of conditional branches that
include comparison to 0, {(signed), (unsigned), (float), (double)},
all 6 arithmetic comparands, and auxiliary comparisons for NaNs
and Infinities.

> register; in BJX2 it has its own dedicated instruction due to the lack
> of zero register; some of these were formally dropped in XG3 which does
> have access to a zero register, and encoding an op using a ZR instead is
> considered as preferable).
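
The point made above about the constant-bound for() loop can be sketched in C with explicit gotos. This is only an illustration, not output from BGBCC or the My 66000 compiler: the function names and the `compares` counter are hypothetical, and each `if (...) goto` stands in for one compare-with-immediate plus conditional branch.

```c
/* Naive lowering of for(i=0; i<10; i++) sum += i;
   Two static compare-and-branch sites: a guard on entry and the
   loop-closing test at the bottom. `compares` counts how many
   compares actually execute (illustrative instrumentation only). */
int sum_two_checks(int *compares)
{
    int i = 0, sum = 0;
    (*compares)++;
    if (!(i < 10)) goto done;   /* top CMP #imm + BC (entry guard) */
body:
    sum += i;
    i++;
    (*compares)++;
    if (i < 10) goto body;      /* bottom CMP #imm + BC */
done:
    return sum;
}

/* Same loop once the compiler has proven 0 < 10 at compile time:
   the entry guard is dropped and one static site remains. */
int sum_one_check(int *compares)
{
    int i = 0, sum = 0;
body:
    sum += i;
    i++;
    (*compares)++;
    if (i < 10) goto body;      /* the only CMP #imm + BC */
    return sum;
}
```

Both functions compute the same sum; the second simply never executes the entry test. With a bound such as `i<n` the guard cannot be removed unless the compiler can prove `n > 0`.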

I choose not to waste a register to hold zero. Once you have universal
constants it is unnecessary.

------------------

> Huawei had a "less bad" encoding, but they burnt basically the entire
> User-1 block on it, so that isn't going to fly.
>
> Generally, around 95% of the function-local branches can hit in a Disp9,
> vs 98% for Disp12. So, better to drop to Disp9.

DISP16 reaches farther...

------------------

>>
>> I suggest a psychiatrist.
>>
>
> People are pointing to charts gathered by mining binaries and being
> like: "X10 and X11 are the two most commonly used registers".
>
> But, this is like pointing at x86 and being like:
> "EAX and ECX are the top two registers, who needs such obscure registers
> as ESI and EDI"?...
>

Quit listening to them, use your own judgement.

>
>
>>> When I defined my own version of BccI (with a 64-bit encoding), how many
>>> new instructions did I need to define in the 32-bit base ISA: Zero.
>>
>> How many 64-bit encodings did My 66000 need:: zero.
>> {Hint the words following the instruction specifier have no internal
>> format}
>>
>
> I consider the combination of Jumbo-Prefix and Suffix instruction to be
> a 64-bit instruction.

I consider a multi-word instruction to have an instruction-specifier
as the first 32-bits, and everything that follows is an attached
constant.

The only "prefixes" I have are CARRY and PREDication.

-----------------------

> However, have noted that XG3 does appear to be faster than the original
> Baseline/XG1 ISA.
>
>
> Where, to recap:
> XG1 (Baseline):
>   16/32/64/96 bit encodings;
>     16-bit ops can access R0..R15 with 4b registers;
>     Only 2R or 2RI forms for 16-bit ops;
>     16-bit ISA still fairly similar to SuperH.
>   5-bit register fields by default;
>     6-bit available for an ISA subset.
>   Disp9u and Imm9u/n for most immediate form instructions;
>   32 or 64 GPRs, Default 32.
>   8 argument registers.
> XG2:
>   32/64/96 bit encodings;
>     All 16-bit encodings dropped.
>   6-bit register fields (via a wonky encoding);
>     Same basic instruction format as XG1,
>       But, 3 new bits stored inverted in the HOB of instr words;
>   Mostly Disp10s and Imm10u/n;
>   64 GPRs native;
>   16 argument registers.
> XG3:
>   Basically repacked XG2;
>     Can exist in same encoding space as RISC-V ops;
>     Aims for ease of compatibility with RV64G.
>   Encoding was made "aesthetically nicer"
>     All the register bits are contiguous and non-inverted;
>     Most immediate fields are also once again contiguous;
>     ...
>   Partly reworks branch instructions;
>     Scale=4, usually relative to BasePC (like RV);
>   Uses RV's register numbering space (and ABI);
>     Eg: SP at R2 vs R15, ...
>     (Partly carried over from XG2RV, which is now defunct).
>   64 GPRs, but fudged into RV ABI rules;
>     Can't rebalance ABI without breaking RV compatibility;
>       Breaking RV compatibility defeating its point for existing.
>   8 argument registers (because of RV ABI).
>     Could in theory expand to 16, but would make issues.
>   Despite being based on XG2,
>     BGBCC treats XG3 as an extension to RISC-V.
>
>
> Then, RV:
>   16/32; 48/64/96 (Ext)
>   Has 16-bit ops:

========== REMAINDER OF ARTICLE TRUNCATED ==========
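
The Disp9 / Disp12 / DISP16 trade-off discussed earlier in the post is about how far a branch can reach. A minimal sketch of that arithmetic follows, assuming a signed displacement field that counts units of `scale` bytes. The helper `disp_fits` and its parameters are hypothetical, and the scale of 4 used in the note below is taken from the XG3 recap rather than from any of the encodings being compared (standard RISC-V branches scale by 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can the byte offset `off` be encoded in a signed
   displacement field `bits` wide that counts units of `scale` bytes?
   Assumes two's-complement range and 2 <= bits <= 32. */
bool disp_fits(int64_t off, unsigned bits, int64_t scale)
{
    int64_t lo = -((int64_t)1 << (bits - 1));      /* most negative unit count */
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;   /* most positive unit count */

    if (off % scale != 0)
        return false;                              /* target is misaligned */

    int64_t units = off / scale;
    return units >= lo && units <= hi;
}
```

Under these assumptions, with a scale of 4 a 9-bit field covers -1024..+1020 bytes and a 12-bit field covers -8192..+8188 bytes; the 95% and 98% hit rates quoted in the post are BGB's measurements and are not derived from this sketch.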