Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: kegs@provalid.com (Kent Dickey)
Newsgroups: comp.arch
Subject: Re: Decrement And Branch
Date: Mon, 9 Sep 2024 03:31:00 -0000 (UTC)
Organization: provalid.com
Lines: 100
Message-ID: <vblq5k$2991r$1@dont-email.me>
References: <v9f7b9$3qj3c$1@dont-email.me> <2024Aug14.111001@mips.complang.tuwien.ac.at>
	<c6653232ff022a7f991a061bfbf46ec3@www.novabbs.org> <2024Aug15.123928@mips.complang.tuwien.ac.at>
Injection-Date: Mon, 09 Sep 2024 05:31:00 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="4031c0a48a9bc02472492b885dee20d2";
	logging-data="2401339"; mail-complaints-to="abuse@eternal-september.org";
	posting-account="U2FsdGVkX19VV6KM52BgMEKJMOcmyY+4"
Cancel-Lock: sha1:eMnwa88oDGBedaRWJvTvVCbS4oU=
Originator: kegs@provalid.com (Kent Dickey)
X-Newsreader: trn 4.0-test76 (Apr 2, 2001)
Bytes: 5283

In article <2024Aug15.123928@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
>mitchalsup@aol.com (MitchAlsup1) writes:
>>On Wed, 14 Aug 2024 9:10:01 +0000, Anton Ertl wrote:
>>
>>> Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>>>>Like I said, I wondered why this sort of thing wasn't more common ...

[snip]

>>My 66000 finds use cases all the time, and I also have Branch on bit
>>instructions and have my CMP instructions build bit-vectors of outcomes.
>
>If an architecture has the 88000-style treatment of comparison results
>(fill a GPR with conditions, one bit per condition), instructions like
>TBNZ and TBZ certainly are useful, but ARM A64 uses a condition code
>register with NZCV flags for dealing with conditions, so what is TBNZ
>and TBZ used for on this architecture?  Looking at a binary I have at
>hand, I see a lot of checking bit #63 and some checking of #31, #15,
>#7, i.e., checking for whether a 64-bit, ... 8-bit number is negative.
>There are also a number of uses coming from libgcc, e.g.,
>
>   6f0a8:       37e001c3        tbnz    w3, #28, 6f0e0 <__aarch64_sync_cache_range+0x50>
>   6f0e8:       37e801e2        tbnz    w2, #29, 6f124 <__aarch64_sync_cache_range+0x94>
>   6f6dc:       b7980b84        tbnz    x4, #51, 6f84c <__addtf3+0x71c>
>   6fb28:       b79000a3        tbnz    x3, #50, 6fb3c <__addtf3+0xa0c>
>   6fc30:       b79000a3        tbnz    x3, #50, 6fc44 <__addtf3+0xb14>
>   70248:       b7980d02        tbnz    x2, #51, 703e8 <__multf3+0x728>
>   7036c:       b79809a2        tbnz    x2, #51, 704a0 <__multf3+0x7e0>
>   70430:       b77801a2        tbnz    x2, #47, 70464 <__multf3+0x7a4>
>   7048c:       b79ffae2        tbnz    x2, #51, 703e8 <__multf3+0x728>
>   70498:       b79ffa82        tbnz    x2, #51, 703e8 <__multf3+0x728>
>
>The tf3 stuff probably is the implementation of long doubles.  In any
>case, in this binary with 26473 instructions, there are 30 occurrences
>of tbnz and 41 of tbz, for a total of 71 (0.3% of static instruction
>count).
>
>Apparently the usefulness of decrement-and-branch is even lower.
>
>Certainly in my code most loops count upwards.
>
>- anton
>-- 
>'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
>  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

PA-RISC had "ADDIB,cond,n imm,reg,target".  It adds a 5-bit signed
immediate to reg, and then branches on comparing the result to 0
(effectively), allowing branching on <, <=, =, >, >=, overflow, carry,
etc.  There was also a non-immediate version, ADDB.  The branch target
range was +/-8KB.  Really simple loops could be done with the loop
operation in the delay slot of ADDIB.

The HP C/C++ Compiler pretty much converted all for() loops to count
down to 0, when it wasn't too awkward.  So:

	for(i = 0; i < 100; i++) {
		array[i] = 0;
	}

would be effectively transformed to:

	ptr = &array[0];
	for(i = 99; i >= 0; i--) {
		*ptr++ = 0;
	}

Which becomes (PA-RISC lists the target register last, has delay
slots, and has nullification, where a branch nullifies the next
instruction if the branch is not taken):

	MOV	array,r8
	LDI	99,r9
LOOP:
	ADDIB,>=,n -1,r9,LOOP	; r9=r9-1.  If r9 >= 0, jump to LOOP
	STD,ma	r0,8(r8)	; (r8)=r0; r8=r8+8

So it could use ADDIB for many "for" loops.  Because of the way
nullification works, the loop behaves properly even if it should never
execute: if r9 starts at 0, no STD will be done.  There was no reason
to change the source code; the compiler would do the transform for you.

PA-RISC also had CMPIB, which just does the compare and branch.

ADDIB is a very simple instruction which costs very little to add, and
saves 2 instructions for many loops (ADDI,CMP_0,Bcc -> ADDIB).  I think
it is a mistake for ARM to not have it.  I see a lot of "ADD, CMP, Bcc"
in ARM assembly code.  To avoid inverting the counter, an "ADD1CMPBcc"
instruction would add 1 to a counter, compare the counter to another
register, and branch on the condition.

As for ARM TBNZ and TBZ, I see them used all the time in my code, where
I often use single-bit flags in control variables:

	if(flags & FLAG_SPECIAL1) {		// FLAG_SPECIAL1 = 0x40
		// Do "SPECIAL1" stuff
	}

In one program I've written on ARM, 2.3% of all instructions are TBZ or
TBNZ.

Kent
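
The ADDIB,>=,n loop described above can be modelled in C as a rough
sketch. It assumes only the semantics stated in the post: add -1 to the
counter, branch if the result is >= 0, and nullify the delay-slot store
when the branch is not taken. The function name addib_loop_model and
its parameters are hypothetical, chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical C model of the ADDIB,>=,n / STD,ma loop.
 * r9 stands in for the counter register, p for the pointer in r8.
 * Returns the number of stores performed.
 */
static size_t addib_loop_model(int64_t *p, long r9)
{
	size_t stores = 0;

	assert(p != NULL);
	for (;;) {
		r9 += -1;		/* ADDIB: add the immediate -1 to r9 */
		if (!(r9 >= 0))		/* condition false: branch not taken, */
			break;		/* so ",n" nullifies the delay-slot STD */
		*p++ = 0;		/* STD,ma r0,8(r8): store 0, post-increment */
		stores++;
	}
	return stores;
}
```

Under this model an initial counter of n gives n stores, and an initial
counter of 0 gives none, which matches the remark that no STD is done
when r9 starts at 0.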