From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Decrement And Branch
Date: Tue, 13 Aug 2024 17:15:00 +0000
Organization: Rocksolid Light
Message-ID: <881c96e082af6e7ad3dfeaf292768cf4@www.novabbs.org>
References: <v9f7b9$3qj3c$1@dont-email.me>

On Tue, 13 Aug 2024 9:00:25 +0000, Lawrence D'Oliveiro wrote:

> I thought loop-control instructions had fallen out of favour in the RISC
> era. But reading some IBM POWER (and PowerPC) docs has reminded me that
> that family does have such instructions. I don’t think any other RISC
> architecture does, though. POWER even has a special register (CTR, the
> “counter” register) for use with loop instructions, though it could also
> (along with LR, the “link” register) be used for indirect branches.
> (Obviously you need at least two registers with this property.)
>
> The original designers of POWER clearly thought there was a point to
> having such instructions; do you agree?

Yes, there is a point! One can calculate ADD-CMP-BC in 1 gate delay
more than ADD alone. Thus, the loop instruction can do the work of
3 instructions for you.

My 66000 has 3 looping instructions:
a) for( ; i<max; i++),
b) for( ; x != y; i++),
c) for( ; i<max && x ; i++)

With these, almost every subroutine in /lib/str* and /lib/mem* vectorizes.
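
[Editor's illustration, not part of the original post.] The three loop shapes listed above map onto familiar string and memory routines. The sketch below is plain C, not My 66000 code; the function names `copy_bytes`, `string_length`, and `bounded_length` are hypothetical and chosen only to show one routine per loop form.

```c
#include <stddef.h>

/* (a) for( ; i<max; i++): trip count known on entry, as in a memcpy-style copy. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t max) {
    for (size_t i = 0; i < max; i++)
        dst[i] = src[i];
}

/* (b) for( ; x != y; i++): exit depends only on the data, as in strlen. */
static size_t string_length(const char *s) {
    size_t i = 0;
    for (; s[i] != '\0'; i++)
        ;  /* empty body: the loop condition does all the work */
    return i;
}

/* (c) for( ; i<max && x; i++): bounded count plus a data test, as in strnlen. */
static size_t bounded_length(const char *s, size_t max) {
    size_t i = 0;
    for (; i < max && s[i] != '\0'; i++)
        ;  /* stops at the bound or at the terminator, whichever comes first */
    return i;
}
```

In each case the loop-closing work is an add, a compare, and a conditional branch, which is the ADD-CMP-BC sequence the post says a single loop instruction can absorb.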

> The most common form of these will decrement the counter register, and

I made mine go in either direction by allowing a constant as the loop
increment.

> only branch back to the top of the loop if the counter has not reached
> zero; if it is now zero, then fall through. However, the good old VAX (in
> its usual kitchen-sink fashion) had a whole set of variations, including
> one that decremented down to -1 instead of zero. And the Motorola 68000
> family only had the decrement down to -1 version.
>
> This seemed to mystify quite a few assembly-language programmers. I wonder
> why it wasn’t a more popular idea ...

VVM is based entirely on LOOP[123], and the architectural semantics allow
this to provide for vectorization and SIMDization. Thus, My 66000 gets
2,000 instructions at the price of 2 actual instructions (4 if you are
picky).

A byte-copy loop can move 16 bytes per clock, effectively 40 instructions
per clock (5/c if you could write it in 64-bit form, but you don't have
to write it in 64-bit form to get 64-bit performance).

The above is on an in-order (IO) 1-wide machine. Multiply by 4 for the
6-wide OoO machine.

The logic is simple: these are frequent enough to warrant "doing a bit
more than 'nothing'" but not so much that you crater the whole
architecture.
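
[Editor's illustration, not part of the original post.] The quoted text contrasts loop instructions that count down to zero with those that count down to -1. The C sketch below models only the iteration counts of the two conventions; it does not model register widths or any real instruction encoding, and the function names are hypothetical.

```c
/* Decrement, then branch back while the counter has not reached zero.
 * Assumes count >= 1; the body runs exactly `count` times. */
static unsigned iterations_down_to_zero(unsigned count) {
    unsigned iterations = 0;
    do {
        iterations++;              /* stands in for the loop body */
    } while (--count != 0);
    return iterations;
}

/* Decrement, then branch back while the counter has not reached -1.
 * Assumes count >= 0; the body runs `count + 1` times. */
static unsigned iterations_down_to_minus_one(int count) {
    unsigned iterations = 0;
    do {
        iterations++;              /* stands in for the loop body */
    } while (--count != -1);
    return iterations;
}
```

With a starting count of 4, the first form runs the body 4 times and the second runs it 5 times, so code written for the down-to--1 convention has to load N-1 to get N iterations. That off-by-one is a plausible source of the confusion the quoted text mentions.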