Path: ...!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Decrement And Branch
Date: Tue, 13 Aug 2024 17:15:00 +0000
Organization: Rocksolid Light
Message-ID: <881c96e082af6e7ad3dfeaf292768cf4@www.novabbs.org>
References: <v9f7b9$3qj3c$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="2459833"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$k8NgbUMxXw4jLlCQdrIeMOgNk4/2Qise6EEJ7UCRMU5qjET6HGcyS
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Bytes: 3259
Lines: 53

On Tue, 13 Aug 2024 9:00:25 +0000, Lawrence D'Oliveiro wrote:

> I thought loop-control instructions had fallen out of favour in the RISC
> era. But reading some IBM POWER (and PowerPC) docs has reminded me that
> that family does have such instructions. I don’t think any other RISC
> architecture does, though. POWER even has a special register (CTR, the
> “counter” register) for use with loop instructions, though it could also
> (along with LR, the “link” register) be used for indirect branches.
> (Obviously you need at least two registers with this property.)
>
> The original designers of POWER clearly thought there was a point to
> having such instructions; do you agree?

Yes, there is a point!

One can calculate ADD-CMP-BC in 1 gate delay longer than ADD alone. Thus,
a single loop instruction can perform the work of 3 instructions for you.
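
As a rough illustration of what is being fused, here is a minimal C sketch
of one loop step that does the add, the compare, and the branch decision
together. The names loop_step and count_iterations are hypothetical and
only model the semantics; they say nothing about the actual My 66000
encoding or hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one fused loop step (not a real instruction).
 * ADD: advance the induction variable by inc.
 * CMP + BC: report whether the branch back to the loop top is taken. */
static bool loop_step(int64_t *i, int64_t inc, int64_t max)
{
    *i += inc;
    return *i < max;
}

/* Counts how many times a bottom-tested loop body runs when the loop
 * is closed by loop_step. The body always runs at least once. */
static int64_t count_iterations(int64_t start, int64_t inc, int64_t max)
{
    int64_t i = start;
    int64_t n = 0;
    do {
        n++;                        /* stands in for the loop body */
    } while (loop_step(&i, inc, max));
    return n;
}
```

Written out as separate instructions, the loop tail would be an add, a
compare, and a conditional branch; the sketch folds all three into the one
call at the bottom of the loop.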

My 66000 has 3 looping instructions:
a) for( ; i<max; i++),
b) for( ; x != y; i++),
c) for( ; i<max && x ; i++)
With these, almost every subroutine in /lib/str* and /lib/mem* vectorizes.
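
To make the three forms concrete, here is a sketch of the kind of str*/mem*
routine each one closes. The function names are hypothetical stand-ins
(roughly memcpy, strlen, and strnlen), and pairing each routine with a
particular form is an assumption made for illustration.

```c
#include <stddef.h>

/* Form (a): for( ; i<max; i++), a purely counted loop (memcpy-like). */
static void *copy_bytes(void *dst, const void *src, size_t max)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < max; i++)
        d[i] = s[i];
    return dst;
}

/* Form (b): for( ; x != y; i++), a data-terminated loop (strlen-like). */
static size_t string_length(const char *s)
{
    size_t i = 0;
    for (; s[i] != '\0'; i++)
        ;
    return i;
}

/* Form (c): for( ; i<max && x; i++), counted and data-terminated
 * (strnlen-like). */
static size_t bounded_length(const char *s, size_t max)
{
    size_t i = 0;
    for (; i < max && s[i]; i++)
        ;
    return i;
}
```

In each case the whole loop-closing test sits in the for header, which is
the part a single looping instruction would take over.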
>
> The most common form of these will decrement the counter register, and

I made mine go in either direction by allowing a constant as the loop
increment.

> only branch back to the top of the loop if the counter has not reached
> zero; if it is now zero, then fall through. However, the good old VAX
> (in its usual kitchen-sink fashion) had a whole set of variations,
> including one that decremented down to -1 instead of zero. And the
> Motorola 68000 family only had the decrement down to -1 version.
>
> This seemed to mystify quite a few assembly-language programmers. I
> wonder why it wasn’t a more popular idea ...
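
The two decrement conventions in the quoted text can be compared with a
small C model. This is only a sketch: the function names are hypothetical,
the first function models a decrement-and-branch-if-nonzero loop, and the
second models a 16-bit counter that falls through on reaching -1.

```c
#include <stdint.h>

/* Decrement the counter, branch back while it is non-zero.
 * Note: a starting count of 0 wraps around rather than running once. */
static unsigned iterations_to_zero(uint32_t count)
{
    unsigned n = 0;
    do {
        n++;                    /* stands in for the loop body */
    } while (--count != 0);
    return n;
}

/* Decrement a 16-bit counter, branch back until it reaches -1. */
static unsigned iterations_to_minus_one(int16_t count)
{
    unsigned n = 0;
    do {
        n++;                    /* stands in for the loop body */
        count--;
    } while (count != -1);
    return n;
}
```

With the same starting value, the down-to--1 form runs the body one more
time, so the counter has to be loaded with N-1 to get N iterations. That
off-by-one is a plausible source of the confusion mentioned above.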

VVM is based entirely on LOOP[123], and the architectural semantics
allow this to provide for vectorization and SIMDization. Thus, My 66000
gets 2,000 instructions at the price of 2 actual instructions (4 if you
are picky).

A byte-copy loop can move 16 bytes per clock--effectively 40 instructions
per clock (5/c if you could write it in 64-bit form--but you don't have
to write it in 64-bit form to get 64-bit performance). The above is on
an in-order (IO) 1-wide machine. Multiply by 4 for the 6-wide OoO machine.

The logic is simple--these are frequent enough to warrant "doing a bit
more than 'nothing'" but not so much you crater the whole architecture.