<2024Aug14.111001@mips.complang.tuwien.ac.at>

Path: ...!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Decrement And Branch
Date: Wed, 14 Aug 2024 09:10:01 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 35
Message-ID: <2024Aug14.111001@mips.complang.tuwien.ac.at>
References: <v9f7b9$3qj3c$1@dont-email.me> <v9gl1b$30as$7@dont-email.me>
Injection-Date: Wed, 14 Aug 2024 11:44:40 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="83837fc91acd085cb9f62cf33fd5a0a3";
	logging-data="439675"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18lD64eq2Zm7VtWXsWo09Yj"
Cancel-Lock: sha1:vQ12KCCMQRNajOfFxQSG2J0x2gI=
X-newsreader: xrn 10.11
Bytes: 2521

Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>Like I said, I wondered why this sort of thing wasn't more common ...

For the early RISCs, the pipeline was designed for early branch
execution.  Performing an ALU op before the branch did not fit that
kind of pipeline.

However, having a branch-and-subtract would have been possible.  But
how would that have interacted with the branch delay slots that many
of them had?  I guess one could perform the subtract before the
instruction in the delay slot, and take the branch afterwards (if it
is taken). 
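
To make that ordering concrete, here is a rough C model of such a
loop.  Everything in it is hypothetical: "bsub" is not an instruction
of any real architecture, and I assume that the branch condition
(register != 0) is evaluated on the value before the subtract, since
that is what an early-branch pipeline could do.  The name
run_bsub_loop and the "seen" buffer exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical loop being modelled:
 *
 *   loop:  bsub  r1, 1, loop   ; invented branch-and-subtract
 *          <delay slot>        ; some instruction that reads r1
 *
 * Assumed semantics: decide the branch on the old r1, then subtract,
 * then run the delay-slot instruction, then transfer control.
 * Callers should pass r1 >= 0.  Returns the number of iterations. */
int run_bsub_loop(int32_t r1, int32_t *seen, int max_seen)
{
    int n = 0;
    int taken;
    do {
        taken = (r1 != 0);   /* branch decision, made early on the old r1 */
        r1 -= 1;             /* subtract part of the instruction */
        if (n < max_seen)
            seen[n] = r1;    /* delay slot: observes the decremented r1 */
        n++;
    } while (taken);         /* control transfer happens after the slot */
    return n;
}
```

Starting with r1 = 3, the delay-slot instruction sees 2, 1, 0 and -1
in this model, i.e., it always sees the already-decremented value.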

So it would actually fit.  Why was it not done?  Maybe the idea was
that induction-variable elimination would usually eliminate the
subtract anyway, so why complicate the architecture with such an
instruction?
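
A small C illustration of what I mean by that, with the
transformation applied by hand (a compiler would do this on its
intermediate representation; the function names are made up for the
example):

```c
#include <stddef.h>

/* As written: a separate down-counter n controls the loop, so each
 * iteration has a subtract for n in addition to the pointer update. */
long sum_counted(const int *a, size_t n)
{
    long s = 0;
    while (n != 0) {
        s += *a++;
        n -= 1;      /* the subtract a subtract-and-branch would absorb */
    }
    return s;
}

/* After induction-variable elimination: the counter is gone, and the
 * loop exit is a compare of the pointer against an end address that
 * is computed once before the loop. */
long sum_eliminated(const int *a, size_t n)
{
    const int *end = a + n;
    long s = 0;
    while (a != end)
        s += *a++;
    return s;
}
```

In the second version there is no subtract left to combine with the
branch; what remains is a plain compare-and-branch.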

For over a decade, Intel decoders have decoded many sequences of ALU
and branch instructions into one uop, so they can do at a
microarchitectural level what you are asking about at the architecture
level.  Other microarchitectures have followed this pattern, and
RISC-V seems to make a philosophy out of this.
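
As a sketch of where such a pair can come from, take a count-down
loop in C.  The assembly in the comment is only an illustration of
the shape a compiler might produce for x86-64 if it keeps the
counter, not the output of any particular compiler; the function name
fill is made up.

```c
#include <stddef.h>

/* Store v into p[0..n-1]; the caller must pass n >= 1.
 *
 * Possible loop shape on x86-64 (illustrative only):
 *
 *   loop: mov  DWORD PTR [rdi], edx
 *         add  rdi, 4
 *         sub  rsi, 1
 *         jne  loop
 *
 * The adjacent sub/jne pair is the kind of ALU+branch sequence that
 * a decoder can turn into a single uop. */
void fill(int *p, size_t n, int v)
{
    do {
        *p++ = v;
    } while (--n != 0);
}
```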

ARM A64 OTOH seems to put everything into an instruction that fits in
32 bits, and while they have instructions (TBNZ and TBZ) that test a
specific bit in a register and branch if the bit is set or clear,
respectively, they have not added a subtract-and-branch or
branch-and-subtract instruction.  Apparently the uses for such an
instruction are not that frequent.
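
For reference, the branch conditions of those two instructions can be
modelled in C as follows.  This models only the taken/not-taken
decision, not the branch itself, and the function names are mine.

```c
#include <stdint.h>

/* TBNZ: branch taken if the selected bit is 1.  bit must be 0..63. */
int tbnz_taken(uint64_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}

/* TBZ: branch taken if the selected bit is 0.  bit must be 0..63. */
int tbz_taken(uint64_t reg, unsigned bit)
{
    return !((reg >> bit) & 1u);
}

/* A count-down loop on A64 still needs two instructions, e.g.
 *
 *   sub   x0, x0, #1
 *   cbnz  x0, loop
 *
 * or SUBS followed by B.NE. */
```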

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>