From: "Paul A. Clayton"
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Sun, 1 Sep 2024 17:02:16 -0400

On 8/31/24 4:56 PM, BGB wrote:
[snip]
> I was mostly doing dual-issue with a 4R2W design.
>
> Initially, 6R3W won out mostly because 4R2W disallows an indexed
> store to be run in parallel with another op; but 6R3W did allow
> this.

Stores and MADD allow one register read to be delayed by at least one cycle. If the following cycle had a free read port, that could be stolen to complete the store/MADD. This could be viewed as cracking a three-source operation into a two-source operation and a one-source operation that reads source operands in a following cycle, except that this operation never uses a result from the previous cycle.

In a VLIW, one could even imagine the register name for the delayed read being in the next instruction word if the available read port was always from using an immediate or having fewer source operands. This would add complexity for exceptions, branches, and even instruction cache misses.

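To make the port-stealing idea concrete, here is a minimal sketch of an issue model. The names (Bundle, schedule, the "deferrable" count) and the one-cycle stall policy are assumptions for illustration, not anything from BGB's design: each bundle has reads that must happen in its issue cycle plus reads (store data, MADD addend) that may slip to the next cycle, where they take a port ahead of that cycle's own bundle.

```python
from dataclasses import dataclass

READ_PORTS = 4  # assumption: a 4R2W-style register file


@dataclass
class Bundle:
    reads: int           # register reads needed in the issue cycle
    deferrable: int = 0  # reads that may slip one cycle (store data, MADD addend)


def schedule(bundles, ports=READ_PORTS):
    """Return the cycle in which each bundle issues (hypothetical model)."""
    issue_cycles = []
    cycle = 0
    carried = 0  # deferred reads owed in the current cycle
    for b in bundles:
        if b.reads > ports or b.deferrable > ports:
            raise ValueError("bundle can never fit the read ports")
        # Carried reads take their ports first; stall if the bundle no longer fits.
        while b.reads > ports - carried:
            cycle += 1   # the carried reads complete in a cycle of their own
            carried = 0
        free = ports - carried - b.reads
        done_now = min(b.deferrable, free)  # use a spare port now if one exists
        issue_cycles.append(cycle)
        carried = b.deferrable - done_now   # the rest is stolen from the next cycle
        cycle += 1
    return issue_cycles


# ALU op (2 reads) + indexed store (base and index now, data deferred),
# followed by a bundle that leaves one read port free.
print(schedule([Bundle(4, 1), Bundle(3)]))  # [0, 1]
# Same first bundle, but the follower needs all four ports: one stall cycle.
print(schedule([Bundle(4, 1), Bundle(4)]))  # [0, 2]
```

The second call shows the cost of the scheme under these assumptions: when the following cycle has no free port, the deferred read forces a bubble.
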
With a small buffer, a VLIW could also borrow from a previous cycle; an operation with one register source could include a "load into buffer" operation. (I do not recall ever reading about cross-cycle/-instruction-word register fields in any VLIW. While it seems to fit the VLIW model of static resource management, it breaks the "atomic" view of an instruction word and of the operation components; even borrowing within an instruction word seems not to have been considered.)

Relying on forwarding or stealing from a future surplus would result in variable performance unless the opportunities were guaranteed (at least for enough cases that performance glitches would not be significant).
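
As a rough illustration of what "guaranteed" could mean for the buffered variant, the following sketch statically checks a sequence of instruction words. The word encoding (the "reads", "preload", "buffered", and "writes" keys), the single-slot buffer, and the rule that a write to a buffered register is an error are all hypothetical choices for this example, not features of any VLIW I know of.

```python
def check_words(words, ports=4, buffer_slots=1):
    """Validate a sequence of VLIW words in a hypothetical encoding.

    Each word is a dict with optional keys:
      "reads":    registers read through ports for the word's own operations
      "preload":  registers read through spare ports into the operand buffer
      "buffered": registers taken from the buffer instead of a read port
      "writes":   registers written by the word
    Returns a list of (word_index, message); an empty list means no problem found.
    """
    problems = []
    buffer = []  # registers currently held in the operand buffer
    for i, w in enumerate(words):
        preload = w.get("preload", [])
        if len(w.get("reads", [])) + len(preload) > ports:
            problems.append((i, "too many read-port uses"))
        # Consume buffered operands before this word's own preloads arrive.
        for r in w.get("buffered", []):
            if r in buffer:
                buffer.remove(r)
            else:
                problems.append((i, f"{r} not in buffer"))
        buffer.extend(preload)
        if len(buffer) > buffer_slots:
            problems.append((i, "buffer overflow"))
        # Conservative rule: a write makes the buffered copy stale.
        for r in w.get("writes", []):
            if r in buffer:
                problems.append((i, f"{r} written while buffered"))
    return problems


ok = [
    # add-immediate uses one port; a spare port preloads the store data r7
    {"reads": ["r1"], "preload": ["r7"], "writes": ["r2"]},
    # add (r3, r4) plus indexed store (base r5, index r6, data from buffer)
    {"reads": ["r3", "r4", "r5", "r6"], "buffered": ["r7"]},
]
print(check_words(ok))  # []
```

A checker like this only covers straight-line code; the branch, exception, and cache-miss cases are exactly where the buffer contents would need extra handling.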