From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Wed, 4 Sep 2024 01:57:24 +0000
Organization: Rocksolid Light
Message-ID: <62c9ee0c58580ea4988587aeb7660909@www.novabbs.org>
References: <vajo7i$2s028$1@dont-email.me> <memo.20240827205925.19028i@jgd.cix.co.uk> <valki8$35fk2$1@dont-email.me> <2644ef96e12b369c5fce9231bfc8030d@www.novabbs.org> <vam5qo$3bb7o$1@dont-email.me> <2f1a154a34f72709b0a23ac8e750b02b@www.novabbs.org> <vaoqcf$3r1u3$1@dont-email.me> <vavgq7$12u29$1@dont-email.me> <vb002r$156ge$1@dont-email.me> <vb7stc$3fn7b$1@dont-email.me> <vb85qa$3h0j0$1@dont-email.me>

On Tue, 3 Sep 2024 23:23:50 +0000, BGB wrote:

> On 9/1/2024 4:02 PM, Paul A. Clayton wrote:
>> On 8/31/24 4:56 PM, BGB wrote:
>> [snip]
>>> I was mostly doing dual-issue with a 4R2W design.
>>>
>>> Initially, 6R3W won out mostly because 4R2W disallows an indexed store
>>> to be run in parallel with another op; but 6R3W did allow this.
>>
>> Stores and MADD allow one register read to be delayed by at least
>> one cycle. If the following cycle had a free read port, that could
>> be stolen to complete the store/MADD. This could be viewed as
>> cracking a three-source operation into a two-source operation and
>> a one-source operation that reads source operands in a following
>> cycle except that this operation never uses a result from the
>> previous cycle.
>>
>
> This wouldn't map well to my existing decoder/pipeline, which requires
> all the ports (and all the registers) to be available at the time an
> instruction enters EX1, and currently has no support for "cracking" an
> instruction over multiple cycles, but may spread a single instruction
> across multiple lanes.

Your pipeline is amateur at best.
--------------
> But, yeah, if the restriction only applied to indexed store (in the
> current implementation, it applies to all stores), it would still be
> around 4% of the total instruction stream.
>
> As-is, it is closer to 12%, and causing an extra penalty for 12% of the
> total-executed instructions was undesirable (but, IMHO, still better
> than needing to use multiple instructions).

Delaying ST.data only delays LDs which alias that ST.
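
For anyone who wants to play with the read-port stealing idea quoted above, here is a rough sketch of the port accounting. It is a toy model, not anyone's actual pipeline: the names `Op`, `count_issue_cycles` and the `deferrable` field are made up for illustration, and it assumes a 4-read-port file, that a deferred read must complete in the very next cycle, and that a bubble cycle leaves all read ports free.

```python
from dataclasses import dataclass

READ_PORTS = 4  # assumption: the 4R2W register file from the thread


@dataclass
class Op:
    name: str
    reads: int           # total register source operands
    deferrable: int = 0  # reads that may slip one cycle (e.g. store data)


def count_issue_cycles(bundles, read_ports=READ_PORTS):
    """Count issue cycles (including bubbles) for a list of per-cycle bundles.

    A deferrable read is done in the issue cycle if a port is free;
    otherwise it is owed to the next cycle, where it takes a port first.
    """
    cycles = 0
    owed = 0  # deferred reads carried over from the previous cycle
    for bundle in bundles:
        must = sum(op.reads - op.deferrable for op in bundle)
        may = sum(op.deferrable for op in bundle)
        if must > read_ports:
            raise ValueError("bundle cannot issue together even with deferral")
        # Insert bubbles until owed reads plus mandatory reads fit.
        while owed + must > read_ports:
            cycles += 1
            owed = max(0, owed - read_ports)
        free = read_ports - owed - must
        owed = may - min(may, free)  # whatever does not fit slips a cycle
        cycles += 1
    return cycles  # reads still owed after the last bundle are not counted


# Indexed store (3 sources, data read deferrable) paired with a 2-source add.
store = Op("st.idx", reads=3, deferrable=1)
add = Op("add", reads=2)

print(count_issue_cycles([[store, add], [add]]))         # spare port next cycle
print(count_issue_cycles([[store, add], [store, add]]))  # no spare port
```

In this model the first sequence issues without a bubble because the cycle after the store has a spare read port, while back-to-back store/add pairs need one bubble to drain the owed read. Whether a real design can afford the extra bookkeeping is exactly the point BGB raises about needing all ports at EX1.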