Path: ...!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Reservation stations [was Continuations]
Date: Sun, 21 Jul 2024 19:44:49 +0000
Organization: Rocksolid Light
Message-ID: <a0f443093d1a10de29650d34ac74a70e@www.novabbs.org>
References: <v6tbki$3g9rg$1@dont-email.me> <47689j5gbdg2runh3t7oq2thodmfkalno6@4ax.com> <v71vqu$gomv$9@dont-email.me> <116d9j5651mtjmq4bkjaheuf0pgpu6p0m8@4ax.com> <f8c6c5b5863ecfc1ad45bb415f0d2b49@www.novabbs.org> <7u7e9j5dthm94vb2vdsugngjf1cafhu2i4@4ax.com> <0f7b4deb1761f4c485d1dc3b21eb7cb3@www.novabbs.org> <v78soj$1tn73$1@dont-email.me> <4bbc6af7baab612635eef0de4847ba5b@www.novabbs.org> <v792kn$1v70t$1@dont-email.me> <ef12aa647464a3ebe3bd208c13a3c40c@www.novabbs.org> <v79b56$20oq8$1@dont-email.me> <99f80e5c5452ec87cf6f5a70dcb33863@www.novabbs.org> <mDZlO.46777$BFg.42852@fx13.iad> <c59de6dee789fc98dda569caa3ad4157@www.novabbs.org> <lIanO.74059$oGQf.20914@fx10.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="4084875"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$bjuDPaeR2lVnimDzqMQMxO.xinJfAUH4AAe7FOTBR2WCCc40nfFtq
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 5370
Lines: 83

On Sun, 21 Jul 2024 16:28:43 +0000, EricP wrote:

> MitchAlsup1 wrote:
>> On Thu, 18 Jul 2024 0:48:18 +0000, EricP wrote:
>>
>>> MitchAlsup1 wrote:
>>>>
>>>> {Would be an interesting reservation station design, though}
>>>
>>> In what way would the RS be interesting or different?
>>
>> The instruction stream consists of 4 FMAC-bound instructions unrolled
>> as many times as will fit in register file.
>>
>> Your typical reservation station can accept 1 new instruction per cycle
>> from the decoder. So, either the decoder has to spew the instructions
>> across the stations (and remember they are data dependent) or the
>> station has to fire more than one per cycle to the FMAC units.
>>
>> So, instead of 1-in, 1-out per cycle, you need 4-in 4-out per cycle
>> and maybe some kind of exotic routing.
>
> This is where I saw a benefit to using valued reservation stations vs
> valueless ones - when a uArch has multiple similar FU each with its own
> bank of RS that is scheduled for that FU.
>
> Example of horizontal scaling of similar FU each with its own RS bank.
> https://i0.wp.com/chipsandcheese.com/wp-content/uploads/2024/07/cheese_oryon_diagram_revised.png
>
> With valueless RS, each RS stores only the source register number of
> its operands and each FU has to be able to read all its operands
> when a uOp launches (begins execution). This means the number of
> PRF read ports scales according to the total number of FU operands.
> (One could do read port sharing but then you have to schedule that too
> and could have contention.) Also if an FU is unused on any cycle then
> all its (expensive) operand read ports are unused.

I always had RSs keep track of which FU was delivering the final
operand, so that these could be picked up by the forwarding logic
and not need a RF port. This gets rid of 50%-75% of the RF port
needs.
>
> Using the above Oryon as an example, with valueless RS, to launch
> all 14 FU with 3 operands all at once needs 42 read ports.
>
> With valued RS the operand values are stored in each RS and, if ready,
> are read at Dispatch (hand-off from the front end to the RS bank) or are
> received from the forwarding network if in-flight at Dispatch time.

Delivering result at dispatch time.

> The number of PRF read ports scales with the number of dispatched uOp
> operands. Since the operand values are stored in each RS, each bank
> can then schedule and launch independently.

The width of the decoder is narrower than the width of the data path.
We used to call this "catch up bandwidth".
>
> With valued RS, to Dispatch 6 wide with 3 operands needs 18 read ports,

First, a 6-wide machine is not doing 6 3-operand instructions;
it is more like 3 memory ops (2 regs + displacement), one 3-op,
one general 2-op, and one 1-op (branch), so you only need 12 ports
instead of 18 most of the time.
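
A rough tally of the read-port counts mentioned in this thread, as a
sketch only. The function name and the (count, sources) layout are
made up for illustration; the numbers are the ones quoted above.

```python
# Read-port tallies for the cases discussed in this thread.
# read_ports() and the (count, sources_per_op) tuples are illustrative only.

def read_ports(op_mix):
    """Sum register-source operands over a list of (count, sources_per_op)."""
    return sum(count * sources for count, sources in op_mix)

# Valueless RS: every FU reads all its operands at launch (14 FUs x 3 operands).
valueless_launch = read_ports([(14, 3)])

# Valued RS, worst case: 6-wide dispatch with 3 register operands each.
valued_worst = read_ports([(6, 3)])

# Valued RS, the more typical 6-wide mix:
# 3 memory ops (2 regs + displacement), one 3-op, one 2-op, one 1-op branch.
valued_typical = read_ports([(3, 2), (1, 3), (1, 2), (1, 1)])

print(valueless_launch, valued_worst, valued_typical)  # 42 18 12
```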

The penalty is that each valued RS entry is 5× the size of an entry
in the value-free RS designs. Valued designs work just fine when the
execution window is reasonable (say 96 instructions) but fail when
the window is larger than 150-ish.

> and the read ports are potentially usable for all dispatches.
> Then all 14 FU can launch at once independently.

One should also note that these machines deliver 1-2 I/c RMS
regardless of their Fetch-Decode-FU widths.
>
> Each FU can also have two kinds of valued RS banks,
> a simple one if all the operands are ready at Dispatch as this does
> not need a wake-up matrix entry or need to receive forwarded values,
> and a complex one that monitors the wake-up matrix and forwarding buses.
> If all the operands are ready, the Dispatcher can choose either RS bank
> for the FU, giving preference to the simpler. If all operands are not
> ready then Dispatcher selects from the complex bank.
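
The bank-selection rule quoted above could be sketched roughly as
follows. This is a minimal illustration, not anyone's actual design:
the names Bank and choose_bank, the free-entry counts, and the
"stall when no suitable bank has room" behavior are all assumptions
that the quoted text does not spell out.

```python
from enum import Enum

class Bank(Enum):
    SIMPLE = "simple"    # no wake-up matrix entry, no forwarding capture
    COMPLEX = "complex"  # monitors the wake-up matrix and forwarding buses

def choose_bank(all_operands_ready, simple_free, complex_free):
    """Pick an RS bank for one uOp, or return None to stall (assumed).

    simple_free / complex_free are hypothetical counts of free entries.
    """
    # All operands ready: either bank works, prefer the simpler one.
    if all_operands_ready and simple_free > 0:
        return Bank.SIMPLE
    # Operands pending (or the simple bank is full): only the complex
    # bank can hold the uOp.
    if complex_free > 0:
        return Bank.COMPLEX
    # Assumption: with no usable entry, Dispatch stalls this uOp.
    return None
```

A uOp with a pending operand never goes to the simple bank here, even
when the simple bank has room, since that bank cannot receive
forwarded values.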