Article <abef7481ff0dd5d832cef0b9d3ea087a@www.novabbs.org>

Path: ...!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Reverse engineering of Intel branch predictors
Date: Mon, 11 Nov 2024 21:23:00 +0000
Organization: Rocksolid Light
Message-ID: <abef7481ff0dd5d832cef0b9d3ea087a@www.novabbs.org>
References: <vfbfn0$256vo$1@dont-email.me> <c517f562a19a0db2f3d945a1c56ee2e6@www.novabbs.org> <jwv1q002k2s.fsf-monnier+comp.arch@gnu.org> <a3d81b5c64ce058ad21f42a8081162cd@www.novabbs.org> <jwvcyj1sefl.fsf-monnier+comp.arch@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="2022267"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="o5SwNDfMfYu6Mv4wwLiW6e/jbA93UAdzFodw5PEa6eU";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$D0XI4AFriluJeeqAWM0SLOORuRyOd.2JoCS0fXIf4pjbtoVfPIGa6
X-Rslight-Posting-User: cb29269328a20fe5719ed6a1c397e21f651bda71
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 2705
Lines: 29

On Mon, 11 Nov 2024 20:36:50 +0000, Stefan Monnier wrote:

>>> I don't understand the "thus not needing prediction".  Loading IP from
>>> memory takes time, doesn't it?  Depending on your memory hierarchy and
>>> where the data is held, I'd say a minimum of 3 cycles and often more.
>>> What do you do during those cycles?
>> It is not that these things don't need prediction, it is that you do the
>> prediction and then verify the prediction using different data.
>
> I see, so you still need something similar to a BTB for operations like
> JTT, but the delay until you can verify the prediction is shorter, which
> should presumably reduce the cost of mispredictions.

Instead of verifying that you got the right target address (62 bits), you
can verify that you picked the proper index into the table (8-ish bits).
So the verification loop is shorter in the pipeline, and there are fewer
bits to verify (and to index the tables with: less hashing, ...).

>> For example: The classical way to do dense switches is a LD of the
>> target address and a jump to the target.  This requires verifying the
>> address of the target.  Whereas if you predict as JTT does, you verify
>> by matching the index number (which is known earlier), and since the
>> table is read-only you don't need to verify the target address.
>
> Hmm... but in order not to have bubbles, your prediction structure still
> needs to give you a predicted target address (rather than a predicted
> index number), right?

Yes, but you use the predicted index number to find the predicted
target IP, and then verify the index later.
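
A rough software model of that flow, purely as a sketch: the table
contents, the names (JUMP_TABLE, LastIndexPredictor, run_switch) and the
"repeat the last index" policy are all assumptions made for illustration,
not a description of any real predictor. The point it shows is that fetch
is steered by the predicted target right away, while the later check
compares only the small index values.

```python
# Hypothetical model; every name and value here is illustrative only.

# Read-only table of case targets. Because it never changes, a matching
# index implies a matching target address.
JUMP_TABLE = (0x1000, 0x1040, 0x1080, 0x10C0)


class LastIndexPredictor:
    """Assumed policy: predict the index the switch resolved to last time."""

    def __init__(self):
        self.last_index = 0

    def predict(self):
        return self.last_index

    def update(self, actual_index):
        self.last_index = actual_index


def run_switch(predictor, actual_index):
    """Return (next_ip, mispredicted) for one execution of the switch."""
    predicted_index = predictor.predict()
    # Fetch is redirected here immediately, so there is no bubble
    # while the real index is still being computed.
    predicted_target = JUMP_TABLE[predicted_index]

    # Later: the real index is known. Compare indices, not addresses.
    mispredicted = predicted_index != actual_index
    next_ip = JUMP_TABLE[actual_index] if mispredicted else predicted_target

    predictor.update(actual_index)
    return next_ip, mispredicted


if __name__ == "__main__":
    p = LastIndexPredictor()
    for idx in (2, 2, 0):
        ip, miss = run_switch(p, idx)
        print(f"index={idx} next_ip={ip:#x} mispredicted={miss}")
```

With the classical LD-plus-jump sequence, the comparison in the middle
would instead be between full target addresses, and it could only happen
after the load returned.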