Path: ...!eternal-september.org!feeder2.eternal-september.org!newsfeed.bofh.team!paganini.bofh.team!not-for-mail
From: antispam@fricas.org (Waldek Hebisch)
Newsgroups: comp.arch
Subject: Re: Reverse engineering of Intel branch predictors
Date: Sun, 10 Nov 2024 02:05:03 -0000 (UTC)
Organization: To protect and to server
Message-ID: <vgp4cd$6apa$1@paganini.bofh.team>
References: <vfbfn0$256vo$1@dont-email.me> <vg38o4$1mcfe$1@paganini.bofh.team> <jwvbjytwl4z.fsf-monnier+comp.arch@gnu.org> <vglj93$3mgpb$1@paganini.bofh.team> <2df7a7f589d13b4b712555d80a562de0@www.novabbs.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 10 Nov 2024 02:05:03 -0000 (UTC)
Injection-Info: paganini.bofh.team; logging-data="207658"; posting-host="WwiNTD3IIceGeoS5hCc4+A.user.paganini.bofh.team"; mail-complaints-to="usenet@bofh.team"; posting-account="9dIQLXBM7WM9KzA+yjdR4A";
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (Linux/6.1.0-9-amd64 (x86_64))
X-Notice: Filtered by postfilter v. 0.9.3
Bytes: 4587
Lines: 81
MitchAlsup1 <mitchalsup@aol.com> wrote:
> On Fri, 8 Nov 2024 17:54:45 +0000, Waldek Hebisch wrote:
>
>> Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>>>> In case of branch predictor itself it means delay feedback by some
>>>> number of clocks, which looks like minor cost.
>>>
>>> You can still make your next predictions based on "architectural state
>>> + pending predictions" if the pending predictions themselves only
>>> depend ultimately on the architectural state.
>>>
>>>> OTOH delaying fetches from speculatively fetched addresses will
>>>> increase latency on critical path, possibly leading to
>>>> significant slowdown.
>>>
>>> I think you can similarly perform eagerly the fetches from speculatively
>>> fetched addresses but only if you can ensure that these will leave no
>>> trace if the speculation happens to fail.
>>
>> It looks extremely hard, if not impossible.
>
> What kind of front end µArchitecture are you assuming that makes
> this hard (at all) ??
>
> Seems to me that if there is an instruction buffer and you load the
> speculative instructions into it, you can speculatively execute them
> and throw them away if they were not supposed to execute. All you
> have to avoid is filling I Cache if you were not supposed to have
> fetched them.
>
> Thus, not hard at all.
I assume that fetches (reads) from the cache may be a multi-clock
activity, so an earlier access may slow down subsequent ones. I do
not know the details, but I looked at published data about Intel
processors and this seems to be the most plausible explanation for
their L2 timings. IIUC data may go through some crossbar switch
which routes it to the correct cache bank. It is not unusual for
crossbars to have switching delays, that is, changing the
destination incurs, say, a one-clock penalty.
I assume a machine competitive on price and performance, which means
the manufacturer will use slower blocks when they do not lower
average speed. This may lead to a slowdown of "real" activity due to
competition with speculative actions.
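To make that concrete, here is a toy model of my own (not any
documented Intel design): a banked cache behind a crossbar where
retargeting the crossbar to a different bank costs one extra clock.
The bank numbering, base latency, and penalty are made-up
parameters; the point is only that interleaving speculative accesses
into the real stream adds clocks even when nothing is evicted.

```python
# Toy model, invented for illustration: banked cache behind a
# crossbar; switching to a different bank costs one extra clock.

def total_clocks(bank_sequence, base_latency=4, switch_penalty=1):
    """Clocks to service a sequence of accesses (one bank id each)."""
    clocks = 0
    prev_bank = None
    for bank in bank_sequence:
        clocks += base_latency
        if prev_bank is not None and bank != prev_bank:
            clocks += switch_penalty  # crossbar retargeting cost
        prev_bank = bank
    return clocks

real = [0, 0, 0, 0]               # real accesses all hit bank 0
interleaved = [0, 1, 0, 1, 0, 0]  # speculative bank-1 reads mixed in

print(total_clocks(real))         # 16 clocks: no switching
print(total_clocks(interleaved))  # 28 clocks: 4 are switch penalties
```

With no speculation the four real accesses finish in 16 clocks;
interleaving two speculative reads to another bank stretches the
stream to 28, four clocks of which are pure crossbar switching, so
the real accesses complete later even though the cache contents are
untouched.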
>>> So whether and how you can do it depends on the definition of "leave no
>>> trace". E.g. Mitch argues you can do it if you can refrain from putting
>>> that info into the normal cache (where it would have to displace
>>> something else, thus leaving a trace) and instead have to keep it in
>>> what we could call a "speculative cache" but would likely be just some
>>> sort of load buffer.
>>
>> Alone that is clearly insufficient.
>
> Agreed insufficient all by itself but when combined...
>
>>> If "leave no trace" includes not slowing down other concurrent memory
>
> It does not.
>
>>> accesses (e.g. from other CPUs), it might require some kind of
>>> priority scheme.
>>
>> First, one needs to ensure that the CPU performing speculative
>> fetch will not slow down due to, say, resource contention. If you
>> put some arbitrary limit like one or two speculative fetches in
>
> Here, you use the word fetch as if it were a LD instruction. Is
> that what you intended ?? {{I reserve Fetch for instruction fetches
> only}}
I mean any read access, both loads and instruction fetches.
>> flight, that is likely to be detectable by the attacker and may
>> leak information. If you want several ("arbitrarily many") speculative
>> fetches without slowing down normal execution, that would mean highly
>> overprovisioned machine.
--
Waldek Hebisch