From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: YASV (Yet Another Security Vulnearability)
Date: Wed, 31 Jul 2024 14:56:55 +0000
Organization: Rocksolid Light
Message-ID: <7af7c1c8a6650071b1b8a569d07b3379@www.novabbs.org>
References: <v7rqbf$1ta84$1@dont-email.me> <20240725104113.000006e8@yahoo.com> <tJtoO.87238$BYv6.980@fx09.iad> <2024Jul26.181750@mips.complang.tuwien.ac.at> <hFToO.27997$iptd.11865@fx36.iad> <2024Jul29.123405@mips.complang.tuwien.ac.at> <hgaqO.2$qO%5.1@fx16.iad>

On Tue, 30 Jul 2024 18:27:14 +0000, EricP wrote:

> Anton Ertl wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>
>>> I have difficulty believing that the branch predictor values from some
>>> thread in one process would be anything but a *negative* impact on a
>>> random different thread in a different process.
>>
>> This sounds very similar to the problem of aliasing of two different
>> branches in the branch predictor.  The branch predictor researchers
>> have looked into that, and found that it does not pay off to tag
>> predictions with the branches they are for.  The aliased branch is at
>> least as likely to benefit from the prediction as it is to suffer from
>> interference; as a further measure agree predictors [sprangle+97] were
>> proposed; I don't know if they ever made it into practical
>> application.
>
> Yes I assume aliasing is possible as one source of erroneous predictions.
> I view the branch predictor (BP) as a black box attached to the Fetch stage.
> Fetch feeds BP with the current Fetch RIP virtual address (FetRipVir) and
> gets back a hit/miss signal, if a hit then the kind of branch/jump it is
> supposed to be, and a target virtual address (TargRipVir) to fetch from next.
>
> As the BP would use a subset of the FetRipVir bits to index its tables,
> or the equivalent in the Branch History Register (BHR),
> then it's possible for BP to erroneously send Fetch off on a wild goose chase,
> triggering I-TLB table walks and/or I-cache misses.
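
The aliasing described in the quoted paragraph can be illustrated with a small sketch. The table size, the two dropped alignment bits, and the name bp_index are assumptions made for illustration only; they do not describe any particular machine.

```python
TABLE_BITS = 10  # assumed: a 1024-entry predictor table


def bp_index(fetch_va: int, table_bits: int = TABLE_BITS) -> int:
    """Form a table index from a subset of the fetch virtual address.

    Assumption: the two low alignment bits are dropped and the next
    `table_bits` bits are used; every bit above that is ignored.
    """
    return (fetch_va >> 2) & ((1 << table_bits) - 1)


# Two unrelated branches whose addresses differ only above the index field.
branch_a = 0x0040_1230
branch_b = 0x7F00_1230

# Both select the same entry, so each sees (and trains) the other's state.
aliased = bp_index(branch_a) == bp_index(branch_b)
print(hex(bp_index(branch_a)), hex(bp_index(branch_b)), aliased)
```

Nothing in the index distinguishes the two branches, so whatever the entry predicts for one is also handed to the other.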
>
> A similar effect to aliasing occurs on address space switch because the
> table indexes for one virtual address space and PHT are completely different.
>
> Then it becomes a matter of how quickly the mistake can be detected,
> the previous path canceled and the correct path established, at what cost.
>
>> As for the idea of erasing the branch predictor on process switch:
>>
>> Consider the case where your CPU-bound process has to make way for a
>> short time slice of an I/O-bound process, and once that has submitted
>> its next synchronous I/O request, your CPU-bound process gets control
>> again.  The I/O bound process tramples only over a small part of
>> branch predictor state, but if you erase on process switch, all the
>> branch predictor state will be gone when the CPU-bound process gets
>> the CPU core again.  That's the reason why we do not erase
>> microarchitectural state on context switch; we do it neither for
>> caches nor for branch predictors.
>
> Caches are not erased because they (a) usually are physically indexed and
> physically tagged and (b) use all physical address bits in the index-tag.
> If a cache is virtually indexed and tagged then it must be flushed on
> address space switch, or its entries must also be tagged with an ASID.
>
> Where branch predictors use addresses, they use fetch virtual addresses
> and any tables indexed by those VA will be invalid in a different process.
> Also to save space they often don't use the full address bits but a subset
> which leads to aliasing of BP info for different instructions.
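
As a rough illustration of the point about virtually indexed tables, the sketch below models a direct-mapped target buffer with an optional ASID tag. The class name ToyBTB, its sizes, and its interface are hypothetical. It only carries the ASID idea from the cache discussion over to a predictor table and makes no claim about how any shipping predictor is built.

```python
class ToyBTB:
    """Hypothetical direct-mapped branch target buffer indexed by fetch VA."""

    def __init__(self, index_bits: int = 8, use_asid: bool = False):
        self.index_bits = index_bits
        self.use_asid = use_asid
        # Each entry is None or a tuple (asid, target_va).
        self.entries = [None] * (1 << index_bits)

    def _index(self, fetch_va: int) -> int:
        return (fetch_va >> 2) & ((1 << self.index_bits) - 1)

    def insert(self, asid: int, fetch_va: int, target_va: int) -> None:
        self.entries[self._index(fetch_va)] = (asid, target_va)

    def lookup(self, asid: int, fetch_va: int):
        entry = self.entries[self._index(fetch_va)]
        if entry is None:
            return None
        stored_asid, target_va = entry
        if self.use_asid and stored_asid != asid:
            return None  # entry belongs to another address space: miss
        return target_va


untagged = ToyBTB(use_asid=False)
tagged = ToyBTB(use_asid=True)
for btb in (untagged, tagged):
    btb.insert(asid=1, fetch_va=0x4000, target_va=0x5000)

# A different address space fetching from the same virtual address:
stale_target = untagged.lookup(asid=2, fetch_va=0x4000)  # other process's target
tagged_result = tagged.lookup(asid=2, fetch_va=0x4000)   # None (miss)
```

Without the tag, the second address space is steered to a target that only means something in the first; with it, the lookup simply misses.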
>
>> Moreover, another process will likely use some of the same libraries
>> the earlier process used, and will benefit from having the branches in
>> the library predicted (unless ASLR prevents them from using the same
>> entries in the branch predictor).
>
> Even assuming this effect is significant I don't think it justifies
> opening a security hole by retaining the BP tables, any more than it
> would justify retaining the TLB for the prior address space.
>
>>
>> @InProceedings{sprangle+97,
>>   author = 	 {Eric Sprangle and Robert S. Chappell and Mitch Alsup
>>                   and Yale N. Patt},
>>   title = 	 {The Agree Predictor: A Mechanism for Reducing
>>                   Negative Branch History Interference},
>>   crossref =	 {isca97},
>>   pages =	 {284--291},
>>   annote =	 {Reduces the number of conflict mispredictions by
>>                   having the predictor entries predict whether or not
>>                   some other predictor (say, a static predictor) is
>>                   correct. This increases the chance that the
>>                   predicted direction is correct in case of a
>>                   conflict.}
>> }
>>
>> @Proceedings{isca97,
>>   title = 	 "$24^\textit{th}$ Annual International Symposium on Computer Architecture",
>>   booktitle = 	 "$24^\textit{th}$ Annual International Symposium on Computer Architecture",
>>   year = 	 "1997",
>>   key =		 "ISCA 24",
>> }
>>
>>> Because if you retain
>>> the predictor values then the new thread has to unlearn what it learned,
>>> before it starts to learn values for the new thread. Whereas if the
>>> predictor is flushed it can immediately learn its own values.
>>
>> Unlearn?  The only thing I can think about in that direction is that a
>> two-bit counter (for some history and maybe branch address) happens to
>> be in a state where two instead of one misprediction is necessary
>> before the prediction changes.  Anyway, branch prediction research has
>> looked into the issue a long time ago and found that erasing on
>> context switch is a net loss.
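
The "two instead of one misprediction" remark can be checked with a minimal model of a 2-bit saturating counter. The state encoding (0 and 1 predict not-taken, 2 and 3 predict taken) and the helper names are assumptions chosen for illustration.

```python
class TwoBitCounter:
    """2-bit saturating counter: states 0,1 predict not-taken; 2,3 predict taken."""

    def __init__(self, state: int = 0):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def mispredicts_until_flip(start_state: int, actual_taken: bool) -> int:
    """Count mispredictions before a branch that always goes `actual_taken`
    is predicted correctly, starting from an inherited counter state."""
    counter = TwoBitCounter(start_state)
    misses = 0
    while counter.predict() != actual_taken:
        misses += 1
        counter.update(actual_taken)
    return misses


# Inherited "strongly taken" state, new branch is never taken: 2 misses.
# Inherited "weakly taken" state: 1 miss. Already agreeing state: 0 misses.
print([mispredicts_until_flip(s, False) for s in (3, 2, 1, 0)])
```

In this model the worst case for an inherited counter is two mispredictions, against one when starting from a weak state.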
>>
>> - anton
>
> In the above Agree Predictor the two-bit Pattern History Table (PHT)
> is indexed by the multi-bit Branch History Table (BHT),
> and the BHT must be retrained before it generates useful PHT indexes.
>
> The Branch Bias Table (BBT) is one bit indexed by the lower bits of the
> Fetch RIP XOR'ed with the BHT. Even though this is only one bit to toggle
> to train it, the XOR with BHT means it too will only generate useful indexes
> to select that one bit after the BHT is retrained. Until then it will be
> toggling the wrong bias bits.

For the record, Sprangle and I did the Agree predictor on a machine
with a "trace cache" where we assembled non-sequential instructions
into a single fetch unit, and that fetch unit had bits that indicated
whether the instructions were assembled with a taken branch or an
untaken branch. Then the 2-bit PHT predicted agreement with the
way the fetch unit had been assembled (or not).
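
A rough sketch of the agree idea, not the simulator used for the paper: the bias (the direction the fetch unit was assembled with) is supplied by the caller, and the 2-bit PHT entry only says whether to agree with it. The class name, table sizes, the XOR of address with history for the index, and the initial "strongly agree" state are all assumptions.

```python
class AgreePredictor:
    """Toy agree predictor: PHT counters predict agreement with a bias bit."""

    def __init__(self, pht_bits: int = 10, hist_bits: int = 10):
        self.pht_mask = (1 << pht_bits) - 1
        self.hist_mask = (1 << hist_bits) - 1
        # Assumed initial state: strongly agree (3) in every entry.
        self.pht = [3] * (1 << pht_bits)
        self.history = 0

    def _index(self, pc: int) -> int:
        # Assumed index: low address bits XORed with global history.
        return ((pc >> 2) ^ self.history) & self.pht_mask

    def predict(self, pc: int, bias_taken: bool) -> bool:
        agree = self.pht[self._index(pc)] >= 2
        return bias_taken if agree else not bias_taken

    def update(self, pc: int, bias_taken: bool, taken: bool) -> None:
        i = self._index(pc)
        if taken == bias_taken:
            self.pht[i] = min(3, self.pht[i] + 1)  # outcome agreed with bias
        else:
            self.pht[i] = max(0, self.pht[i] - 1)  # outcome disagreed
        self.history = ((self.history << 1) | int(taken)) & self.hist_mask


# Two branches that share a PHT entry but go in opposite directions.
# hist_bits=0 disables history so the aliasing is easy to see.
p = AgreePredictor(pht_bits=10, hist_bits=0)
pc_a, pc_b = 0x1000, 0x1000 + (1 << 12)
for _ in range(8):
    p.update(pc_a, bias_taken=True, taken=True)     # always taken
    p.update(pc_b, bias_taken=False, taken=False)   # never taken
print(p.predict(pc_a, True), p.predict(pc_b, False))
```

Both branches keep pushing the shared counter toward "agree", so the conflict that would make a plain taken/not-taken counter thrash costs nothing here.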

The "lawyers" made us invent the BBT in order to publish the paper;
causing us to simulate the lawyered machine for the data in the
paper.
>
> In other BP there are set associative Branch Target Buffers (BTB)
> that remember the target virtual address a branch will go to.
> Same for Indirect Branch Predictor, and CALL/RET stack predictor.
>
> All of these would repeatedly send Fetch off on wild goose chases
> until the current execution detects the mistakes, squashes any instructions
> fetched along the erroneous path, cancels any pending loads it triggered,
> and overwrites these entries.