From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: YASV (Yet Another Security Vulnearability)
Date: Wed, 31 Jul 2024 14:56:55 +0000

On Tue, 30 Jul 2024 18:27:14 +0000, EricP wrote:

> Anton Ertl wrote:
>> EricP writes:
>>
>>> I have difficulty believing that the branch predictor values from some
>>> thread in one process would be anything but a *negative* impact on a
>>> random different thread in a different process.
>>
>> This sounds very similar to the problem of aliasing of two different
>> branches in the branch predictor.  The branch predictor researchers
>> have looked into that, and found that it does not pay off to tag
>> predictions with the branches they are for.  The aliased branch is at
>> least as likely to benefit from the prediction as it is to suffer from
>> interference; as a further measure agree predictors [sprangle+97] were
>> proposed; I don't know if they ever made it into practical
>> application.
>
> Yes I assume aliasing is possible as one source of erroneous
> predictions.
> I view the branch predictor (BP) as a black box attached to the Fetch
> stage.
> Fetch feeds BP with the current Fetch RIP virtual address (FetRipVir)
> and gets back a hit/miss signal, if a hit then the kind of branch/jump
> it is supposed to be, and a target virtual address (TargRipVir) to
> fetch from next.
>
> As the BP would use a subset of the FetRipVir bits to index its tables,
> or the equivalent in the Branch History Register (BHR),
> then it's possible for BP to erroneously send Fetch off on a wild goose
> chase, triggering I-TLB table walks and/or I-cache misses.
>
> A similar effect to aliasing occurs on address space switch because the
> table indexes for one virtual address space and PHT are completely
> different.
>
> Then it becomes a matter of how quickly the mistake can be detected,
> the previous path canceled and the correct path established, and at
> what cost.
>
>> As for the idea of erasing the branch predictor on process switch:
>>
>> Consider the case where your CPU-bound process has to make way for a
>> short time slice of an I/O-bound process, and once that has submitted
>> its next synchronous I/O request, your CPU-bound process gets control
>> again.  The I/O bound process tramples only over a small part of
>> branch predictor state, but if you erase on process switch, all the
>> branch predictor state will be gone when the CPU-bound process gets
>> the CPU core again.  That's the reason why we do not erase
>> microarchitectural state on context switch; we do it neither for
>> caches nor for branch predictors.
>
> Caches are not erased because they (a) usually are physically indexed
> and physically tagged and (b) use all physical address bits in the
> index-tag.
> If a cache is virtually indexed and tagged then it must be flushed on
> address space switch, or its entries must also be tagged with an ASID.
>
> Where branch predictors use addresses, they use fetch virtual addresses
> and any tables indexed by those VA will be invalid in a different
> process.
> Also to save space they often don't use the full address bits but a
> subset which leads to aliasing of BP info for different instructions.
>
>> Moreover, another process will likely use some of the same libraries
>> the earlier process used, and will benefit from having the branches in
>> the library predicted (unless ASLR prevents them from using the same
>> entries in the branch predictor).
>
> Even assuming this effect is significant I don't think it justifies
> opening a security hole by retaining the BP tables, any more than it
> would justify retaining the TLB for the prior address space.
>
>>
>> @InProceedings{sprangle+97,
>>   author =    {Eric Sprangle and Robert S. Chappell and Mitch Alsup
>>                and Yale N. Patt},
>>   title =     {The Agree Predictor: A Mechanism for Reducing
>>                Negative Branch History Interference},
>>   crossref =  {isca97},
>>   pages =     {284--291},
>>   annote =    {Reduces the number of conflict mispredictions by
>>                having the predictor entries predict whether or not
>>                some other predictor (say, a static predictor) is
>>                correct.  This increases the chance that the
>>                predicted direction is correct in case of a
>>                conflict.}
>> }
>>
>> @Proceedings{isca97,
>>   title =     "$24^\textit{th}$ Annual International Symposium on
>>                Computer Architecture",
>>   booktitle = "$24^\textit{th}$ Annual International Symposium on
>>                Computer Architecture",
>>   year =      "1997",
>>   key =       "ISCA 24",
>> }
>>
>>> Because if you retain
>>> the predictor values then the new thread has to unlearn what it learned,
>>> before it starts to learn values for the new thread. Whereas if the
>>> predictor is flushed it can immediately learn its own values.
>>
>> Unlearn?  The only thing I can think about in that direction is that a
>> two-bit counter (for some history and maybe branch address) happens to
>> be in a state where two instead of one misprediction is necessary
>> before the prediction changes.
>> Anyway, branch prediction research has
>> looked into the issue a long time ago and found that erasing on
>> context switch is a net loss.
>>
>> - anton
>
> In the above Agree Predictor the two-bit Pattern History Table (PHT)
> is indexed by the multi-bit Branch History Table (BHT),
> and the BHT must be retrained before it generates useful PHT indexes.
>
> The Branch Bias Table (BBT) is one bit indexed by the lower bits of the
> Fetch RIP XOR'ed with the BHT. Even though this is only one bit to
> toggle to train it, the XOR with BHT means it too will only generate
> useful indexes to select that one bit after the BHT is retrained.
> Until then it will be toggling the wrong bias bits.

For the record, Sprangle and I did the Agree predictor on a machine
with a "trace cache" where we assembled non-sequential instructions
into a single fetch unit, and that fetch unit had bits that indicated
whether the instructions were assembled with a taken branch or an
untaken branch. Then the 2-bit PHT predicted agreement with the way
the fetch unit had been assembled (or not).

The "lawyers" made us invent the BBT in order to publish the paper,
causing us to simulate the lawyered machine for the data in the paper.

>
> In other BP there are set associative Branch Target Buffers (BTB)
> that remember the target virtual address a branch will go to.
> Same for Indirect Branch Predictor, and CALL/RET stack predictor.
>
> All of these would repeatedly send Fetch off on wild goose chases
> until the current execution detects the mistakes, squashes any
> instructions fetched along the erroneous path, cancels any pending
> loads it triggered, and overwrites these entries.
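
The stale-target effect described in the quoted text can be illustrated
with a toy model. This is only a sketch under assumptions that are not in
the thread: an untagged, direct-mapped BTB with a hypothetical 10 index
bits, 4-byte instruction alignment, and made-up addresses. The class and
method names (ToyBTB, lookup, update, flush) are invented for the example
and do not describe any real predictor.

```python
# Toy model (hypothetical): an untagged, direct-mapped BTB indexed by a
# subset of the fetch virtual address bits. All sizes are assumptions.
INDEX_BITS = 10  # assumed: 1024-entry table


class ToyBTB:
    def __init__(self, index_bits=INDEX_BITS):
        self.mask = (1 << index_bits) - 1
        self.targets = {}  # table index -> predicted target virtual address

    def index(self, fetch_va):
        # Drop the two low bits (assumed 4-byte alignment), then keep only
        # index_bits bits. The upper address bits are ignored entirely.
        return (fetch_va >> 2) & self.mask

    def lookup(self, fetch_va):
        # No tag compare: any address that maps to a trained index "hits".
        return self.targets.get(self.index(fetch_va))

    def update(self, fetch_va, target_va):
        self.targets[self.index(fetch_va)] = target_va

    def flush(self):
        self.targets.clear()


btb = ToyBTB()

# Address space A trains a branch at 0x00401000 with target 0x00402000.
btb.update(0x00401000, 0x00402000)

# Address space B fetches from a different virtual address whose index
# bits happen to match; the upper bits differ but are not checked.
alias_va = 0x7F001000
stale = btb.lookup(alias_va)        # stale target left over from A

# Flushing on the address-space switch turns the false hit into a miss.
btb.flush()
after_flush = btb.lookup(alias_va)  # None: nothing to mispredict with

print(hex(stale), after_flush)
```

In this model the retained entry steers Fetch in the second address space
to a target that means nothing there, while the flushed table simply
misses. The model says nothing about which choice costs more overall,
which is the point under dispute in the thread.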