From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: YASV (Yet Another Security Vulnearability)
Date: Sat, 27 Jul 2024 21:01:19 +0000
Organization: Rocksolid Light

On Sat, 27 Jul 2024 19:31:39 +0000, EricP wrote:

> MitchAlsup1 wrote:
>> On Fri, 26 Jul 2024 20:27:04 +0000, EricP wrote:
>>
>>> Anton Ertl wrote:
>>>> EricP writes:
>>>>> One thing they mention is Intel and AMD incorporating privilege
>>>>> level tagging into the BTB, as I suggested when this all started.
>>>>> Combine that with purging the user mode entries from the predictor
>>>>> tables on thread switch and I would think that would shut this all
>>>>> down.
>>>>
>>>> 1) The attacker can still attack the context (even if the notion of
>>>>    context includes the privilege level) from within itself. E.g.,
>>>>    the kernel can be attacked by training the kernel-level branch
>>>>    prediction by performing appropriate system calls, and then
>>>>    performing a system call that reveals data through a
>>>>    mis-speculation side channel. IIRC such Spectre attacks have
>>>>    already been demonstrated years ago.
>>>
>>> I hadn't thought of this, but yes, if JavaScript can contain remotely
>>> exploitable gadgets then syscall might too. And it's not just syscall
>>> args but any values that enter the kernel from outside and are used
>>> as indexes after bounds checking. So the image file mapper, network
>>> packets, etc.
>>>
>>> But if I recall correctly the fix for JavaScript was something like
>>> a judiciously placed FENCE instruction to block speculation.
>>> And for the kernel this attack surface should be quite small, as all
>>> of these values are already validated.
>>>
>>> So wouldn't it just be a matter of replacing certain kernel value
>>> validation IF statements with IF_NO_SPECULATE?
>>>
>>>> 2) Users are supposedly not prepared to pay the cost of invisible
>>>>    speculation (5-20%, depending on which paper you read); are they
>>>>    prepared to pay the cost of purging the user-mode entries of
>>>>    branch predictors on thread switches?
>>>
>>> It's actually thread switches that also switch the process, because
>>> if the new thread is in the same process then there is no security
>>> domain switch. Plus that peer thread could likely make use of the
>>> old user mode predictions.
>>>
>>> I have difficulty believing that the branch predictor values from
>>> some thread in one process would be anything but a *negative* impact
>>> on a random different thread in a different process.
>>
>> 47 threads in one process all crunching on one great big array/matrix
>> will show an almost completely positive impact from sharing the BP.
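Before going on: on the IF_NO_SPECULATE question quoted above, a sketch
only of what that would have to expand to in C on x86, using the SSE2
_mm_lfence() intrinsic as the "judiciously placed FENCE". There is no
real IF_NO_SPECULATE statement, and the table and function names below
are illustrative, not taken from any actual kernel.

    /* Sketch only: a bounds-checked table access hardened against
       Spectre-v1 style mis-speculation. x86-specific; _mm_lfence()
       keeps the load from executing speculatively past a
       mis-predicted bounds check. Names are illustrative.          */

    #include <emmintrin.h>       /* _mm_lfence()                     */
    #include <stddef.h>
    #include <stdint.h>

    extern uint8_t table[256];   /* hypothetical in-kernel table     */

    uint8_t read_checked(size_t idx, size_t limit)
    {
        if (idx < limit) {       /* the validating IF statement      */
            _mm_lfence();        /* block speculation past the check */
            return table[idx];   /* idx is now architecturally valid */
        }
        return 0;
    }

For what it is worth, Linux mostly prefers index masking
(array_index_nospec()) over a full fence on hot paths, since the fence
is the more expensive of the two; either way the idea is the same: the
dependent load either waits for the check or is clamped so a bad index
cannot leak anything.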
> And, as I said above, if the threads are in the same
> process/address-space then the BP should be preserved across that
> switch. But not if there were other intervening processes on the same
> core.

Could be independent processes crunching on mmap() memory.

>>> the predictor values then the new thread has to unlearn what it
>>> learned, before it starts to learn values for the new thread.
>>> Whereas if the predictor is flushed it can immediately learn its own
>>> values.
>>
>> The BP has only 4 states in its 2 bits; anything you initialize its
>> state to will take nearly as long to warm up as a completely random
>> table inherited from the previous process. {{BTBs are different}}
>
> Admittedly I am going on intuition here, but it is based on the
> assumption that a mispredicted taken branch that initiates a
> non-sequential fetch is costlier than a mispredicted untaken branch
> that continues sequentially.

BTBs are supposed to get rid of much of the non-sequential Fetch delay.

> In other words, assuming that resetting *ALL* branch predictors to
> untaken,

I think you mean weakly untaken, not just untaken.

> not just conditional branches but indirect branches and CALL/RET too,

Certainly CALL/RET, as we are now in a completely different context.

But note: in My 66000, switches are not indirect branches; method calls
and external subroutines are, and these are easier to predict than
switches.

> and fetching sequentially is always cheaper than fetching off in
> random directions at random points. Because fetching sequentially
> uses resources of I-TLB, I$L1 and prefetch buffer that are already
> loaded,

Same motivation as the My 66000 predication scheme: do not disrupt the
FETCHer.

> whereas non-sequential mispredictions will initiate unnecessary loads
> of essentially random information.

I am sitting around wondering if the ASID might be used to avoid
resetting the predictors:

    If( myASID != storedASID ) don't use the prediction.

This avoids the need to set 32KB of 4-state predictors to weakly
untaken. Then store one ASID per 512 bits of predictor state, for about
a 3% overhead. The MC 68030 TLB did something like this.

> It also depends on how quickly in the pipeline the mispredict can be
> detected, some can be detected at Decode and others not until Execute,
> and how quickly unnecessary pending loads can be canceled and the
> correct flow reestablished.
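Back to the ASID idea: to make the arithmetic concrete, a minimal C
sketch, assuming a 16-bit ASID tag per 512-bit block of 2-bit counters
(16/512 is roughly where the 3% comes from). The layout and names are
illustrative only, not a description of any shipping predictor, and a
real table would pack four 2-bit counters per byte rather than one.

    /* Sketch only: lazily invalidated, ASID-tagged 2-bit predictors.
       Assumes a 16-bit ASID; 16 tag bits per 512 counter bits ~= 3%.
       One counter per byte here purely for readability.              */

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define CTRS_PER_BLOCK 256          /* 512 bits / 2 bits each      */

    struct pred_block {
        uint16_t asid;                  /* who trained this block      */
        uint8_t  ctr[CTRS_PER_BLOCK];   /* 2-bit counters, values 0..3 */
    };

    /* Use the stored prediction only if the block was trained under
       the current ASID; otherwise behave as if freshly reset to
       weakly untaken, without ever clearing 32KB of state up front.  */
    static bool predict_taken(const struct pred_block *b,
                              uint16_t cur_asid, unsigned idx)
    {
        if (b->asid != cur_asid)
            return false;               /* weakly untaken              */
        return b->ctr[idx] >= 2;        /* states 2,3 mean taken       */
    }

    /* On update, reclaim the block lazily: the first touch by a new
       ASID re-initializes just this 512-bit block, not the table.    */
    static void train(struct pred_block *b, uint16_t cur_asid,
                      unsigned idx, bool taken)
    {
        if (b->asid != cur_asid) {
            memset(b->ctr, 1, sizeof b->ctr);  /* weakly untaken       */
            b->asid = cur_asid;
        }
        if (taken  && b->ctr[idx] < 3) b->ctr[idx]++;
        if (!taken && b->ctr[idx] > 0) b->ctr[idx]--;
    }

The point of the tag is that the reset becomes lazy and per-block
rather than an up-front flush, which is presumably close to what the
68030 TLB trick bought.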