From: Brett <ggtgp@yahoo.com>
Newsgroups: comp.arch
Subject: Re: Reverse engineering of Intel branch predictors
Date: Tue, 12 Nov 2024 22:21:37 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <vh0kdh$1qbro$1@dont-email.me>
References: <vfbfn0$256vo$1@dont-email.me>
 <c517f562a19a0db2f3d945a1c56ee2e6@www.novabbs.org>
 <jwv1q002k2s.fsf-monnier+comp.arch@gnu.org>
 <a3d81b5c64ce058ad21f42a8081162cd@www.novabbs.org>
 <jwvcyj1sefl.fsf-monnier+comp.arch@gnu.org>
 <abef7481ff0dd5d832cef0b9d3ea087a@www.novabbs.org>
 <jwv1pzhsahr.fsf-monnier+comp.arch@gnu.org>
 <8928500a87002966d6282465c037003e@www.novabbs.org>

MitchAlsup1 <mitchalsup@aol.com> wrote:
> On Mon, 11 Nov 2024 22:10:14 +0000, Stefan Monnier wrote:
>
>>>> Hmm... but in order not to have bubbles, your prediction structure
>>>> still needs to give you a predicted target address (rather than a
>>>> predicted index number), right?
>>> Yes, but you use the predicted index number to find the predicted
>>> target IP.
>>
>> Hmm... but that would require fetching that info from memory.
>> Can you do that without introducing bubbles?
>
> In many/most (dynamic) cases, they have already been fetched, and all
> that is needed is muxing the indexed field out of the Instruction
> Buffer.
>
>> If you're lucky it's in the L1 Icache, but that still takes a couple
>> cycles to get, doesn't it?
>
> My 1-wide machine fetches 4 words per cycle.

A two-wide machine only adds an AGU and can do a load and math in the
same cycle, so how much bigger is that?

> My 6-wide machine fetches 3 ½-cache-lines per cycle.
>
> Sure, if the indexed field is not already present, then you have to
> go fetch it, but since the table immediately follows JTT, most of the
> time it has already arrived by the time JTT gets to DECODE.
>
>> Or do you have a dedicated "jump table cache" as part of your jump
>> prediction tables?  [ Even if you do, it still means your prediction
>> has to first predict an index and then look it up in the table, which
>> increases its latency.  I don't know what kind of latency is used in
>> current state-of-the-art predictors, but IIUC any increase in latency
>> can be quite costly. ]
>
> For the wider OoO machine, you will have something like a jump table
> cache hashed with some branch history and other data to "whiten" the
> address space so one JTT table does not alias with another.
>
>> Stefan
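
For anyone following along, here is a minimal C model of the scheme as I
read it: the predictor remembers a small per-site index (not a full
target address), and the target IP is then read out of the jump table
itself, which in hardware is usually already sitting in the instruction
buffer because the table immediately follows the JTT. All names, sizes,
and the hash below are invented for illustration; this is not a
description of My 66000 internals.

/* Hypothetical software model of the predicted-index scheme discussed
 * above: stage 1 predicts a small table index, stage 2 turns that
 * index into a target IP by reading the jump table.  In hardware the
 * second stage is a mux over words already in the instruction buffer;
 * here it is just an array read.  All names and sizes are made up.   */
#include <stdint.h>
#include <stdio.h>

#define IDX_PRED_ENTRIES 1024               /* predicted-index table size */

static uint8_t idx_pred[IDX_PRED_ENTRIES];  /* last index seen per hash   */

/* Hash the JTT's IP with some global branch history so different JTT
 * sites (and different history contexts) don't alias.                */
static unsigned pred_hash(uint64_t jtt_ip, uint64_t ghist)
{
    return (unsigned)((jtt_ip >> 2) ^ ghist ^ (ghist >> 7)) %
           IDX_PRED_ENTRIES;
}

/* Stage 1: predict a small index, not a 64-bit target address.       */
static unsigned predict_index(uint64_t jtt_ip, uint64_t ghist)
{
    return idx_pred[pred_hash(jtt_ip, ghist)];
}

/* Stage 2: read the predicted entry out of the jump table.           */
static uint64_t predict_target(const uint64_t *jump_table,
                               unsigned table_len,
                               uint64_t jtt_ip, uint64_t ghist)
{
    unsigned idx = predict_index(jtt_ip, ghist);
    if (idx >= table_len)
        idx = 0;                            /* clamp a stale prediction */
    return jump_table[idx];
}

/* Update on resolution: remember which index the JTT actually took.  */
static void train(uint64_t jtt_ip, uint64_t ghist, unsigned actual_idx)
{
    idx_pred[pred_hash(jtt_ip, ghist)] = (uint8_t)actual_idx;
}

int main(void)
{
    /* A 4-entry jump table that would follow the JTT in memory.      */
    uint64_t table[4] = { 0x1000, 0x1040, 0x1080, 0x10c0 };
    uint64_t jtt_ip = 0x0ff8, ghist = 0x2b;

    train(jtt_ip, ghist, 2);                /* the JTT last took case 2 */
    printf("predicted target: %#llx\n",
           (unsigned long long)predict_target(table, 4, jtt_ip, ghist));
    return 0;
}

The point of the two stages is the one made above: the prediction
structures only have to hold a few bits of index per JTT site, and the
full target address never needs to live in them at all.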
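
And a rough sketch of the "jump table cache" mentioned for the wider
OoO machine, where the lookup is hashed with global branch history to
"whiten" it so two different JTT tables rarely alias. Again, the
set/way counts, the tag width, and the hash are invented for the sake
of having something concrete; a real implementation would choose its
own.

/* Sketch of a set-associative jump table cache whose index is a hash
 * of the JTT's IP, the predicted index, and global history, so that
 * unrelated tables map to different sets.  Sizes and hash invented.  */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define JTC_SETS 256
#define JTC_WAYS 4

struct jtc_entry {
    bool     valid;
    uint32_t tag;                /* upper bits of the hashed (ip, hist) */
    uint64_t target;             /* cached table entry: a target IP     */
};

static struct jtc_entry jtc[JTC_SETS][JTC_WAYS];

/* Whitening hash: mix the JTT's IP, the index, and global history.   */
static uint32_t jtc_hash(uint64_t jtt_ip, unsigned idx, uint64_t ghist)
{
    uint64_t h = (jtt_ip >> 2) ^ (ghist * 0x9e3779b97f4a7c15ull) ^ idx;
    return (uint32_t)(h ^ (h >> 32));
}

/* Look up a cached target; returns true on hit.                      */
static bool jtc_lookup(uint64_t jtt_ip, unsigned idx, uint64_t ghist,
                       uint64_t *target)
{
    uint32_t h   = jtc_hash(jtt_ip, idx, ghist);
    uint32_t set = h % JTC_SETS;
    uint32_t tag = h / JTC_SETS;

    for (int w = 0; w < JTC_WAYS; w++) {
        if (jtc[set][w].valid && jtc[set][w].tag == tag) {
            *target = jtc[set][w].target;
            return true;
        }
    }
    return false;                /* miss: fall back to reading the table */
}

/* Install a resolved (jtt_ip, idx) -> target mapping.                */
static void jtc_fill(uint64_t jtt_ip, unsigned idx, uint64_t ghist,
                     uint64_t target)
{
    uint32_t h   = jtc_hash(jtt_ip, idx, ghist);
    uint32_t set = h % JTC_SETS;
    static unsigned victim;      /* crude round-robin replacement      */

    struct jtc_entry *e = &jtc[set][victim++ % JTC_WAYS];
    e->valid  = true;
    e->tag    = h / JTC_SETS;
    e->target = target;
}

int main(void)
{
    uint64_t t;
    jtc_fill(0x0ff8, 2, 0x2b, 0x1080);
    if (jtc_lookup(0x0ff8, 2, 0x2b, &t))
        printf("cached target: %#llx\n", (unsigned long long)t);
    return 0;
}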