From: Terje Mathisen <terje.mathisen@tmsw.no>
Newsgroups: comp.arch
Subject: Re: Privilege Levels Below User
Date: Wed, 12 Jun 2024 20:34:13 +0200
Organization: A noiseless patient Spider
Message-ID: <v4cpn6$1phq4$1@dont-email.me>
References: <jai66jd4ih4ejmek0abnl4gvg5td4obsqg@4ax.com>
 <h0ib6j576v8o37qu1ojrsmeb5o88f29upe@4ax.com>
 <2024Jun9.185245@mips.complang.tuwien.ac.at>
 <38ob6jl9sl3ceb0qugaf26cbv8lk7hmdil@4ax.com>
 <2024Jun10.091648@mips.complang.tuwien.ac.at>
 <o32f6jlq2qpi9s1u8giq521vv40uqrkiod@4ax.com>
 <3a691dbdc80ebcc98d69c3a234f4135b@www.novabbs.org>
 <k58h6jlvp9rl13br6v1t24t47t4t2brfiv@4ax.com>
 <5a27391589243e11b610b14c3015ec09@www.novabbs.org>
In-Reply-To: <5a27391589243e11b610b14c3015ec09@www.novabbs.org>

MitchAlsup1 wrote:
> John Savard wrote:
>
>> On Tue, 11 Jun 2024 00:27:02 +0000, mitchalsup@aol.com (MitchAlsup1)
>> wrote:
>
>>> ALL I have DONE is to not have the MB write into the cache until the
>>> causing instruction retires !!
>
>> I suppose that depends on how you define "write".
>
> I mean the memory cell does not get modified.
>
>> If by "write" you mean store data in the cache, for eventual writing
>> out into RAM, well, since RAM doesn't contain "rename locations" to
>> play with, it seems to me that any CPU designer had better do that.
>
> The cache itself is not modified until the memory reference retires.
> But there is a buffer holding the data which can be accessed as if
> it were an L0 cache until the data migrates to the real cache at
> retirement.
>
>> At least, I'm not imaginative enough to think of doing it any other
>> way.
>
>> However, if by "write" you mean to change the state of the cache in
>> any way, such as by reading data from memory... now, _then_ you would
>> indeed have done what is necessary to combat Spectre.
>
> The cache is not modified; the data is available through another means,
> a means that can be backed up like a mispredicted branch. The buffer
> I am talking about is temporally organized, not spatially organized.
>
>> Obviously, though, a "load" instruction will _never_ retire unless it
>> can read the data from memory it is trying to put in a register.
>
> The LD instruction can obtain data from either the buffer or from
> the data cache itself. The buffer covers the execution window,
> allowing the LD to retire (assuming every older instruction also
> retires).
>
>> So apparently WHAT you have REALLY DONE is to modify how memory reads
>> work...
>
> I pipelined them through a temporally organized memory execution
> window. This also provides for allowing the memory system to run
> OoO wrt program order, and detect actual ordering violations, and
> rerun the memory references in a proper memory order by rerunning
> the references in order.
>
> You get relaxed memory order performance and precise memory order
> simultaneously.
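
The buffer behaviour described in the quoted text (data held in an age-ordered buffer, readable as if it were an L0 cache, moved into the real cache only at retirement, and discarded on a squash) can be sketched as a toy model. This is only an illustration under stated assumptions, not the My 66000 design: the class `SpeculativeLoadBuffer`, its methods, and the sequence-number scheme are all hypothetical names invented here, and timing, stores, and cache-line granularity are ignored.

```python
from collections import OrderedDict


class SpeculativeLoadBuffer:
    """Toy model (hypothetical): load data waits in an age-ordered buffer."""

    def __init__(self, memory):
        self.memory = memory         # backing store: address -> value
        self.cache = {}              # cache contents; changed only at retirement
        self.buffer = OrderedDict()  # seq -> (address, value), oldest first

    def load(self, seq, address):
        """Execute a load speculatively without modifying the cache."""
        # Search the buffer youngest-first, as if it were an L0 cache.
        for _, (addr, value) in reversed(self.buffer.items()):
            if addr == address:
                break
        else:
            # Miss in the buffer: read the cache, else go to memory.
            if address in self.cache:
                value = self.cache[address]
            else:
                value = self.memory[address]
        self.buffer[seq] = (address, value)
        return value

    def retire(self, seq):
        """Retire the oldest entry; only now does its data enter the cache."""
        oldest = next(iter(self.buffer))
        if seq != oldest:
            raise ValueError("entries must retire in program order")
        address, value = self.buffer.pop(seq)
        self.cache[address] = value

    def squash(self, first_bad_seq):
        """Back up like a mispredicted branch: drop every younger entry."""
        for s in [s for s in self.buffer if s >= first_bad_seq]:
            del self.buffer[s]


memory = {0x100: 7, 0x200: 9}
lsq = SpeculativeLoadBuffer(memory)
lsq.load(1, 0x100)   # correct-path load
lsq.load(2, 0x200)   # wrong-path load, executed speculatively
lsq.squash(2)        # misprediction detected: entry 2 vanishes
lsq.retire(1)        # entry 1 retires and fills the cache
```

In this sketch the wrong-path load to `0x200` leaves no trace in `cache`, which is the property the quoted text relies on against Spectre-style probing of cache state.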
>
>> if the data a load instruction requires is not already in the cache,
>> then a direct read from memory
>
> The request is forwarded towards memory through the cache hierarchy
> and data arrives back at the requestor (sooner or later).
>
>>                               is performed which *completely
>> bypasses* the cache;
>
> Yes, critical word first.
>
>>                     this data (and its associated address) are
>> retained by the CPU to be placed in the cache _if_ the instruction is
>> actually executed and when it retires.
>
> Yes !! While the data resides in the buffer, the whole line can be
> accessed by a number of memory reference instructions.
>
>> And, in fact, the various cache levels have to work this way too. You
>> have an L1 cache miss, but an L2 cache hit? Fine, you take your data
>> directly from L2, and don't promote the data into L2 until instruction
>> retirement.
>
> I use an exclusive cache organization, so data arriving at the CPU
> goes into the buffer, which upon retirement goes into L1, which has the
> potential to push a L1->L2 line, and so forth.
>
>> So now the process of fetching data from memory is _not_ done by
>> fetching always from L1 and going _through_ L1 to access L2, and
>> going _through_ L2 to access RAM, which seems to be the usual way
>> these days.
>
> It's back to the Athlon/Opteron organizations.
>
>> That certainly can be done. But it isn't quite as simple and obvious
>> as you seem to claim.
>
> If you had worked on them you can recognize the advantages and
> disadvantages.
>
>>> My 66000 is also insensitive to RowHammer and derivatives.....
>
>> When I first read that sentence, I was completely incredulous.
>> DRAM is sensitive to RowHammer because it's gone to feature sizes which
>> are beyond the state-of-the-art to do properly... so corners have been
>> cut.
>
>> How a CPU can be "insensitive" to it was mysterious.
>
>> After all, RowHammer is caused by multiple rapid-fire accesses to the
>> same address, or to related addresses, in memory.
>
> Yes, the write buffer in my DRAM controller is the L3 cache. Modified
> data in the L3 migrates towards DRAM as DRAM cycles permit, but there
> is no way to cause a line to be continuously written into DRAM.
> If a modified line has migrated to DRAM, and it gets modified again
> in the L3, that 2nd write will not be performed until a refresh cycle
> on that DRAM is performed.
>
> Thus if one tries to RowHammer My 66000 DRAM, DRAM gets a refresh cycle
> between each write.

Rowhammer can modify nearby lines, not just the ones that are being
hammered, right? How do you guarantee that all neighbors will also be
refreshed?

Similarly, if the accesses are LOCK XADD operations, and you have
multiple CPUs (or cores not sharing a common last level cache), then I
don't see any way to keep those accesses from making it all the way to
the RAM chips?

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
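
The neighbor-row question above can be made concrete with a small toy model. It is a sketch under loud assumptions, not a description of My 66000 or of any real DRAM part: the class `ThrottledDram`, the `refresh_neighbors` switch, and the per-row `disturb` counter are hypothetical, and the model only counts writes (reads and LOCK XADD traffic from other cores are not modeled). It applies the quoted rule "no second write to a row until that row has been refreshed" and counts how many activations an adjacent row sees between its own refreshes.

```python
from collections import defaultdict


class ThrottledDram:
    """Toy model (hypothetical): one write per row between refreshes."""

    def __init__(self, refresh_neighbors):
        # Assumption: a refresh covers either just the written row,
        # or that row plus its two physical neighbors.
        self.refresh_neighbors = refresh_neighbors
        self.written_since_refresh = set()
        # row -> activations of adjacent rows since this row's last refresh
        self.disturb = defaultdict(int)

    def write(self, row):
        """Return True if the write reaches DRAM, False if it stays buffered."""
        if row in self.written_since_refresh:
            return False
        self.written_since_refresh.add(row)
        self.disturb[row - 1] += 1
        self.disturb[row + 1] += 1
        return True

    def refresh(self, row):
        """Refresh the row (and optionally its neighbors)."""
        rows = (row - 1, row, row + 1) if self.refresh_neighbors else (row,)
        for r in rows:
            self.disturb[r] = 0
            self.written_since_refresh.discard(r)


def hammer(dram, row, attempts):
    """Write the same row repeatedly, refreshing whenever a write is held back."""
    for _ in range(attempts):
        if not dram.write(row):
            dram.refresh(row)
            dram.write(row)
    return dram.disturb[row + 1]


victim_if_row_only = hammer(ThrottledDram(refresh_neighbors=False), 10, 1000)
victim_if_neighbors = hammer(ThrottledDram(refresh_neighbors=True), 10, 1000)
```

In this model the victim row's disturbance count stays bounded only when the refresh also covers the neighbors; if the refresh reaches just the hammered row, the count grows with every write. Whether a real controller's refresh cycle covers the adjacent rows is exactly the open question.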