From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Memory ordering
Date: Thu, 1 Aug 2024 19:57:24 +0000
Organization: Rocksolid Light
Message-ID: <18ab7d4f4324a28ba0ab8bdb767a4261@www.novabbs.org>
References: <2024Jul26.190007@mips.complang.tuwien.ac.at> <2032da2f7a4c7c8c50d28cacfa26c9c7@www.novabbs.org> <2024Jul29.152110@mips.complang.tuwien.ac.at> <2024Jul30.115146@mips.complang.tuwien.ac.at> <249b2217b1dc1c8911eb45c5735d4aa9@www.novabbs.org> <2024Aug1.175455@mips.complang.tuwien.ac.at>

On Thu, 1 Aug 2024 15:54:55 +0000, Anton Ertl wrote:

> mitchalsup@aol.com (MitchAlsup1) writes:
>>On Tue, 30 Jul 2024 9:51:46 +0000, Anton Ertl wrote:
>>
>>> mitchalsup@aol.com (MitchAlsup1) writes:
>
>>A MEMBAR requires the memory order to catch up to the current point
>>before adding new AGENs to the problem space. If the memory order
>>is already SC then MEMBAR has nothing to do and is pushed through
>>the pipeline without delay.
>
> Yes, that's the slow implementation.  The fast implementation is to
> implement sequential consistency all the time (by predicting and
> speculating that memory accesses do not interfere with those of other
> cores, and recovering from that speculation when the speculation turns
> out to be wrong).  In such an implementation memory barriers are no-ops
> (and thus fast), because the hardware already provides sequential
> consistency.

Why does SC need any MEMBARs?

>>Then consider 2 Vector processors performing 2 STs (1 each) to
>>non-overlapping addresses but with bank aliasing. Consider that
>>the STs are scatter based and the bank conflicts random. There
>>is no way to determine which store happened first or which
>>element of each vector store happened first.
>
> It's up to the architecture to define the order of stores and loads of
> a given core.  For sequential consistency you then interleave the
> sequences coming from the cores in some convenient order.

Insufficient: if an OoO processor orders LDs and STs as they leave AGEN,
you cannot just interleave multiple core access streams and achieve
sequential consistency.

> It does not
> matter what happens earlier in some inertial system.  It only matters
> what your hardware decides should be treated as being earlier.  The

Causal consistency is determined by arrival order at the memory
controller. Cache consistency is enforced by allowing a single line
to be "in progress" only once in the system--each cache line is
serially reusable, while other cache lines remain unordered wrt that
one.

> hardware has a lot of freedom here, but the end result as visible to
> the cores must be sequentially consistent (or, with a weaker memory
> consistency model, consistent with that model).
>
> - anton
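
To make the "Insufficient" point concrete, here is a minimal sketch in
Python using the classic store-buffering litmus test. It is an
illustration under stated assumptions, not a model of any real pipeline:
the two-core program, the names (PROGRAM_ORDER, interleavings,
sc_outcomes, reordered_outcomes) and the reading of "orders LDs and STs
as they leave AGEN" as "a load may be issued ahead of an older store to
a different address" are all assumptions made for this example. It
enumerates every interleaving of the two access streams, first in
program order and then with each core's stream possibly reordered.

```python
from itertools import permutations

# Hypothetical store-buffering litmus test (not taken from the thread):
#   core 0: ST x=1 ; LD r0=y
#   core 1: ST y=1 ; LD r1=x
PROGRAM_ORDER = {
    0: [("ST", "x"), ("LD", "y")],
    1: [("ST", "y"), ("LD", "x")],
}

def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(trace):
    """Apply one global access order to a single memory; return (r0, r1)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for core, (op, addr) in trace:
        if op == "ST":
            mem[addr] = 1
        else:
            regs[core] = mem[addr]
    return regs[0], regs[1]

def outcomes(streams):
    """Set of (r0, r1) results over all interleavings of the two streams."""
    s0 = [(0, acc) for acc in streams[0]]
    s1 = [(1, acc) for acc in streams[1]]
    return {run(trace) for trace in interleavings(s0, s1)}

def sc_outcomes():
    """Interleave the streams in program order: sequential consistency."""
    return outcomes(PROGRAM_ORDER)

def reordered_outcomes():
    """Interleave streams each core may already have reordered (assumption)."""
    result = set()
    for s0 in permutations(PROGRAM_ORDER[0]):
        for s1 in permutations(PROGRAM_ORDER[1]):
            result |= outcomes({0: list(s0), 1: list(s1)})
    return result

if __name__ == "__main__":
    print("program-order streams:", sorted(sc_outcomes()))
    print("reordered streams:    ", sorted(reordered_outcomes()))
```

With the streams interleaved in program order, (r0, r1) = (0, 0) never
appears. Once each core is allowed to hoist its load above its store,
interleaving the resulting streams does produce (0, 0), which no
sequentially consistent execution permits. So interleaving alone is not
enough: what gets interleaved has to be each core's architectural order,
or the hardware has to detect and repair the violation.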