Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jseigh <jseigh_es00@xemaps.com>
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Thu, 5 Sep 2024 17:49:32 -0400
Organization: A noiseless patient Spider
Lines: 67
Message-ID: <vbd91c$g5j0$1@dont-email.me>
References: <vb4sit$2u7e2$1@dont-email.me> <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org> <vbc4u3$aj5s$1@dont-email.me> <898cf44224e9790b74a0269eddff095a@www.novabbs.org> <vbd4k1$fpn6$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 05 Sep 2024 23:49:33 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="b906cc1d9c010c5924a8284161c2d580"; logging-data="530016"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX193c6m1MAQmN59OK9tp2VrK"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Lx9ygCJG+AEAxfiqE+ezyHAYkPk=
Content-Language: en-US
In-Reply-To: <vbd4k1$fpn6$1@dont-email.me>
Bytes: 4278

On 9/5/24 16:34, Chris M. Thomasson wrote:
> On 9/5/2024 12:46 PM, MitchAlsup1 wrote:
>> On Thu, 5 Sep 2024 11:33:23 +0000, jseigh wrote:
>>
>>> On 9/4/2024 5:27 PM, MitchAlsup1 wrote:
>>>> On Mon, 2 Sep 2024 17:27:57 +0000, jseigh wrote:
>>>>
>>>>> I read that arm added the cas instruction because they didn't think
>>>>> ldxr/stxr would scale well. It wasn't clear to me as to why that
>>>>> would be the case. I would think the memory lock mechanism would
>>>>> have really low overhead vs cas having to do an interlocked load
>>>>> and store. Unless maybe the memory lock size might be large
>>>>> enough to cause false sharing issues. Any ideas?
>>>>
>>>> A pipeline lock between the LD part of a CAS and the ST part of a
>>>> CAS is essentially FREE. But the same is true for LL followed by
>>>> a later SC.
>>>>
>>>> Older machines with looser than sequential consistency memory models
>>>> and running OoO have a myriad of problems with LL - SC. This is
>>>> why My 66000 architecture switches from causal consistency to
>>>> sequential consistency when it encounters <effectively> LL and
>>>> switches back after seeing SC.
>>>>
>>>> No Fences necessary with causal consistency.
>>>>
>>>
>>> I'm not sure I entirely follow. I was thinking of the effects on
>>> cache. In theory the SC could fail without having to get the current
>>> cache line exclusive or at all. CAS has to get it exclusive before
>>> it can definitively fail.
>>
>> A LL that takes a miss in L1 will perform a fetch with intent to modify,
>> so will a CAS. However, LL is allowed to silently fail if exclusive is
>> not returned from its fetch, deferring atomic failure to SC, while CAS
>> will fail when exclusive fails to return.
>
> CAS should only fail when the comparands are not equal to each other.
> Well, then there is the damn weak and strong CAS in C++11... ;^o
>
>
>> LL-SC is designed so that
>> when a failure happens, failure is visible at SC not necessarily at LL.
>>
>> There are coherence protocols that allow the 2nd party to determine
>> if it returns exclusive or not. The example I know is when the 2nd
>> party is already performing an atomic event and it is better to fail
>> the starting atomic event than to fail an ongoing atomic event.
>> In My 66000 the determination is made under the notion of priority::
>> the higher priority thread is allowed to continue while the lower
>> priority thread takes the failure. The higher priority thread can
>> be the requestor (1st party) or the holder of data (2nd party)
>> while all interested observers (3rd parties) are in a position
>> to see what transpired and act accordingly (causal).
>>

I'm not so sure about making the memory lock granularity the same as
the cache line size, but that's an implementation decision, I guess.
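To make the granularity concern concrete, here is a minimal sketch of how software might keep independently updated words out of each other's way if the memory lock granule were cache line sized. The 64-byte figure and the names kAssumedGranule, PaddedCounter, in_separate_granules and check_padded_layout are assumptions for illustration only; the actual granule is an implementation decision and could differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64 bytes stands in for the memory lock (reservation) granule.
// The real size is up to the implementation and may not match this.
constexpr std::size_t kAssumedGranule = 64;

// Each counter occupies a whole granule, so a store to one counter does
// not fall inside the granule covering its neighbour.
struct alignas(kAssumedGranule) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

static_assert(sizeof(PaddedCounter) == kAssumedGranule, "one counter per granule");
static_assert(alignof(PaddedCounter) == kAssumedGranule, "granule aligned");

// Returns true when the two addresses fall in different assumed granules.
inline bool in_separate_granules(const void* a, const void* b) {
    auto ga = reinterpret_cast<std::uintptr_t>(a) / kAssumedGranule;
    auto gb = reinterpret_cast<std::uintptr_t>(b) / kAssumedGranule;
    return ga != gb;
}

// Usage check: adjacent array elements land in different granules.
inline void check_padded_layout() {
    PaddedCounter counters[2];
    assert(in_separate_granules(&counters[0], &counters[1]));
}
```

The cost is memory: every hot word pays for a full granule, which is why a large granule would make false sharing harder to avoid.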

I do like the idea of detecting potential contention at the start of
LL/SC so you can back off. Right now the only way I can detect
contention is after the fact, when the CAS fails, and I probably have
the cache line exclusive at that point. It's pretty problematic.

Joe Seigh
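For reference, the after-the-fact detection described above looks roughly like the following in C++11 atomics. This is only a sketch: increment_with_backoff, check_increment, the spin cap and the use of yield as the backoff step are all hypothetical choices, not anything from the thread. Note that compare_exchange_weak is allowed to fail spuriously, so a failure here is only a hint of contention.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical helper: increments `counter` with a CAS loop and returns how
// many CAS attempts failed. A failed CAS is the only contention signal here.
inline unsigned increment_with_backoff(std::atomic<std::uint64_t>& counter) {
    unsigned failures = 0;
    unsigned spins = 1;                     // current backoff length
    constexpr unsigned kMaxSpins = 1024;    // arbitrary cap, an assumption
    std::uint64_t expected = counter.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `expected` with the current value.
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        ++failures;                         // contention is seen only after the fact
        for (unsigned i = 0; i < spins; ++i) {
            std::this_thread::yield();      // stand-in for a real backoff step
        }
        if (spins < kMaxSpins) spins *= 2;  // exponential backoff
    }
    return failures;
}

// Single-threaded usage check: every increment eventually succeeds.
inline void check_increment() {
    std::atomic<std::uint64_t> counter{0};
    for (int i = 0; i < 5; ++i) increment_with_backoff(counter);
    assert(counter.load() == 5);
}
```

The backoff only starts once a CAS has already failed, which is the limitation the post complains about: by then the line has most likely been pulled in exclusive.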