From: jseigh
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Sun, 8 Sep 2024 07:53:59 -0400

On 9/8/24 02:35, Chris M. Thomasson wrote:
> On 9/7/2024 5:59 PM, MitchAlsup1 wrote:
>> On Sat, 7 Sep 2024 23:16:35 +0000, Chris M. Thomasson wrote:
>>
>>> On 9/7/2024 4:14 PM, Chris M. Thomasson wrote:
>>> [...]
>>>> When I am using CAS I don't really expect it to fail willy nilly
>>>> even if the comparands are still the same. Weak vs Strong. Still
>>>> irks me a bit. ;^)
>>>
>>> There are algorithms out there, usually state machines that depend on
>>> strong cas. When a CAS fails, it depends on it failing because the
>>> comparands were actually different...
>>
>> Leading to ABA failures::
>>
>> Do you really want the following CAS to succeed ??
>>
>>      LD    R19,[someMemoryValue]
>> ..
>> interrupt delays program execution for 1 week
>> ..
>>      CAS   R17,R19,[someMemoryLocation]
>>
>> Given that the someMemoryLocation is accessible to other programs
>> while this one is sleeping ??
>>
>> Thus, it seems reasonable to fail a CAS when one cannot determine
>> if the memory location has been changed and changed back in the
>> mean time.
>
> ABA, well that can happen with CAS and certain algorithms that use them.
> The good ol' version counter is pretty nice, but it still can fail. 64
> bit words, 128 bit CAS. Loads to boot... ;^) Actually, I think Joe
> mentioned something interesting about CAS a long time ago wrt IBM...
> Candy Cane Striped books (Joe do you remember?) about a way to avoid
> live lock and failing in the os. I think Windows has something like it
> with their SList and SEH?

You can have an ABA problem with single word CAS if your values
(usually pointers) are one word in size. The solution to that is
to use double word CAS, DWCAS, with the other word being a monotonic
sequence number. The logic is that it would take longer to wrap
and reuse the sequence number than any reasonable delay in execution.
In the 70's, S/370 CDS was 64 bits (two 32 bit words) and it was
estimated that it would take a year to wrap a 32 bit counter.

The RISC-V arch manual mentions that you need DWCAS or LL/SC to
avoid the ABA problem and that they chose LL/SC to avoid dealing
with 128 bit atomic data sizes (their loss, but I'm not going into
that here).

Anyway, if you want to avoid the ABA problem, you can't do it
portably with C/C++ atomics since they don't guarantee a lock-free
DWCAS.

The candy striped Principles of Operation was an internal restricted
document with architecture and engineering notes added. The note
was that for short CAS loops fairness should be guaranteed so
all processors could make equal forward progress.

Speaking of live lock, in the 80's, VM/XA spin locks broke on a
4300 box. VM/XA used a test and set loop without any back off.
The 4300 TS microcode timed out getting a lock and failed silently.
So all the CPUs spun forever trying to get a lock that was free.
We sort of figured it out when I provided a patch that added a load
and test of the lock, so we could set a hardware breakpoint to
see whether the lock was actually not free. It suddenly started working
because the patch provided enough backoff for the microcode
locking to start working again.

Joe Seigh
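
A minimal C++ sketch of the sequence-number scheme described above, for illustration only. It does not use a 128 bit DWCAS. Instead it assumes the "pointers" are 32 bit indices into a node array, so an index and a 32 bit sequence number fit in one 64 bit word, much like the two-word CDS layout mentioned earlier. All names here (Node, IndexStack, pack, stale_cas_is_rejected, and so on) are hypothetical and not taken from any library or from the thread.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names throughout; this is a sketch, not a production stack.
constexpr std::uint32_t kNil = 0xFFFFFFFFu;  // stands in for a null pointer

struct Node {
    std::atomic<std::uint32_t> next{kNil};   // index of the next node, or kNil
};

// Pack (sequence, index) into one 64 bit word so a single CAS covers both.
inline std::uint64_t pack(std::uint32_t seq, std::uint32_t idx) {
    return (static_cast<std::uint64_t>(seq) << 32) | idx;
}
inline std::uint32_t seq_of(std::uint64_t w) { return static_cast<std::uint32_t>(w >> 32); }
inline std::uint32_t idx_of(std::uint64_t w) { return static_cast<std::uint32_t>(w); }

class IndexStack {
public:
    explicit IndexStack(Node* nodes) : nodes_(nodes), head_(pack(0, kNil)) {}

    void push(std::uint32_t idx) {
        std::uint64_t old = head_.load();
        std::uint64_t desired;
        do {
            nodes_[idx].next.store(idx_of(old));
            desired = pack(seq_of(old) + 1, idx);  // every update bumps the sequence
        } while (!head_.compare_exchange_weak(old, desired));
    }

    std::uint32_t pop() {
        std::uint64_t old = head_.load();
        while (idx_of(old) != kNil) {
            std::uint32_t next = nodes_[idx_of(old)].next.load();
            std::uint64_t desired = pack(seq_of(old) + 1, next);
            if (head_.compare_exchange_weak(old, desired)) {
                return idx_of(old);
            }
            // On failure, old has been reloaded with the current head; retry.
        }
        return kNil;
    }

    std::uint64_t raw_head() const { return head_.load(); }

    // One strong CAS attempt on the head word, exposed only for the demo below.
    bool try_swing(std::uint64_t expected, std::uint64_t desired) {
        return head_.compare_exchange_strong(expected, desired);
    }

private:
    Node* nodes_;
    std::atomic<std::uint64_t> head_;
};

// Replays the A-B-A pattern on one thread: a "delayed" pop snapshots the head,
// the stack is changed and node 0 ends up on top again, then the stale CAS runs.
bool stale_cas_is_rejected() {
    Node nodes[2];
    IndexStack s(nodes);
    s.push(1);                                    // stack: 1
    s.push(0);                                    // stack: 0 -> 1
    std::uint64_t snapshot = s.raw_head();        // delayed thread sees top = 0, next = 1
    std::uint32_t a = s.pop();                    // removes index 0
    s.pop();                                      // removes index 1
    s.push(a);                                    // top is index 0 again, but next is kNil now
    // Without the sequence number this CAS would succeed and install index 1,
    // which is no longer on the stack.
    bool swapped = s.try_swing(snapshot, pack(seq_of(snapshot) + 1, 1));
    return !swapped && idx_of(s.raw_head()) == 0;
}
```

The index compares equal in the stale snapshot and the current head, so only the sequence half of the word makes the CAS fail. As noted above, this relies on the counter not wrapping within any plausible delay; a 32 bit counter makes a false match unlikely rather than impossible.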