Path: ...!weretis.net!feeder9.news.weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: "Chris M. Thomasson" Newsgroups: comp.arch Subject: Re: arm ldxr/stxr vs cas Date: Mon, 9 Sep 2024 14:09:51 -0700 Organization: A noiseless patient Spider Lines: 64 Message-ID: References: <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org> <898cf44224e9790b74a0269eddff095a@www.novabbs.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Date: Mon, 09 Sep 2024 23:09:51 +0200 (CEST) Injection-Info: dont-email.me; posting-host="a3d1d68c17abf34994c0af4a498168c2"; logging-data="2686828"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18jLpvEZuwqOOG2bJOLJt6AMlQTineMpMI=" User-Agent: Mozilla Thunderbird Cancel-Lock: sha1:9Tcc2tL/K8DbQbDdWV0508cBkCE= In-Reply-To: Content-Language: en-US Bytes: 3495 On 9/9/2024 7:29 AM, jseigh wrote: > On 9/9/24 03:14, Terje Mathisen wrote: >> jseigh wrote: >>> >>> I'm not so sure about making the memory lock granularity same as >>> cache line size but that's an implementation decision I guess. >> >> Just make sure you never have multiple locks residing inside the same >> cache line! > > This the the terminology ARM uses when describing their LL/SC > implementation.  It is not the best choice in terminology. >> >>> >>> I do like the idea of detecting potential contention at the >>> start of LL/SC so you can do back off.  Right now the only way I >>> can detect contention is after the fact when the CAS fails and >>> I probably have the cache line exclusive at that point.  It's >>> pretty problematic. >> >> I do prefer LOCK XADD instead of CAS (CmpXchg*), because the return >> value will also tell you which queue entry to pick/work on. >> >> It will not be optimal when really contended, but at least one >> participant will make forward progress, and typically several of them. > > > I'm not aware of any lock-free queue algorithms that use > atomic_fetch_add that are actually lock-free, error free, > and/or don't have an ABA problem.  I'm not saying there > aren't, just that I'm not aware of them. Here is an interesting one I did. A tweak from another algorithm. Basically a bakery algorithm: ______________________________________________ struct cell { uint32_t ver; double state; }; uint32_t head = 0; uint32_t tail = 0; cell cells[N]; // N must be a power of 2 void init() { for (uint32_t i = 0; i < N; ++i) cells[i].ver = i; } void producer(double state) { uint32_t ver = XADD(&head, 1); cell& c = cells[ver & (N - 1)]; while (LOAD(&c.ver) != ver) backoff(); c.state = state; STORE(&c.ver, ver + 1); } double consumer() { uint32_t ver = XADD(&tail, 1); cell& c = cells[ver & (N - 1)]; while (LOAD(&c.ver) != ver + 1) backoff(); double state = c.state; STORE(&c.ver, ver + N); return state; } ______________________________________________