Article <vbplo5$30k42$1@dont-email.me>

Deutsch English Français Italiano
<vbplo5$30k42$1@dont-email.me>

View for Bookmarking (what is this?)
Look up another Usenet article
Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jseigh <jseigh_es00@xemaps.com>
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Tue, 10 Sep 2024 10:40:05 -0400
Organization: A noiseless patient Spider
Lines: 75
Message-ID: <vbplo5$30k42$1@dont-email.me>
References: <vb4sit$2u7e2$1@dont-email.me>
 <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org>
 <vbc4u3$aj5s$1@dont-email.me>
 <898cf44224e9790b74a0269eddff095a@www.novabbs.org>
 <vbd4k1$fpn6$1@dont-email.me> <vbd91c$g5j0$1@dont-email.me>
 <vbm790$2atfb$2@dont-email.me> <vbn0o6$2ed30$1@dont-email.me>
 <vbno6u$2hvrc$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 10 Sep 2024 16:40:06 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="962e6e0f08d33e3da6320defbc27b398";
	logging-data="3166338"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/fiTC+bfy3NJpg5LGJNbJE"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:9LXj4aBBeCSQKVpprXV8mjY8LxY=
Content-Language: en-US
In-Reply-To: <vbno6u$2hvrc$2@dont-email.me>
Bytes: 3963

On 9/9/24 17:09, Chris M. Thomasson wrote:
> On 9/9/2024 7:29 AM, jseigh wrote:
>> On 9/9/24 03:14, Terje Mathisen wrote:
>>> jseigh wrote:
>>>>
>>>> I'm not so sure about making the memory lock granularity same as
>>>> cache line size but that's an implementation decision I guess.
>>>
>>> Just make sure you never have multiple locks residing inside the same 
>>> cache line!
>>
>> This the the terminology ARM uses when describing their LL/SC
>> implementation.  It is not the best choice in terminology.
>>>
>>>>
>>>> I do like the idea of detecting potential contention at the
>>>> start of LL/SC so you can do back off.  Right now the only way I
>>>> can detect contention is after the fact when the CAS fails and
>>>> I probably have the cache line exclusive at that point.  It's
>>>> pretty problematic.
>>>
>>> I do prefer LOCK XADD instead of CAS (CmpXchg*), because the return 
>>> value will also tell you which queue entry to pick/work on.
>>>
>>> It will not be optimal when really contended, but at least one 
>>> participant will make forward progress, and typically several of them.
>>
>>
>> I'm not aware of any lock-free queue algorithms that use
>> atomic_fetch_add that are actually lock-free, error free,
>> and/or don't have an ABA problem.  I'm not saying there
>> aren't, just that I'm not aware of them.
> 
> Here is an interesting one I did. A tweak from another algorithm. 
> Basically a bakery algorithm:
> 
> <pseudo code, membars aside for a moment>
> ______________________________________________
> struct cell { uint32_t ver; double state; };
> 
> uint32_t head = 0;
> uint32_t tail = 0;
> cell cells[N]; // N must be a power of 2
> 
> void init() {
>      for (uint32_t i = 0; i < N; ++i) cells[i].ver = i;
> }
> 
> void producer(double state) {
>      uint32_t ver = XADD(&head, 1);
>      cell& c = cells[ver & (N - 1)];
>      while (LOAD(&c.ver) != ver) backoff();
>      c.state = state;
>      STORE(&c.ver, ver + 1);
> }
> 
> double consumer() {
>      uint32_t ver = XADD(&tail, 1);
>      cell& c = cells[ver & (N - 1)];
>      while (LOAD(&c.ver) != ver + 1) backoff();
>      double state = c.state;
>      STORE(&c.ver, ver + N);
>      return state;
> }
> ______________________________________________


One of the things I have in my implementation is that an enqueue
operation can detect if wrap occurred if it got a interrupted by
a time slice end and log how far behind it got.  I'm seeing about
200,000 enqueue operations in those cases.  That would have been
a huge performance hit if my queue wasn't lock-free.

Joe Seigh