Article <vbpme1$30k42$2@dont-email.me>

Deutsch English Français Italiano
<vbpme1$30k42$2@dont-email.me>

View for Bookmarking (what is this?)
Look up another Usenet article
Path: ...!2.eu.feeder.erje.net!feeder.erje.net!news.swapon.de!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jseigh <jseigh_es00@xemaps.com>
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Tue, 10 Sep 2024 10:51:45 -0400
Organization: A noiseless patient Spider
Lines: 66
Message-ID: <vbpme1$30k42$2@dont-email.me>
References: <vb4sit$2u7e2$1@dont-email.me>
 <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org>
 <vbc4u3$aj5s$1@dont-email.me>
 <898cf44224e9790b74a0269eddff095a@www.novabbs.org>
 <vbd4k1$fpn6$1@dont-email.me> <vbd91c$g5j0$1@dont-email.me>
 <vbm790$2atfb$2@dont-email.me> <vbn0o6$2ed30$1@dont-email.me>
 <vbp33u$2sr79$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 10 Sep 2024 16:51:45 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="962e6e0f08d33e3da6320defbc27b398";
	logging-data="3166338"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19MjzfVgH7wGksx3JPhpYNX"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tRoevLpBP8od80o9j/KqeaWw25U=
Content-Language: en-US
In-Reply-To: <vbp33u$2sr79$1@dont-email.me>
Bytes: 3956

On 9/10/24 05:22, Terje Mathisen wrote:
> jseigh wrote:
>> On 9/9/24 03:14, Terje Mathisen wrote:
>>> jseigh wrote:
>>>>
>>>> I'm not so sure about making the memory lock granularity same as
>>>> cache line size but that's an implementation decision I guess.
>>>
>>> Just make sure you never have multiple locks residing inside the same 
>>> cache line!
>>
>> This the the terminology ARM uses when describing their LL/SC
>> implementation.  It is not the best choice in terminology.
>>>
>>>>
>>>> I do like the idea of detecting potential contention at the
>>>> start of LL/SC so you can do back off.Â  Right now the only way I
>>>> can detect contention is after the fact when the CAS fails and
>>>> I probably have the cache line exclusive at that point.Â  It's
>>>> pretty problematic.
>>>
>>> I do prefer LOCK XADD instead of CAS (CmpXchg*), because the return 
>>> value will also tell you which queue entry to pick/work on.
>>>
>>> It will not be optimal when really contended, but at least one 
>>> participant will make forward progress, and typically several of them.
>>
>>
>> I'm not aware of any lock-free queue algorithms that use
>> atomic_fetch_add that are actually lock-free, error free,
>> and/or don't have an ABA problem.  I'm not saying there
>> aren't, just that I'm not aware of them.
> 
> I'm not sure either: I have written test code that should be general 
> purpose but never actually used that in any production systems.
> 
> The one time I used this for critical work (a maximum throughput ntpd 
> server) I had a single writer and N-1 readers, so I actually decided to 
> create one queue per reader (cpu core), making them much easier to get 
> right. :-)
> 
> Each queue was a power of two in length, the write and read indices were 
> full 32-bit unsigned variables that I masked before accessing the work 
> list, so I never needed to worry about queue end wraparound.
> 
> The writer simply allocated work to each queue in round robin fashion.
> 

You should look at the RSEQ (restartable sequences) code for SPMC
queue.  It's in assembler (basically because RSEQ needs to know
the exact address of the commit instruction), but essentially an
enqueue operation is

	tail->next = &added_node;
	tail = &added_node;		// the commit instruction

If you get interrupted before the tail gets updated, no matter
because the next successful enqueue will work fine.  You
don't even have to null the next pointer of the node you
are enqueuing.

The only downside is that if you are on a thousand processor
box, you will have 1000 SPMC queues.

Joe Seigh