<vbd91c$g5j0$1@dont-email.me>


Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jseigh <jseigh_es00@xemaps.com>
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Thu, 5 Sep 2024 17:49:32 -0400
Organization: A noiseless patient Spider
Lines: 67
Message-ID: <vbd91c$g5j0$1@dont-email.me>
References: <vb4sit$2u7e2$1@dont-email.me>
 <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org>
 <vbc4u3$aj5s$1@dont-email.me>
 <898cf44224e9790b74a0269eddff095a@www.novabbs.org>
 <vbd4k1$fpn6$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 05 Sep 2024 23:49:33 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="b906cc1d9c010c5924a8284161c2d580";
	logging-data="530016"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX193c6m1MAQmN59OK9tp2VrK"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Lx9ygCJG+AEAxfiqE+ezyHAYkPk=
Content-Language: en-US
In-Reply-To: <vbd4k1$fpn6$1@dont-email.me>
Bytes: 4278

On 9/5/24 16:34, Chris M. Thomasson wrote:
> On 9/5/2024 12:46 PM, MitchAlsup1 wrote:
>> On Thu, 5 Sep 2024 11:33:23 +0000, jseigh wrote:
>>
>>> On 9/4/2024 5:27 PM, MitchAlsup1 wrote:
>>>> On Mon, 2 Sep 2024 17:27:57 +0000, jseigh wrote:
>>>>
>>>>> I read that arm added the cas instruction because they didn't think
>>>>> ldxr/stxr would scale well.  It wasn't clear to me as to why that
>>>>> would be the case.  I would think the memory lock mechanism would
>>>>> have really low overhead vs cas having to do an interlocked load
>>>>> and store.  Unless maybe the memory lock size might be large
>>>>> enough to cause false sharing issues.  Any ideas?
>>>>
>>>> A pipeline lock between the LD part of a CAS and the ST part of a
>>>> CAS is essentially FREE. But the same is true for LL followed by
>>>> a later SC.
>>>>
>>>> Older machines with looser than sequential consistency memory models
>>>> and running OoO have a myriad of problems with LL - SC. This is
>>>> why My 66000 architecture switches from causal consistency to
>>>> sequential consistency when it encounters <effectively> LL and
>>>> switches back after seeing SC.
>>>>
>>>> No Fences necessary with causal consistency.
>>>>
>>>
>>> I'm not sure I entirely follow.  I was thinking of the effects on
>>> cache.  In theory the SC could fail without having to get the current
>>> cache line exclusive or at all.  CAS has to get it exclusive before
>>> it can definitively fail.
>>
>> A LL that takes a miss in L1 will perform a fetch with intent to modify,
>> so will a CAS. However, LL is allowed to silently fail if exclusive is
>> not returned from its fetch, deferring atomic failure to SC, while CAS
>> will fail when exclusive fails to return. 
> 
> CAS should only fail when the comparands are not equal to each other. 
> Well, then there is the damn weak and strong CAS in C++11... ;^o
> 
> 
>> LL-SC is designed so that
>> when a failure happens, failure is visible at SC not necessarily at LL.
>>
>> There are coherence protocols that allow the 2nd party to determine
>> if it returns exclusive or not. The example I know is when the 2nd
>> party is already performing an atomic event and it is better to fail
>> the starting atomic event than to fail an ongoing atomic event.
>> In My 66000 the determination is made under the notion of priority::
>> the higher priority thread is allowed to continue while the lower
>> priority thread takes the failure. The higher priority thread can
>> be the requestor (1st party) or the holder of data (2nd party)
>> while all interested observers (3rd parties) are in a position
>> to see what transpired and act accordingly (causal).
>>

I'm not so sure about making the memory lock granularity the same as
the cache line size, but that's an implementation decision I guess.

I do like the idea of detecting potential contention at the
start of LL/SC so you can back off.  Right now the only way I
can detect contention is after the fact, when the CAS fails, and
I probably have the cache line exclusive at that point.  That's
pretty problematic.
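
For illustration, after-the-fact detection looks roughly like the
C++11 sketch below.  It is only a minimal sketch, not code from this
thread: the helper name add_with_backoff, the yield-based delay, and
the spin limits are all assumptions made for the example.  The point
is that the only contention signal is the failed compare-exchange
itself, so any back off happens after the line has already been
fought over.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical helper (not from the thread): add `delta` to `target`
// with a CAS loop and back off after every failed attempt.
// Returns how many attempts failed before the update went through.
inline unsigned add_with_backoff(std::atomic<std::uint64_t>& target,
                                 std::uint64_t delta)
{
    unsigned failures = 0;
    unsigned spins = 1;                    // assumed initial back off
    constexpr unsigned max_spins = 1024;   // assumed cap on back off

    std::uint64_t expected = target.load(std::memory_order_relaxed);

    // compare_exchange_weak may also fail spuriously, which is what
    // lets it map onto an LL/SC pair on machines that have one.
    // On failure it reloads `expected` with the current value.
    while (!target.compare_exchange_weak(expected, expected + delta,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        ++failures;                        // contention seen only now
        for (unsigned i = 0; i < spins; ++i)
            std::this_thread::yield();     // crude delay, an assumption
        if (spins < max_spins)
            spins *= 2;                    // exponential back off
    }
    return failures;
}
```

A caller would just invoke add_with_backoff(counter, 1) from each
thread; the returned failure count is the only contention estimate
available with this approach.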

Joe Seigh