
Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>
Newsgroups: comp.arch
Subject: Re: is Vax addressing sane today
Date: Mon, 7 Oct 2024 23:40:22 -0700
Organization: A noiseless patient Spider
Lines: 112
Message-ID: <ve2k4n$23km8$1@dont-email.me>
References: <vdg3d1$2kdqr$1@dont-email.me>
 <memo.20241001101211.19028o@jgd.cix.co.uk>
 <20241001123426.000066c1@yahoo.com>
 <2024Oct1.182625@mips.complang.tuwien.ac.at> <vdknel$3e4pf$9@dont-email.me>
 <2024Oct3.085754@mips.complang.tuwien.ac.at> <vdne1a$3uaeh$4@dont-email.me>
 <m1rufjhpi09m9adedt87nrcdfmij1i8pvb@4ax.com> <vdo2ct$4les$1@dont-email.me>
 <vdpg4a$atqh$16@dont-email.me>
 <bd0089399bca4042ed96f1ec49956b0d@www.novabbs.org>
 <vdscmj$tbdp$1@dont-email.me>
 <f28dc860209d3aa58d85830011785290@www.novabbs.org>
 <ve1h90$1r206$11@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 08 Oct 2024 08:40:24 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="d938895af8a55dcdf6009a32333c8ef4";
	logging-data="2216648"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+1WwMWKlQJd4QdrCpqlRqasq5HRVMOX6E="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:89X3lJWa9VmMZpQWSHvbO5esUbw=
Content-Language: en-US
In-Reply-To: <ve1h90$1r206$11@dont-email.me>
Bytes: 5589

On 10/7/2024 1:45 PM, Chris M. Thomasson wrote:
> On 10/6/2024 4:33 PM, MitchAlsup1 wrote:
>> On Sat, 5 Oct 2024 21:56:34 +0000, Chris M. Thomasson wrote:
>>
>>> On 10/4/2024 3:54 PM, MitchAlsup1 wrote:
>>>> On Fri, 4 Oct 2024 19:36:41 +0000, Chris M. Thomasson wrote:
>>>>
>>>>> On 10/3/2024 11:36 PM, Chris M. Thomasson wrote:
>>>>>> On 10/3/2024 9:23 PM, George Neuner wrote:
>>>>>>> On Fri, 4 Oct 2024 00:48:43 -0000 (UTC), Lawrence D'Oliveiro
>>>>>>> <ldo@nz.invalid> wrote:
>>>>>>>
>>>>>>>> On Thu, 03 Oct 2024 06:57:54 GMT, Anton Ertl wrote:
>>>>>>>>
>>>>>>>>> If the RISC companies failed to keep up, they only have 
>>>>>>>>> themselves to
>>>>>>>>> blame.
>>>>>>>>
>>>>>>>> That’s all past history, anyway. RISC very much rules today, and it
>>>>>>>> is x86
>>>>>>>> that is struggling to keep up.
>>>>>>>
>>>>>>> You are, of course, aware that the complex "x86" instruction set 
>>>>>>> is an
>>>>>>> illusion and that the hardware essentially has been a load-store 
>>>>>>> RISC
>>>>>>> with a complex decoder on the front end since the Pentium Pro landed
>>>>>>> in 1995.
>>>>>>
>>>>>> Yeah. Wrt memory barriers, one is allowed to release a spinlock on 
>>>>>> "x86"
>>>>>> with a simple store.
>>>>>
>>>>> The fact that one can release a spinlock using a simple store means
>>>>> that it's basically load-acquire / release-store.
>>>>>
>>>>> So a load will do a load then have an implied acquire barrier.
>>>>>
>>>>> A store will do an implied release barrier then perform the store.
>>>>
>>>> How does the store know it needs to do this when the locking
>>>> instruction is more than a pipeline depth away from the
>>>> store release ?? So, Locked LD (or something) happens at
>>>> 1,000,000 cycles, and the corresponding store happens at
>>>> 10,000,000 cycles (9,000,000 locked).
>>>>
>>>>> This release behavior is okay for releasing a spinlock with a simple
>>>>> store, MOV.
>>>>
>>>> It may be OK to SW but it causes all kinds of grief to HW.
>>>
>>> I thought that x86 has an implied #LoadStore | #StoreStore before the
>>> store, basically to give it release semantics. This means that one can
>>> release a spinlock without using any explicit membars. Iirc, there are
>>> Intel manuals that show this for spinlocks. Cannot exactly remember
>>> right now.
>>
>> I wonder if this actually works with my scenario above.
> 
> Well, using a MOV to unlock a spinlock works on x86. You do not need to 
> use a LOCK'ed RMW or any of the S/L/MFENCE fence instructions. 
> However, implementing something like Peterson's algorithm needs a damn 
> #StoreLoad, a la a LOCK'ed RMW or an MFENCE on x86. This is for the 
> locking stage, not the unlocking. SMR also needs a #StoreLoad, unless 
> you are using some asymmetric membar, a la SMR+RCU or something.
> 
> On sparc in RMO mode, we need to use a release barrier (#LoadStore | 
> #StoreStore) before the store instruction that actually unlocks the mutex.
> 
> pseudo code to release a spinlock:
> 
> mb_release();
> store();
> 
> This is already implied on x86 wrt MOV.
> 
> Note that a release barrier does not use a #StoreLoad.

For a spinlock it's basically like:

atomic_lock() // can be XCHG or something to lock it...
   mb_acquire() // #LoadStore | #LoadLoad

     // critical section

   mb_release() // #LoadStore | #StoreStore
atomic_store()

On x86 the atomic_store already has the #LoadStore | #StoreStore in the 
right place in order for it to be TSO at all.





> 
> 
>>> On x86 an atomic load has acquire and atomic stores have release
>>> semantics. Well, I think that is for WB memory only. Humm... Cannot
>>> remember if it's for WC or WB memory right now. Then there are the
>>> L/S/MFENCE instructions...
>>>
>>> https://www.felixcloutier.com/x86/sfence
> 
> Fwiw, an asymmetric membar on windows:
> 
> https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-flushprocesswritebuffers
> 
> ;^)