Article <vbrhl3$3fvib$1@dont-email.me>


Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>
Newsgroups: comp.arch
Subject: Re: arm ldxr/stxr vs cas
Date: Wed, 11 Sep 2024 00:42:27 -0700
Organization: A noiseless patient Spider
Lines: 72
Message-ID: <vbrhl3$3fvib$1@dont-email.me>
References: <vb4sit$2u7e2$1@dont-email.me>
 <07d60bd0a63b903820013ae60792fb7a@www.novabbs.org>
 <vbc4u3$aj5s$1@dont-email.me>
 <898cf44224e9790b74a0269eddff095a@www.novabbs.org>
 <vbd4k1$fpn6$1@dont-email.me> <vbd91c$g5j0$1@dont-email.me>
 <vbm790$2atfb$2@dont-email.me> <vbr81o$3ekr7$1@dont-email.me>
 <vbrf73$3fb6u$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 11 Sep 2024 09:42:28 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="233a7ecb793af72ad112e5f4147874d3";
	logging-data="3669579"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/TiSV7uYhUSUF7uO0v6swjzs3zaz9pxIY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:1LSl+5QEJxiJmZtopjVjyzVi2u0=
In-Reply-To: <vbrf73$3fb6u$2@dont-email.me>
Content-Language: en-US
Bytes: 4860

On 9/11/2024 12:00 AM, Chris M. Thomasson wrote:
> On 9/10/2024 9:15 PM, Paul A. Clayton wrote:
>> On 9/9/24 3:14 AM, Terje Mathisen wrote:
>>> jseigh wrote:
>>>>
>>>> I'm not so sure about making the memory lock granularity same as
>>>> cache line size but that's an implementation decision I guess.
>>>
>>> Just make sure you never have multiple locks residing inside the same 
>>> cache line!
>>
>> Never?
>>
>> I suspect at least theoretically conditions could exist where
>> having more than one lock within a cache line would be beneficial.
>>
>> If lock B is always acquired after lock A, then sharing a cache
>> line might (I think) improve performance. One would lose
>> prefetched capacity for the data protected by lock A and lock B.
>> This assumes simple locks (e.g., not readers-writer locks).
>>
>> It seems to me that the pingpong problem may be less important
>> than spatial locality depending on the contention for the cache
>> line and the cache hierarchy locality of the contention
>> (pingponging from a shared level of cache would be less
>> expensive).
>>
>> If work behind highly active locks is preferentially or forcefully
>> localized, pingponging would be less of a problem, it seems.
>> Instead of an arbitrary core acquiring a lock's cache line and
>> doing some work, the core could send a message to the natural owner of 
>> the cache line to do the work.
>>
>> If communication between cores was low latency and simple messages
>> used little bandwidth, one might also conceive of having a lock
>> manager that tracks the lock state and sends a granted or not-
>> granted message back. This assumes that the memory location of the
>> lock itself is separate from the data guarded by the lock.
>>
>> Being able to grab a snapshot of some data briefly without
>> requiring (longer-term) ownership change might be useful even
>> beyond lock probing (where a conventional MESI would change the
>> M-state cache to S forcing a request for ownership when the lock
>> is released). I recall some paper proposed expiring cache line
>> ownership to reduce coherence overhead.
>>
>> Within a multiple-cache-line atomic operation/memory transaction,
>> I _think_ if the write set is owned, the read set could be grabbed
>> as such snapshots. I.e., I think any remote write to the read set
>> could be "after" the atomic/transaction commits. (Such might be
>> too difficult to get right while still providing any benefit.)
>>
>> (Weird side-thought: I wonder if a conservative filter might be
>> useful for locking, particularly for writer locks. On the one
>> hand, such would increase the pingpong in the filter when writer
>> locks are set/cleared; on the other hand, reader locks could use
>> a remote increment within the filter check atomic to avoid slight
>> cache pollution.)
> 
> Generally one wants the mutex state to be completely isolated. Padded up 
> to at least an L2 cache line, or if using LL/SC perhaps even a 
> reservation granule... Not only properly padded, but correctly aligned 
> on an L2 cache line or a reservation granule boundary. This helps prevent 
> false sharing and makes life a little better for the underlying 
> architecture...
> 
> You also don't want mutex traffic to interfere with the critical 
> section, or locked region if you will...

Another little trick... Sometimes when we over-allocate and align on a 
large enough boundary, we can steal some bits of the pointers... They 
can be used for fun things indeed... ;^)