Path: ...!weretis.net!feeder9.news.weretis.net!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.arch
Subject: Re: MSI interrupts
Date: Thu, 27 Mar 2025 17:19:21 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vs41ap$n43$1@reader1.panix.com>
References: <vqto79$335c6$1@dont-email.me> <b1c74762d01f71cc1b8ac838dcf6d4fa@www.novabbs.org> <vrvukp$n29$1@reader1.panix.com> <7a093bbb356e3bda3782c15ca27e98a7@www.novabbs.org>
Injection-Date: Thu, 27 Mar 2025 17:19:21 -0000 (UTC)
Injection-Info: reader1.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="23683"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 21512
Lines: 529

In article <7a093bbb356e3bda3782c15ca27e98a7@www.novabbs.org>,
MitchAlsup1 <mitchalsup@aol.com> wrote:
>On Wed, 26 Mar 2025 4:08:57 +0000, Dan Cross wrote:
>> [snip]
>> 1. `from` may point to a cache line
>> 2. `to` may point to a different cache line.
>> 3. Does your architecture use a stack?
>
>No

You make multiple references to a stack just below.  I
understand stack references are excluded from the atomic event,
but I asked whether the architecture uses a stack.  Does it, or
doesn't it?

>> What sort of alignment criteria do you impose?
>
>On the participating data, none.  On the cache line--cache line.

I meant for data on the stack.  The point is moot, though, since
as you said stack data does not participate in the atomic event.

>> At first blush, it seems to me that
>> the pointers `from_next`, `from_prev`, and `to_next` could be
>> on the stack and if so, will be on at least one cache line,
>> and possibly 2, if the call frame for `MoveElement` spans
>> more than one.
>
>The illustrated code is using 6 participating cache lines.
>Where local variables are kept (stack, registers) does not
>count against the tally of participating cache lines.

Huh.  How would this handle something like an MCS lock, where
the lock node may be on a stack, though accessible globally in
some virtual address space?  (A sketch of the usual MCS acquire
path is below.)

>> Of course, it's impossible to say when this
>> is compiled, but maybe you require that the stack is always
>> aligned to a cache line boundary or something.
>
>Data stack is DoubleWord aligned at all times.

Ok.  So just for context, the point of this question was to
understand whether local data participating in one of these
atomic events could span multiple cache lines.  Since they don't
participate, it doesn't matter vis-a-vis the current discussion.

>> these are all in registers and it's fine.
>
>Local variables are not participating in the event--they are just
>around to guide the event towards conclusion.

Ok.

>> 4. Depending on the size and alignment criteria of the
>> `Element` structure and the placement of the `next` and
>> `prev` elements in the struct (unknowable from this example),
>> `to->next` may be on another cache line.
>> 5. Similarly, `from->next`
>> 6. And `from->prev`
>> 7. And `from_prev->next`
>> 8. And `from_next->prev`
>> 9. And `to_next->prev`.
>>
>> So potentially this code _could_ touch 9 or 10 cache lines, more
>> than you said are supported for a single atomic event.  What
>> happens in that case?  Does the code just always return false?
>> The programmer best make sure the optimizer is keeping those
>> temporary pointers off the stack; good luck with a debug build.
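
For concreteness, the loads and stores being tallied above come
from code shaped roughly like the following.  This is a
reconstruction from the names used in this thread (the original
listing appeared upthread and may differ in detail);
`esmLOCKload`, `esmINTERFERENCE`, and `esmLOCKstore` are the ESM
operations under discussion, not standard C:

  typedef struct Element Element;
  struct Element {
      Element *next;
      Element *prev;
      /* ... payload ... */
  };

  /* Reconstructed sketch, not the original listing. */
  bool
  MoveElement(Element *from, Element *to)
  {
      /* Participating loads: these name the cache lines in the event. */
      Element *from_next = esmLOCKload(from->next);
      Element *from_prev = esmLOCKload(from->prev);
      Element *to_next   = esmLOCKload(to->next);

      if (!esmINTERFERENCE()) {
          /* Unlink `from` from its current list ... */
          from_prev->next = from_next;
          from_next->prev = from_prev;
          /* ... and splice it in immediately after `to`. */
          to->next      = from;
          to_next->prev = from;
          from->prev    = to;
          /* The locked store commits the event: all of the stores
           * above become visible together, or not at all. */
          esmLOCKstore(from->next, to_next);
          return true;
      }
      return false;          /* the event was interfered with */
  }

On that reading, the potentially participating lines are the ones
holding `from`, `to`, `from_next`, `from_prev`, and `to_next`,
plus whatever extra lines the structure layout splits those
fields across, which is exactly the tally in question.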
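
As for the MCS question: in the usual formulation each acquirer
links a queue node, often allocated on its own stack, into a
global queue, and other threads then store into that node.  A
minimal C11 sketch of the acquire path (the names here are
illustrative, not from this thread; the release path is omitted):

  #include <stdatomic.h>
  #include <stdbool.h>

  struct mcs_node {
      _Atomic(struct mcs_node *) next;
      atomic_bool                locked;
  };

  /* The lock word is just a pointer to the tail of the waiter queue. */
  typedef _Atomic(struct mcs_node *) mcs_lock;

  void
  mcs_acquire(mcs_lock *lock, struct mcs_node *self)
  {
      /* `self` is frequently a local, i.e. it lives on the caller's stack. */
      atomic_store_explicit(&self->next, NULL, memory_order_relaxed);
      atomic_store_explicit(&self->locked, true, memory_order_relaxed);

      /* Swing the tail to ourselves; the old tail is our predecessor. */
      struct mcs_node *prev =
          atomic_exchange_explicit(lock, self, memory_order_acq_rel);
      if (prev != NULL) {
          /* Publish our node to the predecessor; when it releases the
           * lock it will clear `locked` in *our* node, i.e. it writes
           * into memory that may sit in our stack frame. */
          atomic_store_explicit(&prev->next, self, memory_order_release);
          while (atomic_load_explicit(&self->locked, memory_order_acquire))
              ;              /* spin locally on our own node */
      }
  }

The tension being asked about is that the releaser stores into
the waiter's node, which may live in a stack frame yet is
globally visible, even though stack data "does not participate"
in an event.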
>>
>> Anyway, let's assume that's not the case, and we don't hit the
>> pathological case with cache lines, and we're not dealing with
>> edge cases at the front or end of the lists (so various next or
>> prev pointers are not NULL).  Why don't I try to explain what I
>> think this code is doing, and you tell me whether I'm right or
>> wrong.
>>
>> If I understand this correctly, the `esmLOCKload`s are simply
>> loads, but they make sure that the compiler emits instructions
>> that tie into your atomic support framework.
>
>So far so good.
>
>> Earlier you wrote,
>> "esmINTERFERENCE is a query on the state of the event and whether
>> a 2nd party has performed a memory action that kills the event."
>> Ok, I take this to mean _at the time of the store_, which
>> appears later in the program.  If it were simply a query at the
>> time it was written in the code, then it seems like it opens a
>> TOCTOU bug (what if something interferes after the check, but
>> before the locked store at the end?).  Or perhaps it signals
>> to the architecture that this is the point to which the
>> processor should roll back, if anything subsequently fails; sort
>> of a `setjmp` kind of thing that conceptually returns twice.
>
>The "if( esmINTERFERENCE() )" not only queries the state of the
>participating cache lines, it sets up a guarding-shadow over
>the STs to follow, such that if killing-interference is detected
>none of the STs are performed AND control is transferred to the
>branch label (outside of the event).  So not only does it query
>the state at its execution, it monitors that over the rest of
>the event.
>
>BUT ALSO: it initializes the ability to NAK SNOOPs.  The thought
>is that if we have reached the manifestation phase of the event
>(that is, ready to do all the stores) then we should be able to
>drive the event forward by not sending the data in response to
>the SNOOP.  We send a NAK or the data, based on priority.
>
>If the SNOOPing core receives a NAK and it is not attempting an
>event, the request is simply reissued.  If it is attempting, then
>its event fails.  So a core benignly accessing data that another
>core is attempting an event upon is delayed by an interconnect
>round trip--but if it is attempting, it knows its event will
>fail and it will either retry from before the event started,
>or it will go to the interference label where other facilities
>are at its disposal.

Ok.

>> Assuming nothing has poisoned the event, the various pointer
>> stores inside the conditional are conditionally performed; the
>> `esmLOCKstore` to `from->next` commits the event, at which
>> point all side effects are exposed and durable.
>
>From a SW perspective, that is accurate enough.
>
>From a HW perspective, the participating STs are performed and
>held in a buffer, and when it is known the event has succeeded,
>a bit is flipped and these writes to memory all appear to have
>been performed in the same cycle.  One cycle earlier, no
>interested 2nd party has seen any of them; the subsequent cycle,
>any interested 2nd party will see them all.
>
>I should also note: ESM works even when there are no caches in
>the system. ...

Ok, good to go.

>> Is that correct?  Let's suppose that it is.
>>
>> But later you wrote,
>>
>>>So, if you want the property whereby the lock disappears on any
>>>control transfer out of the event {exception, interrupt, SVC, SVR, ...};
>>>then you want to use my ATOMIC stuff; otherwise, you can use the
>>>normal ATOMIC primitives everyone and his brother provide.
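
For contrast, the "normal ATOMIC primitives" route cannot do this
splice with a single compare-and-swap, since the update spans
several independent locations; a conventional fallback is to
serialize it through a lock (or to use a fancier lock-free list
algorithm).  A minimal C11 sketch, reusing the `Element` type
from the sketch above (the lock name and its whole-list
granularity are made up purely for illustration):

  #include <stdatomic.h>
  #include <stdbool.h>

  /* One lock over both lists; illustrative only. */
  static atomic_flag list_lock = ATOMIC_FLAG_INIT;

  bool
  MoveElementLocked(Element *from, Element *to)
  {
      while (atomic_flag_test_and_set_explicit(&list_lock,
                                               memory_order_acquire))
          ;                            /* spin until the lock is free */

      Element *from_next = from->next;
      Element *from_prev = from->prev;
      Element *to_next   = to->next;

      from_prev->next = from_next;     /* unlink `from` */
      from_next->prev = from_prev;
      to->next        = from;          /* splice it in after `to` */
      to_next->prev   = from;
      from->prev      = to;
      from->next      = to_next;

      atomic_flag_clear_explicit(&list_lock, memory_order_release);
      return true;
  }

Which is what the follow-up question below is about: if the lock
itself is what an event sets, what exactly does it mean for that
lock to "disappear" on a control transfer out of the event?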
>>
>> Well, what precisely do you mean here when you say, "if you want
>> the property whereby the lock disappears on any control transfer
>> out of the event"?
>
>If you want that property--you use the tools at hand.
>If you don't, just use them as primitive generators.

I wasn't clear enough here.  I'm asking what, exactly, you mean
by this _property_, not what you mean when you write that one
can use these atomic events if one wants the property.  That is,
I'm asking you to precisely define what it means for a lock to
"disappear".

It seems clear enough _now_ that once the event concludes
successfully, the lock value is set to whatever it was set to
during the event, but that wasn't clear to me earlier and I
wanted confirmation.

In particular, it seems odd to me that one would bother with a
lock of some kind during an event if it didn't remain set after
the event.  If you do all of your critical section stuff inside
of the event, and setting the lock in the event is not visible
until the event concludes, why bother?  If, on the other hand,
you use the event to set the lock, why bother doing additional
work inside the event itself?  But in this case I definitely
don't want something else to come along and just unset the lock
on its own, higher priority or not.

>> What I _hope_ you mean is that if you transfer "out of the
>> event" before the `esmLOCKstore` then anything that's been done
>> since the "event" started is rolled back, but NOT if control
>> transfer happens _after_ the `esmLOCKstore`.  If THAT were the
>> case, then this entire mechanism is useless.
>
>One can use an esmLOCKstore( *SP, 0 ) to abandon an event.  HW
>detects that *SP is not participating in the event, and uses

========== REMAINDER OF ARTICLE TRUNCATED ==========