Path: ...!weretis.net!feeder9.news.weretis.net!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.arch
Subject: Re: MSI interrupts
Date: Mon, 24 Mar 2025 18:55:31 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vrs9r3$pnb$3@reader1.panix.com>
References: <vqto79$335c6$1@dont-email.me> <4603ec2d5082f16ab0588b4b9d6f96c7@www.novabbs.org> <vrrjlp$t52$1@reader1.panix.com> <33d525dc9450a654173f9313f0564224@www.novabbs.org>
Injection-Date: Mon, 24 Mar 2025 18:55:31 -0000 (UTC)
Injection-Info: reader1.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="26347"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 7836
Lines: 162

In article <33d525dc9450a654173f9313f0564224@www.novabbs.org>,
MitchAlsup1 <mitchalsup@aol.com> wrote:
>On Mon, 24 Mar 2025 12:37:13 +0000, Dan Cross wrote:
>> In article <4603ec2d5082f16ab0588b4b9d6f96c7@www.novabbs.org>,
>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>---------------------
>>>My architecture has a mechanism to perform ATOMIC stuff over multiple
>>>instruction time frames that has the property that when a higher priority
>>>thread interferes with a lower priority thread, the HPT wins
>>>and the LPT fails its ATOMIC event.  It is for exactly this reason
>>>that I drag priority through the memory hierarchy--so that real-time
>>>things remain closer to real-time (no or limited priority inversions).
>>
>> Without being able to see this in practice, it's difficult to
>> speculate as to how well it will actually work in real-world
>> scenarios.  What is the scope of what's covered by this atomic
>> thing?
>>
>> Consider something as simple as popping an item off of the front
>> of a queue and inserting it into an ordered singly-linked list:
>> In this case, I'm probably going to want to take a lock on the
>> queue, then lock the list, then pop the first element off of the
>> queue by taking a pointer to the head.  Then I'll set the head
>> to the next element of that item, then walk the list, finding
>> the place to insert (making sure I keep track of the "prev"
>> pointer somehow), then do the dance to insert the item: set the
>> item's next to the thing after where I'm inserting, set the
>> prev's next to the item if it's not nil, or set the list head
>> to the item.  Then I unlock the list and then the queue (suppose
>> for some reason that it's important that I hold the lock on the
>> queue until the element is fully added to the list).
>
>I have almost exactly that in the documentation for my ATOMIC stuff::
>The following code moves a doubly linked element from one place
>in a concurrent data structure to another without any interested
>3rd party being able to see that the element is not within the
>CDS at any time::
>
>BOOLEAN MoveElement( Element *fr, Element *to )
>{
>    Element *fn = esmLOCKload( fr->next );
>    Element *fp = esmLOCKload( fr->prev );
>    Element *tn = esmLOCKload( to->next );
>    esmLOCKprefetch( fn );
>    esmLOCKprefetch( fp );
>    esmLOCKprefetch( tn );
>    if( !esmINTERFERENCE() )
>    {
>        fp->next = fn;
>        fn->prev = fp;
>        to->next = fr;
>        tn->prev = fr;
>        fr->prev = to;
>        esmLOCKstore( fr->next, tn );
>        return TRUE;
>    }
>    return FALSE;
>}
>
>The ESM macros simply tell the compiler to set a bit in the LD/ST
>instructions.  esmINTERFERENCE() is a query on the state of the event:
>whether a 2nd party has performed a memory action that kills
>the event.
>
>It is required to touch every cache line participating in the event
>prior to attempting the ATOMIC update--this is what esmLOCKprefetch
>does.  One can use up to 8 cache lines in a single ATOMIC event.
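The lock-ordered sequence described above (lock the queue, lock the list, pop the head, walk with a trailing "prev" pointer, splice, unlock in reverse order) can be sketched in plain C.  This is a minimal single-CPU illustration, not code from either post; the `Node` type, function names, and the bare `atomic_flag` spinlock are all hypothetical stand-ins:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type: lives on the queue first, then in the
 * sorted list.  Names are illustrative, not from the original post. */
typedef struct Node {
    int key;
    struct Node *next;
} Node;

typedef struct { atomic_flag f; } Spinlock;

static void lock(Spinlock *l)   { while (atomic_flag_test_and_set(&l->f)) ; }
static void unlock(Spinlock *l) { atomic_flag_clear(&l->f); }

static Spinlock qlock = { ATOMIC_FLAG_INIT };
static Spinlock llock = { ATOMIC_FLAG_INIT };

/* Pop the head of the queue and insert it into the ordered list,
 * holding both locks for the whole operation, as the text requires. */
static void move_head_to_sorted(Node **qhead, Node **lhead)
{
    lock(&qlock);
    lock(&llock);

    Node *item = *qhead;            /* pop the first queue element */
    if (item != NULL) {
        *qhead = item->next;

        Node *prev = NULL;          /* walk, tracking the "prev" pointer */
        Node *cur  = *lhead;
        while (cur != NULL && cur->key < item->key) {
            prev = cur;
            cur  = cur->next;
        }
        item->next = cur;           /* splice in after prev ... */
        if (prev != NULL)
            prev->next = item;
        else
            *lhead = item;          /* ... or at the list head */
    }

    unlock(&llock);                 /* unlock list, then queue */
    unlock(&qlock);
}
```

Note how much state the operation touches: two lock words, the queue head, the list head, and an unbounded walk of list nodes — which is the crux of the "more than two or three instructions" objection below.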
>
>Either everyone in the system sees the update, and at the same
>instant in time, or no one sees any update at all.
>
>Technically, there is no one holding a lock ...

Suppose I need to get the first two elements of the list.

>> This isn't complicated, and for a relatively small list it's not
>> expensive.  Depending on the circumstances, I'd feel comfortable
>> doing this with only spinlocks protecting both the queue and
>> the list.  But it's more than two or three instructions, and
>> doesn't feel like something that would fit well into the "may
>> fail atomic" model.
>
>Should be obvious how to change the insertion into a singly linked
>list in a different data structure.
>
>>>Mandating to SW that typical SW must use this mechanism will not fly,
>>>at least in the first several iterations of an OS for the system.
>>>So the std. methods must remain available.
>>>
>>>AND to a large extent this sub-thread is attempting to get a
>>>usefully large number of them annotated.
>>
>> It sounds like the whole model is fairly different, and would
>> require software writers to consider interrupts occurring in
>> places where they wouldn't have to in a more conventional
>> design.  I suspect you'd have a fairly serious porting issue
>> with existing systems, but maybe I misunderstand.
>
>Which is why ID is available.
>
>>>> A valid response here might be, "don't context switch from the
>>>> interrupt handler; use a DPC instead".  That may be valid, but
>>>> it puts a constraint on the software designer that may be onerous
>>>> in practice: suppose the interrupt happens in the scheduler,
>>>> while examining a run queue or something.  A DPC object must be
>>>> available, etc.
>>>
>>>This seems to be onerous on SW mainly because of too many unknowns
>>>to track down and deal with.
>>
>> Yes.
>>
>>>> Further, software must now consider the complexity of
>>>> potentially interruptible critical sections.
>>>> From the standpoint of reasoning about already-complex
>>>> concurrency issues it's simpler to be able to assert that
>>>> (almost) all interrupt delivery can be cheaply disabled
>>>> entirely, save for very special, specific scenarios like
>>>> NMIs.  Potentially switching away from a thread holding a
>>>> spinlock sort of defeats the purpose of a spinlock in the
>>>> first place, which is a mutex primitive designed to avoid
>>>> the overhead of switching.
>>>
>>>Agreed.
>>
>> The idea of priority as described above seems to prize latency
>> above other considerations, but it's not clear to me that that
>> is the right tradeoff.  Exactly what problem is being solved?
>
>The instruction path into and out of handlers is shorter.  Basically
>nobody has to look at the scheduling queues because HW does that for
>you.  For example, the GuestOS interrupt dispatcher is approximately::
>
>    -- +-------------+-----+------------------------+
>    R0 | ----------- | ISR | --- MSI-X message ---- |
>    -- +-------------+-----+------------------------+
>
>Dispatcher:
>    // at this point we are reentrant
>    SR   R7,R0,<39:32>
>    CMP  R2,R7,MaxTable
>    PLO  R2,TTTF
>    SR   R1,R0,<32:0>
>    CALX [IP,R7<<3,Table-.]
>    SVR
>    // if we get here, R0<37:32>
>    // was out of bounds
>
>This does all of the register file saving and restoring, all of
>the DPC/softIRQ stuff, all of the checking for additional ISRs,
>IPIs, ...  It changes priority, privilege, ROOT pointers, and the
>associated ASIDs on the way in, and back again on the way out.

But again, why is that important?  Is that a significant
bottleneck for systems software?  Is it particularly burdensome
for software for some reason?

I explicitly don't want the hardware to have that level of
control in my scheduling subsystem.  This sounds like a solution
looking for a problem.

	- Dan C.
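For contrast with the hardware-mediated dispatch above, the conventional discipline Dan argues for — make interrupt delivery cheaply maskable, and never spin on a lock with interrupts enabled, so the holder cannot be preempted mid-critical-section — is commonly packaged as a save/disable/restore lock primitive (in the general style of kernel `spin_lock_irqsave` APIs).  A minimal user-space model, with a plain boolean standing in for the real privileged interrupt-enable state:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical CPU-local interrupt-enable flag.  In a real kernel these
 * helpers would be privileged operations on the interrupt controller. */
static bool irqs_enabled = true;

static bool local_irq_save(void)
{
    bool was = irqs_enabled;
    irqs_enabled = false;
    return was;
}

static void local_irq_restore(bool was) { irqs_enabled = was; }

typedef struct { atomic_flag f; } Spinlock;

/* Disable interrupts first, then spin: once the lock is held, no
 * interrupt can arrive on this CPU, so the critical section stays
 * short and the lock is never held across a context switch. */
static bool spin_lock_irqsave(Spinlock *l)
{
    bool saved = local_irq_save();
    while (atomic_flag_test_and_set_explicit(&l->f, memory_order_acquire))
        ;
    return saved;
}

static void spin_unlock_irqrestore(Spinlock *l, bool saved)
{
    atomic_flag_clear_explicit(&l->f, memory_order_release);
    local_irq_restore(saved);
}
```

Returning the saved state (rather than unconditionally re-enabling) lets these nest: an inner critical section taken with interrupts already off restores them to off, not on.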