Path: ...!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: PCIe MSI-X interrupts
Date: Sat, 22 Jun 2024 01:12:50 +0000
Organization: Rocksolid Light
Message-ID: <0faeb2831a76d32cd6fb8cff7b546807@www.novabbs.org>
References: <bb16865f7675526d4e2b87283e28c2c5@www.novabbs.org> <sKmdO.62321$G9_a.28048@fx13.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
	logging-data="760438"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Rslight-Site: $2y$10$zJEDeWbpMhjd3PgTiQGp1es.mgr9MdvPdn2XCdTFs5dWAdb/o3cZ6
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 17430
Lines: 399

Scott Lurndal wrote:

First of all, allow me to express my gratitude for such a well
thought-out response, compared to the miscellaneous ramblings
going on in my head.

> mitchalsup@aol.com (MitchAlsup1) writes:
>>PCIe has an MSI-X interrupt 'capability' which consists of
>>a number (n) of interrupt descriptors and an associated Pending
>>Bit Array where each bit in PBA has a corresponding 128-bit
>>descriptor. A descriptor contains a 64-bit address, a 32-bit
>>message, and a 32-bit vector control word.
>>
>>There are 2-levels of enablement, one at the MSI-X configura-
>>tion control register and one in each interrupt descriptor at
>>vector control bit[31].
>>
>>As the device raises an interrupt, it sets a bit in PBA.
>>
>>When MSI-X is enabled and a bit in PBA is set (1) and the
>>vector control bit[31] is enabled, the device sends a
>>write of the message to the address in the descriptor,
>>and clears the bit in PBA.

> Note that if the interrupt condition is asserted after the
> global enable in the MSI-X capability and the vector enable
> have both been set to allow delivery, the message will be sent to
> the root complex and PBA will not be updated.   (P is for
> pending, and once the message is sent, it's no longer
> pending).  PBA is only updated when the interrupt is masked
> (either function-wide in the capability or per-vector).

So, the interrupt only becomes pending in PBA if it cannot be
sent immediately. Thanks for the clarification.
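
A minimal sketch of that behavior as a toy model, not real device
logic. Assumptions: the class and method names are made up for
illustration, the per-vector bit is treated as a mask, and its
position follows the bit[31] stated above (as far as I recall the
PCIe specification puts the per-vector Mask Bit at bit 0, so
MASK_BIT is kept as a constant that can be changed).

```python
import struct
from dataclasses import dataclass, field

MASK_BIT = 1 << 31  # assumption: position taken from the post; may be bit 0 in the spec


@dataclass
class MsixFunction:
    """Toy model of one function's MSI-X state (illustrative names only)."""
    addrs: list                 # per-vector 64-bit message address
    datas: list                 # per-vector 32-bit message
    ctrls: list                 # per-vector 32-bit vector control word
    func_masked: bool = False   # function-wide mask in the capability
    pba: int = 0                # Pending Bit Array, one bit per vector
    sent: list = field(default_factory=list)  # (address, data) writes sent upstream

    def descriptor(self, v):
        # 64-bit address + 32-bit message + 32-bit vector control = 128 bits
        return struct.pack("<QII", self.addrs[v], self.datas[v], self.ctrls[v])

    def _masked(self, v):
        return self.func_masked or bool(self.ctrls[v] & MASK_BIT)

    def raise_irq(self, v):
        if self._masked(v):
            self.pba |= 1 << v   # cannot be sent now, so it becomes pending
        else:
            # delivered at once; PBA is never touched on this path
            self.sent.append((self.addrs[v], self.datas[v]))

    def unmask(self, v):
        self.ctrls[v] &= ~MASK_BIT
        # a pending interrupt is sent, and its PBA bit cleared, once unmasked
        if (self.pba >> v) & 1 and not self._masked(v):
            self.pba &= ~(1 << v)
            self.sent.append((self.addrs[v], self.datas[v]))
```

Raising an unmasked vector appends to sent and leaves pba at zero;
raising a masked one only sets its PBA bit until unmask() is called.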

>>
>>I am assuming that the MSI-X enable bit is used to throttle

> In my experience the MSI-X function enable and vector enables
> are not modified during runtime, rather the device has control
> registers which allow masking of the interrupt (e.g.
> for AHCI, the MSI message will only be sent if the port
> PxIE (Port n Interrupt Enable) bit corresponding to a
> PxIS (Port n Interrupt Status) bit is set).

So, these degenerated into more masking levels that are not
used very often because other masks can be applied elsewhere.
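
The device-level gating described for AHCI can be sketched in one
line. This is only an illustration of the rule as quoted (a message
goes out when a PxIS status bit has its matching PxIE enable bit
set); the function name is hypothetical and the registers are
modeled as plain 32-bit integers.

```python
def ahci_port_should_interrupt(pxis: int, pxie: int) -> bool:
    """True when any status bit in PxIS has its matching enable bit set in PxIE."""
    return (pxis & pxie & 0xFFFFFFFF) != 0
```

With this style of masking the MSI-X enables can stay fixed at
runtime, since the driver only toggles bits in the device's own
enable register.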

> Granted, AHCI specifies MSI, not MSI-X, but every MSI-X
> device I've worked with operates the same way, with
> device specific interrupt enables for a particular vector.

>>a device so that it sends bursts of interrupts to optimize
>>the caching behavior of the cores handling the interrupts.
>>run applications->handle k interrupts->run applications.
>>A home machine would not use this feature as the interrupt
>>load is small, but a GB server might want more control over when.
>>But does anybody know ??

> Yes, we use MSI-X extensively.  See above.

> There are a number of mechanisms used for interrupt moderation,
> but all generally are independent of the PCI message delivery.
> (e.g. RSS spreads interrupts across multiple target cores,
>  or the Intel 10Ge network adapters interrupt moderation feature).

>>
>>a) device command to interrupt descriptor mapping {
>>There is no mention of the mapping of commands to the device
>>and to these interrupt descriptors. Can anyone supply input
>>or pointers to this mapping?

> Once the message leaves the device, is received by the
> root complex port and is forwarded across the host bridge
> to the system fabric, it's completely under control of
> the host.   On x86, the TLP for the upstream message is
> received and forwarded to the specified address (which is
> the IOAPIC on Intel and the GIC ITS on Arm64).

> The interrupt controller may further mask the interrupt if
> desired or if the interrupt priority is lower than the
> current running priority.

{note to self:: that is why it's a local APIC--it has to be close
enough to see the core's priority.}

Question:: Down below you talk of the various interrupt control-
lers routing an interrupt <finally> to a core. What happens if the 
core has changed its priority by the time the interrupt signal 
arrives, but before it can change the state of the tables in the
interrupt controller that routed said interrupt here ?

>>
>>A single device (such as a SATA drive) might have a queue of
>>outstanding commands that it services in whatever order it
>>thinks best. Many of these commands want to inform some core
>>when the command is complete (or cannot be completed). To do
>>this, device sends a stored interrupt message to the stored
>>service port.

> Each SATA port has a PxIS and PxIE register.   The SATA (AHCI)
> controller MSI configuration can provide one vector per port - the main
> difference between MSI and MSI-X is that the interrupt numbers
> for MSI must be consecutive and there is only one address;
> while for MSI-X each vector has a unique address and a programmable
> data (interrupt number) field.   The interpretation of the data
> of the MSI-X or MSI upstream write is up to the interrupt controller
> and may be virtualized in the interrupt controller.

I see (below) that you (they) migrated all the stuff I thought might
be either in the address or data to the "other side" of HostBridge.
Fair enough.

For what reason are there multiple addresses ? instead of a range
of addresses providing a more globally-scoped service port ?
Perhaps it is an address at the interrupt descriptor, and an
address range at the global interrupt controller. Where different
addresses then mean different things.
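
To make the MSI versus MSI-X difference above concrete, here is a
small sketch. It assumes the usual MSI arrangement, where the device
gets one address and one 16-bit data value and encodes the vector
index in the low-order bits of the data, with a power-of-two vector
count up to 32. The function names and the table representation are
made up for illustration.

```python
def msi_message(address, base_data, nvec, i):
    """MSI: one shared address; vector i goes into the low bits of the data."""
    if nvec not in (1, 2, 4, 8, 16, 32) or not 0 <= i < nvec:
        raise ValueError("MSI allows 1..32 vectors, power of two, 0 <= i < nvec")
    return address, (base_data & ~(nvec - 1) & 0xFFFF) | i


def msix_message(table, i):
    """MSI-X: every table entry carries its own (address, data) pair."""
    return table[i]
```

So the MSI interrupt numbers come out consecutive by construction,
while an MSI-X table can point each vector at a different address
with unrelated data.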

> Note that support for MSI in AHCI is optional (in which case the
> legacy level sensitive PCI INTA/B/C/D signals are used).

> The AHCI standard specification (ahci_1_3.pdf) is available publicly.

>>}
>>I don't really NEED to know this mapping, but knowing would
>>significantly enhance my understanding of what is supposed 
>>to be going on, and thus avoid making crippling errors.
>>
>>b) address space of interrupt service port {
>>The address in the interrupt descriptor points at a service 
>>port (APIC). Since a service port is "not like memory"*, I
>>want to mandate this address be in MMI/O space, and since
>>My 66000 has a full 64-bit address space for MMI/O there is 
>>no burden on the size of MMI/O space--it is already as big
>>as possible on a 64-bit machine. Plus, MMI/O space has the 
>>property of being sequentially consistent whereas DRAM is
>>only cache consistent.

> From the standpoint of the PCIexpress root port, the upstream write
> generated by the device to send the MSI message to the host
> looks just like any other inbound DMA from the device to the
> host.   It is the responsibility of the host bridge and interconnect to
> route the message to the appropriate destination (which generally
> is an interrupt controller, but just as legally could be a
> DRAM address which software polls periodically).

So the message arriving at the top of the PCIe tree is RAW, then
the address gets translated by I/O MMU, and both translated 
address and RAW data are passed forward to its fate.
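
That flow can be sketched as follows. Everything here is a
simplifying assumption for illustration: the I/O MMU is a dict
mapping 4 KiB page numbers (identity when unmapped), the interrupt
controller is a single address window, and the names are invented.
Real host bridges and interrupt remapping hardware are more involved.

```python
PAGE = 4096  # assumed translation granule


def translate(iommu, addr):
    """Page-granular lookup; unmapped pages pass through unchanged (assumption)."""
    page, offset = divmod(addr, PAGE)
    return iommu.get(page, page) * PAGE + offset


def route_inbound_write(addr, data, iommu, irq_base, irq_size, irq_ctrl, dram):
    """Route one upstream write: the address is translated, the data stays raw."""
    translated = translate(iommu, addr)
    if irq_base <= translated < irq_base + irq_size:
        # lands in the interrupt controller's window: treat it as a message
        irq_ctrl.append((translated - irq_base, data))
        return "interrupt-controller"
    # otherwise it is an ordinary DMA write, which software could poll
    dram[translated] = data
    return "dram"
```

The root port itself never interprets the write; only the address
range it finally lands in decides whether it is an interrupt.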

>>
>>Most current architectures just partition a hunk of the 
>>physical address space as MMI/O address space.

> The address field in the MSI-X vector (or MSI-X capability)
> is opaque to hardware below the PCIe root port.

> Our chips recognize the interrupt controller range of
> addresses in the inbound message at the host bridge
> and route the message to the interrupt translation service;
> the destinations in the interrupt controller are simply
> control and status registers in the MMIO space.   The
> ARM64 interrupt controller supports multiple destinations
> with different semantics (SPI and xSPI have one target
> register and LPI has a different target register the address
> of which is programmed into the MSI-X Vector address field).

What I am trying to do is to figure out a means to route the
message to a virtual core's interrupt table such that:: if that
virtual core happens to be running on any physical core, that
the physical core sees the interrupt without delay, and if
the virtual core is not running, the event is properly logged
so when the virtual core runs on a physical core that those
ISRs are performed before any lower priority work is performed.
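
As a rough sketch of the semantics I am after (not a hardware
design): every message is logged in the virtual core's table, it is
serviced at once if the virtual core is resident on a physical core,
and otherwise it waits until the virtual core is next scheduled.
The class, its fields, and the "higher number is higher priority"
rule are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCore:
    priority: int = 0              # current running priority
    running_on: object = None      # physical core id, or None when not running
    pending: dict = field(default_factory=dict)   # priority -> queued messages
    handled: list = field(default_factory=list)   # messages whose ISR has run

    def post(self, prio, msg):
        # always log the event in the table first
        self.pending.setdefault(prio, []).append(msg)
        if self.running_on is not None:
            self.service()         # resident: seen without delay

    def service(self):
        # highest priority first; stop at anything not above the current priority
        for prio in sorted(self.pending, reverse=True):
            if prio <= self.priority:
                break
            self.handled.extend(self.pending.pop(prio))

    def schedule(self, phys_core):
        self.running_on = phys_core
        self.service()             # logged ISRs run before lower-priority work
```

Interrupts at or below the running priority stay logged in pending
until the virtual core lowers its priority and services again.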

{and make this work for any number of physical cores and any
number of virtual cores; where cores can share interrupt 
========== REMAINDER OF ARTICLE TRUNCATED ==========