From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: DMA is obsolete
Date: Fri, 2 May 2025 17:40:09 +0000
Organization: Rocksolid Light
Message-ID: <859cc676b91aa1173e44acf0f2be636b@www.novabbs.org>
References: <vuj131$fnu$1@gal.iecc.com> <da5b3dea460370fc1fe8ad2323da9bc4@www.novabbs.org> <vuvrlr$att$1@reader1.panix.com> <5a77c46910dd2100886ce6fc44c4c460@www.novabbs.org> <vv19rs$t2d$1@reader1.panix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light

On Fri, 2 May 2025 2:15:24 +0000, Dan Cross wrote:

> In article <5a77c46910dd2100886ce6fc44c4c460@www.novabbs.org>,
> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>On Thu, 1 May 2025 13:07:07 +0000, Dan Cross wrote:
>>> In article <da5b3dea460370fc1fe8ad2323da9bc4@www.novabbs.org>,
>>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>>>On Sat, 26 Apr 2025 17:29:06 +0000, Scott Lurndal wrote:
>>>>[snip]
>>>>Reminds me of trying to sell a micro x86-64 to AMD as a project.
>>>>The µ86 is a small x86-64 core made available as IP in Verilog;
>>>>it runs the same ISA as the main GBOoO x86, but is placed "out
>>>>in the PCIe" interconnect--performing I/O services topologically
>>>>adjacent to the device itself. This allows 1 ns access latencies
>>>>to DCRs and OS queueing of DPCs, ... without bothering the GBOoO
>>>>cores.
>>>>
>>>>AMD didn't buy the arguments.
>>>
>>> I can see it either way; I suppose the argument as to whether I
>>> buy it or not comes down to, "it depends". How much control do
>>> I, as the OS implementer, have over this core?
>>
>>Other than it being placed "away" from the centralized cores,
>>it runs the same ISA as the main cores, has longer latency to
>>coherent memory, and shorter latency to device control registers
>>--which is why it is placed close to the device itself:: latency.
>>The big fast centralized core is going to get microsecond latency
>>from an MMI/O device whereas the ASIC version will see a handful
>>of nanoseconds. So the 5 GHz core sees ~1 microsecond while the
>>little ASIC sees 10 nanoseconds. ...
>
> Yes, I get the argument for WHY you'd do it, I just want to make
> sure that it's an ordinary core (albeit one that is far away
> from the sockets with the main SoC complexes) that I interact
> with in the usual manner. Compare to, say, MP1 or MP0 on AMD
> Zen, where it runs its own (proprietary) firmware that I
> interact with via an RPC protocol over an AXI bus, if I interact
> with it at all: most OEMs just punt and run AGESA (we don't).
>
>>> If it is yet another hidden core embedded somewhere deep in the
>>> SoC complex and I can't easily interact with it from the OS,
>>> then no thanks: we've got enough of those between MP0, MP1, MP5,
>>> etc, etc.
>>>
>>> On the other hand, if it's got a "normal" APIC ID, the OS has
>>> control over it like any other LP, and it's coherent with the big
>>> cores, then yeah, sign me up: I've been wanting something like
>>> that for a long time now.
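
As an aside: if the little core shows up with an ordinary APIC ID,
steering a device's interrupts at it need not look any different
from steering them at any other LP. A rough sketch in C of
programming one MSI-X table entry--the struct layout follows the
standard MSI-X table format, but the helper name and the particular
APIC ID are made up for illustration, not anything AMD shipped:

  #include <stdint.h>

  #define MSI_ADDR_BASE 0xFEE00000u        /* xAPIC MSI address window */

  struct msix_entry {                      /* one slot in the MSI-X table */
      volatile uint32_t addr_lo;
      volatile uint32_t addr_hi;
      volatile uint32_t data;
      volatile uint32_t vector_ctrl;
  };

  /* Aim this vector at the peripheral core, physical destination mode. */
  void route_to_peripheral_core(struct msix_entry *e,
                                uint8_t apic_id, uint8_t vector)
  {
      e->addr_lo = MSI_ADDR_BASE | ((uint32_t)apic_id << 12);
      e->addr_hi = 0;
      e->data    = vector;                 /* fixed delivery, edge-triggered */
      e->vector_ctrl &= ~1u;               /* clear the per-vector mask bit */
  }

(With x2APIC and more than 255 APIC IDs you would go through the
IOMMU's interrupt-remapping tables instead, but the idea is the
same.)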
>>
>>It is just a core that is cheap enough to put in ASICs, that
>>can offload some I/O burden without you having to do anything
>>other than setting some bits in some CRs so interrupts are
>>routed to this core rather than some more centralized core.
>
> Sounds good.
>
>>> Consider a virtualization application. A problem with, say,
>>> SR-IOV is that very often the hypervisor wants to interpose some
>>> sort of administrative policy between the virtual function and
>>> whatever it actually corresponds to, but get out of the fast
>>> path for most IO. This implies a kind of offload architecture
>>> where there's some (presumably software) agent dedicated to
>>> handling IO that can be parameterized with such a policy. A
>>
>>Interesting:: Could you cite any literature here!?!
>
> Sure. This paper is a bit older, but gets at the main points:
> https://www.usenix.org/system/files/conference/nsdi18/nsdi18-firestone.pdf
>
> I don't know if the details are public for similar technologies
> from Amazon or Google.
>
>>> core very close to the device could handle that swimmingly,
>>> though I'm not sure it would be enough to do it at (say) line
>>> rate for a 400Gbps NIC or Gen5 NVMe device.
>>
>>I suspect the 400 Gb/s NIC needs a rather BIG core to handle the
>>traffic loads.
>
> Indeed. Part of the challenge for the hyperscalers is in
> meeting that demand while not burning too many host resources,
> which are the thing they're actually selling their customers in
> the first place. A lot of folks are pushing this off to the NIC
> itself, and I've seen at least one team that implemented NVMe in
> firmware on a 100Gbps NIC, exposed via SR-IOV, as part of a
> disaggregated storage architecture.
>
> Another option is to push this to the switch; things like Intel
> Tofino2 were well-positioned for this, but of course Intel, in its
> infinite wisdom and vision, canc'ed Tofino.
>
>>> ...but why x86_64? It strikes me that as long as the _data_
>>> formats vis-a-vis the software-visible ABI are the same, it doesn't
>>> need to use the same ISA. In fact, I can see advantages to not
>>> doing so.
>>
>>Having the remote core run the same OS code as every other core
>>means the OS developers have fewer hoops to jump through. Bug-for-
>>bug compatibility means that clearing of those CRs just leaves
>>the core out in the periphery idling and bothering no one.
>
> Eh... Having to jump through hoops here matters less to me for
> this kind of use case than if I'm trying to use those cores for
> general-purpose compute. Having a separate ISA means I cannot
> accidentally run a program meant only for the big cores on the
> IO service processors. As long as the OS has total control over
> the execution of the core, and it participates in whatever cache
> coherency scheme the rest of the system uses, then the ISA just
> isn't that important.
>
>>On the other hand, you buy a motherboard with said ASIC core,
>>and you can boot the MB without putting a big chip in the
>>socket--but you may have to deal with scant DRAM since the
>>big centralized chip contains the memory controller.
>
> A neat hack for bragging rights, but not terribly practical?
>
> Anyway, it's a neat idea. It's very reminiscent of IBM channel
> controllers, in a way.

It is more like the Peripheral Processors of the CDC 6600, which
ran a CDC 6600 ISA out in the periphery without as much of the
fancy execution.

> - Dan C.
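
Coming back to Dan's SR-IOV point above: what the peripheral core
would run for that case is, in shape, a poll-and-dispatch loop
against a per-VF policy table, punting anything the policy cannot
decide locally to the hypervisor's slow path on a big core. A toy
sketch in C--every name, field, and the queue/doorbell glue is
invented for illustration; a real offload table (see the AccelNet
paper Dan linked) carries far more state than this:

  #include <stdbool.h>
  #include <stdint.h>

  /* Toy per-VF policy: a storage-flavored example. */
  struct vf_policy {
      uint64_t lba_base;         /* LBA window this VF may touch */
      uint64_t lba_len;
      uint32_t iops_budget;      /* crude rate limit */
  };

  struct io_req {                /* one request from a VF submission queue */
      uint16_t vf;
      uint64_t lba;
      uint32_t nblocks;
  };

  /* Platform glue assumed to exist elsewhere; all hypothetical. */
  extern bool next_request(struct io_req *r);          /* poll VF queues  */
  extern void issue_to_device(const struct io_req *r); /* real doorbell   */
  extern void punt_to_hypervisor(const struct io_req *r);

  /* Fast path: runs entirely on the little core next to the device.
     Returns true if the request was handled without waking the host. */
  static bool fast_path(struct vf_policy *pol, const struct io_req *r)
  {
      struct vf_policy *p = &pol[r->vf];

      if (r->lba < p->lba_base ||
          r->lba + r->nblocks > p->lba_base + p->lba_len)
          return false;          /* outside the VF's window: punt */
      if (p->iops_budget == 0)
          return false;          /* over budget: punt */

      p->iops_budget--;
      issue_to_device(r);
      return true;
  }

  void service_loop(struct vf_policy *pol)
  {
      struct io_req r;
      for (;;) {
          if (!next_request(&r))
              continue;
          if (!fast_path(pol, &r))
              punt_to_hypervisor(&r);  /* slow path: involve a big core */
      }
  }

The parameterization knob is exactly the one Dan describes: the
hypervisor owns the policy table in coherent memory, and the little
core only reads and updates it on the fast path.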