Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: DMA is obsolete
Date: Sat, 3 May 2025 06:32:44 -0400
Organization: A noiseless patient Spider
Lines: 282
Message-ID: <vv4rcd$3bbi0$1@dont-email.me>
References: <vuj131$fnu$1@gal.iecc.com> <5a77c46910dd2100886ce6fc44c4c460@www.novabbs.org> <vv19rs$t2d$1@reader1.panix.com> <2025May2.073450@mips.complang.tuwien.ac.at> <vv2mqb$hem$1@reader1.panix.com> <2025May3.081100@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 03 May 2025 12:32:46 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="8708c899f85eff90a4ac9e3bad19d5f1"; logging-data="3518016"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/fw6fe7iQ47yi65OOqLmEXENUL3lu4vYw="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:qCe7ty0byb3DrZRD868qX761MdU=
Content-Language: en-US
In-Reply-To: <2025May3.081100@mips.complang.tuwien.ac.at>
Bytes: 15434

On 2025-05-03 2:11 a.m., Anton Ertl wrote:
> cross@spitfire.i.gajendra.net (Dan Cross) writes:
>> In article <2025May2.073450@mips.complang.tuwien.ac.at>,
>> Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
>>> I think it's the same thing as Greenspun's tenth rule: First you find
>>> that a classical DMA engine is too limiting, then you find that an A53
>>> is too limiting, and eventually you find that it would be practical to
>>> run the ISA of the main cores. In particular, it allows you to use
>>> the toolchain of the main cores for developing them,
>>
>> These are issues solvable with the software architecture and
>> build system for the host OS.
>
> Certainly, one can work around many bad decisions, and in reality one
> has to work around some bad decisions, but the issue here is not
> whether "the issues are solvable", but which decision leads to better
> or worse consequences.
>
>> The important characteristic is
>> that the software coupling makes architectural sense, and that
>> simply does not require using the same ISA across IPs.
>
> IP? Internet Protocol? Software coupling sounds to me like a concept
> from Constantine out of my software engineering class. I guess you
> did not mean either, but it's unclear what you mean.
>
> In any case, I have made arguments why it would make sense to use the
> same ISA as for the OS for programming the cores that replace DMA
> engines. I will discuss your counterarguments below, but the most
> important one to me seems to be that these cores would cost more than
> with a different ISA. There is something to that, but when the
> application ISA is cheap to implement (e.g., RV64GC), that cost is
> small; it may be more an argument for also selecting the
> cheap-to-implement ISA for the OS/application cores.
>
>> Indeed, consider AMD's Zen CPUs; the PSP/ASP/whatever it's
>> called these days is an ARM core while the big CPUs are x86.
>> I'm pretty sure there's an Xtensa DSP in there to do DRAM
>> timing and PCIe link training.
>
> The PSPs are not programmable by the OS or application programmers, so
> using the same ISA would not benefit the OS or application
> programmers. By contrast, the idea for the DMA replacement engines is
> that they are programmable by the OS and maybe the application
> programmers, and that changes whether the same ISA is beneficial.
>
> What is "ASP/whatever"?
>
>> Similarly with the ME on Intel.
>
> Last I read about it, the ME uses a core developed by Intel with IA-32 or
> AMD64; but in any case, the ME is not programmable by OS or
> application programmers, either.
>
>> A BMC might be running on whatever.
>
> Again, a BMC is not programmable by OS or application programmers.
>
>> We increasingly see ARM
>> based SBCs that have small RISC-V microcontroller-class cores
>> embedded in the SoC for exactly this sort of thing.
>
> That's interesting; it points to RISC-V being cheaper to implement
> than ARM. As for "that sort of thing", they are all not programmable
> by OS or application programmers, so see above.
>
>> Our hardware RoT
>
> ?
>
>> The problem is when such service cores are hidden (as they are
>> in the case of the PSP, SMU, MPIO, and similar components, to
>> use AMD as the example) and treated like black boxes by
>> software. It's really cool that I can configure the IO crossbar
>> in useful ways tailored to specific configurations, but it's much
>> less cool that I have to do what amounts to an RPC over the SMN
>> to some totally undocumented entity somewhere in the SoC to do
>> it. Bluntly, as an OS person, I do not want random bits of code
>> running anywhere on my machine that I am not at least aware of
>> (yes, this includes firmware blobs on devices).
>
> Well, one goes with the other. If you design the hardware for being
> programmed by the OS programmers, you use the same ISA for all the
> cores that the OS programmers program, whereas if you design the
> hardware as programmed by "firmware" programmers, you use a
> cheap-to-implement ISA and design the whole thing such that it is
> opaque to OS programmers and only offers certain capabilities to
> OS programmers.
>
> And that's not just limited to ISAs. A very successful example is the
> way that flash memory is usually exposed to OSs: as a block device
> like a plain old hard disk, and all the idiosyncrasies of flash are
> hidden in the device behind a flash translation layer that is
> implemented by a microcontroller on the device.
>
> What's "SMN"?
>
>>> and you can also
>>> use the facilities of the main cores (e.g., debugging features that
>>> may be absent from the I/O cores) during development.
>>
>> This is interesting, but we've found it more useful going the
>> other way around. We do most of our debugging via the SP.
>> Since the SP is also responsible for system initialization and
>> holding x86 in reset until we're ready for it to start
>> running, it's the obvious nexus for debugging the system
>> holistically.
>
> Sure, for debugging on the core-dump level that's useful. I was
> thinking about watchpoint and breakpoint registers and performance
> counters that one may not want to implement on the DMA-replacement
> core, but that are implemented on the OS/application cores.
>
>>> Marking the binaries that should be able to run on the IO service
>>> processors with some flag, and letting the component of the OS that
>>> assigns processes to cores heed this flag is not rocket science.
>>
>> I agree, that's easy. And yet, mistakes will be made, and there
>> will be tension between wanting to dedicate those CPUs to IO
>> services and wanting to use them for GP programs: I can easily
>> imagine a paper where someone modifies a scheduler to move IO-
>> bound programs to those cores. Using a different ISA obviates
>> most of that, and provides an (admittedly modest) security benefit.
>
> If there really is such tension, that indicates that such cores would
> be useful for general-purpose use. That makes the case for using the
> same ISA even stronger.
>
> As for "mistakes will be made", that also goes the other way: with a
> separate toolchain for the DMA-replacement ISA, there is lots of
> opportunity for mistakes.
>
> As for the "security benefit", where is that supposed to come from? What
> attack scenario do you have in mind where that "security benefit"
> could materialize?
>
>> And if I already have to modify or configure the OS to
>> accommodate the existence of these things in the first place,
>> then accommodating an ISA difference really isn't that much
>> extra work. The critical observation is that a typical SMP view
>> of the world no longer makes sense for the system architecture,
>> and trying to shoehorn that model onto the hardware reality is
>> just going to cause frustration.
>
> The shared-memory multiprocessing view of the world is very
> successful, while distributed-memory computers are limited to
> supercomputing and other areas where hardware cost still dominates
> over software cost (i.e., where the software crisis has not happened
> yet). As an example of the lack of success of the distributed-memory
> paradigm, take the PlayStation 3: programmers found it too hard to
> work with, so they did not use the hardware well, and eventually Sony
> decided to go for an SMP machine for the PlayStation 4 and 5.
>
> OTOH, one can say that the way many peripherals work on
> general-purpose computers is more along the lines of
> distributed memory; but that's probably due to the relative hardware
> and software costs for that peripheral. Sure, the performance
> characteristics are non-uniform (NUMA) in many cases, but 1) caches
> tend to smooth over that, and 2) most of the code is not
> performance-critical, so it just needs to run, which is easier to
> achieve with SMP and harder with distributed memory.
>
> Sure, people have argued for advantages of other models for decades,
> like you do now, but SMP has usually won.
>
>>>>> On the other hand, you buy a motherboard with said ASIC core,
>>>>> and you can boot the MB without putting a big chip in the
>>>>> socket--but you may have to deal with scant DRAM since the
>>>>> big centralized chip contains the memory controller.
>>>>
>>>> A neat hack for bragging rights, but not terribly practical?
>>>
>>> Very practical for updating the firmware of the board to support the
>>> big chip you want to put in the socket (called "BIOS FlashBack" in
>>> connection with AMD big chips).
>>
========== REMAINDER OF ARTICLE TRUNCATED ==========