From: Stefan Monnier
Newsgroups: comp.arch
Subject: Re: Is Parallel Programming Hard, And, If So, What Can You Do About It?
Date: Mon, 12 May 2025 21:50:02 -0400
Organization: A noiseless patient Spider
References: <0ec5d195f4732e6c92da77b7e2fa986d@www.novabbs.org>
User-Agent: Gnus/5.13 (Gnus v5.13)

Lawrence D'Oliveiro [2025-05-12 22:35:57] wrote:
> On Mon, 12 May 2025 08:05:56 +0200, Terje Mathisen wrote:
>> For reads it allows the disk to always read full sets of sectors, the
>> following blocks are likely to be needed soon anyway.
> Leave that up to the OS I/O optimization algorithms. Because they know
> things about the data that the drive doesn’t.

But the drive also knows things about the data that the OS can't know
(things that have to do with the physical location of the data on the
platters). Which is why it makes sense for both the OS and the drive to
make their own efforts.

Lawrence D'Oliveiro [2025-05-12 22:39:02] wrote:
> On Mon, 12 May 2025 08:41:57 GMT, Anton Ertl wrote:
>> On SSDs DRAM cache is also used for storing the logical-to-physical
>> sector mapping of the flash translation layer; accessing it on flash is
>> apparently too slow.
> There is a lot of complicated firmware in SSDs to make them look as
> much like a traditional hard drive as possible, so that traditional
> hard drive filesystems can be used unchanged. This firmware has been
> known to have bugs in it.

Bugs are largely tied to "complicated", yes. That said, I've been lucky
enough not to bump into any of them in my years of using SSDs.
I admittedly don't push them very hard.

> Whereas the Linux kernel includes a few filesystems purpose-designed
> for operation on raw flash devices, that integrate wear-levelling etc
> right into the block allocation algorithms. Wouldn’t it be much
> better (more efficient and more reliable) to get rid of most of that
> firmware layer, and use these sorts of filesystems directly?

More reliable, I don't know: to get comparable performance, you'll need
comparable complexity, so probably a comparable number of bugs. Tho
I guess by being exposed to many more eyes (by virtue of being Free
Software), it could have a chance of being more reliable, maybe.

But in any case, your above argument has some problems:

- Those "few filesystems" aren't nearly good enough to compete with
  a normal filesystem running on top of a typical SSD, simply because
  those filesystems have not been designed for those kinds of uses.
  Last I checked, they don't scale very well to TB sizes, for example.
  And they haven't seen nearly as much work put into avoiding stuttering
  and poor performance when the drive is full. More generally, they
  haven't received nearly as much attention as has been invested in
  SSDs' "FTL".
- The experience with flash technology in the Linux kernel for smaller
  devices like home routers and such suggests that doing wear leveling
  in the filesystem is a bad idea because you want to do it over the
  whole device: no big difference if you have a single filesystem on the
  whole drive, but for the general case you want something like UBI, i.e.
  a kind of volume-management system that takes care of spreading the
  writes over the whole drive as well as remapping defective pages, while
  still exposing some of the semantics of flash chips, so you need
  non-standard filesystems on top of that.
- For better or for worse, drive manufacturers simply have not given
  access to the "raw" flash layer. I'm not completely sure why, but
  I get the impression that manufacturers use it as a way to segment the
  market, with different prices for the same flash chips combined with
  different FTLs. But maybe at some point, market conditions will
  change and we'll be able to buy SSDs that can be accessed directly at
  the flash level?

I agree with you in theory, but in practice I think the potential gain
is rather small. Maybe the "block device abstraction" isn't such a bad
choice in the end.


        Stefan
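
A toy sketch of the wear-leveling point above, for illustration only. It
is not how UBI or any real FTL works; the function name, the 8-block
device and the "least-worn block first" policy are all assumptions made
up for the example. It compares leveling inside one busy partition with
leveling over the whole device.

```python
def simulate(num_blocks, erases, pool):
    """Apply `erases` block erases, always choosing the least-worn
    block among those listed in `pool` (hypothetical policy)."""
    erase_counts = [0] * num_blocks
    for _ in range(erases):
        # Pick the block in the allowed pool with the fewest erases so far.
        victim = min(pool, key=lambda b: erase_counts[b])
        erase_counts[victim] += 1
    return erase_counts


NUM_BLOCKS = 8   # assumed tiny device: 8 erase blocks
ERASES = 800     # assumed workload, all aimed at one busy partition

# Leveling done by the filesystem: it only sees its own partition
# (blocks 0-3), so the idle partition's blocks never share the wear.
per_fs = simulate(NUM_BLOCKS, ERASES, pool=range(0, 4))

# Leveling done below the partitions (UBI-like layer): every block on
# the device is a candidate, whichever partition issued the write.
whole_device = simulate(NUM_BLOCKS, ERASES, pool=range(NUM_BLOCKS))

print("per-filesystem leveling:", per_fs)
print("whole-device leveling:  ", whole_device)
```

With these made-up numbers, the worst-case wear per block is halved when
the leveling layer can use the whole device, since the idle partition's
blocks absorb half of the erases.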