From: Stefan Monnier
Newsgroups: comp.arch
Subject: Re: rep movsb vs. simpler instructions for memcpy/memmove
Date: Thu, 13 Mar 2025 15:53:25 -0400

MitchAlsup1 [2025-03-13 19:35:33] wrote:
[...]
>>> On Thu, 13 Mar 2025 16:43:07 +0000, Stefan Monnier wrote:
>>>> What is different about MM compared to `rep movsb`
[...]
> But they never really "tried all that hard" to make them
> continuously Optimal.

But is there a reason to presume an implementer of My 66000 would have
the luxury of putting more efforts into making MM "optimal" than Intel
put into making `rep movsb`?

> And they have "So Many" extra burdens,

Ah, now you seem to be getting to the kind of answer I was looking for.

> such as when from is MMI/O space access and to is cache coherent, and
> all sorts of other self imposed problems. Using MTRRs one can switch
> the kind of memory to and from point in the middle of a REP MOVs.
> All of which do nothing to make optimality easier.

How does MM avoid those complexities?

> My 66000 happens to know that memory space changes will not happen
> in the middle of these kinds of things (including vectorized Loops).

How does it know? Is it because the ISA just says "don't do that"
(I guess MM would then signal an error if it happens?), or is there some
underlying difference in the way the semantics/cacheability of memory
pages is specified which makes it impossible to give MM a memory range
whose semantics change partway through?

> My compilers don't create such problems for HW to solve. {That is;
> the truly horrific x86 optimality problems don't exist.}

How do compilers get into the picture? I thought they were basically
ignorant of such subtleties of memory caching, as controlled by MTRRs.


        Stefan