Path: ...!2.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Paul A. Clayton" <paaronclayton@gmail.com>
Newsgroups: comp.arch
Subject: Re: 80286 protected mode
Date: Sat, 12 Oct 2024 23:09:27 -0400
Organization: A noiseless patient Spider
Lines: 68
Message-ID: <vefdlb$hc99$1@dont-email.me>
References: <2024Oct6.150415@mips.complang.tuwien.ac.at>
 <memo.20241006163428.19028W@jgd.cix.co.uk>
 <2024Oct7.093314@mips.complang.tuwien.ac.at>
 <7c8e5c75ce0f1e7c95ec3ae4bdbc9249@www.novabbs.org>
 <2024Oct8.092821@mips.complang.tuwien.ac.at>
 <ve5ek3$2jamt$1@dont-email.me>
 <ve6gv4$2o2cj$1@dont-email.me>
 <ve6olo$2pag3$2@dont-email.me>
 <73e776d6becb377b484c5dcc72b526dc@www.novabbs.org>
 <ve7sco$31tgt$1@dont-email.me>
 <2b31e1343b1f3fadd55ad6b87d879b78@www.novabbs.org>
 <ve99fg$38kta$1@dont-email.me>
 <35cb536e6310a38f0269788881cffdaf@www.novabbs.org>
 <veb4j5$3kjt3$2@dont-email.me>
 <ab65eba51e4d4adc988e54df4a5fc7eb@www.novabbs.org>
 <ved03t$1uut$1@dont-email.me>
 <veec3b$8kmg$1@dont-email.me>
 <veeefe$91cc$1@dont-email.me>
 <617c3589c277069092809f18d4449100@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 13 Oct 2024 05:09:32 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="2418c27d2dad02138a68bf1a93d5c026";
 logging-data="569641"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX18EKBODd9AZpg/+Q98E5S4lKp6tUxbcr+8="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.0
Cancel-Lock: sha1:Gvfqu93tVSZSGMX76fHA8V7xbgY=
In-Reply-To: <617c3589c277069092809f18d4449100@www.novabbs.org>
Bytes: 4968

On 10/12/24 2:37 PM, MitchAlsup1 wrote:
> On Sat, 12 Oct 2024 18:17:18 +0000, Brett wrote:
[snip]
>> Worst case the source and dest are in cache, and the count is 150 cycles
>> away in memory.
>> So hundreds of chars could be copied until the value is
>> loaded and that count value could be say 5.
>
> The instruction cannot start until the count in known. You don't start
> an FMAC until all 3 operands are ready, either.

This is not _strictly_ true. Some ARM implementations start an
FMADD before the addend is available when it is known that the
addend will be available in time. This allows dependent
accumulation with a latency equal to that of the ADD part. One
might even be able to start the shift that aligns addend and
product early, since the shift amount is easy to calculate for
normal FP values.

In many microarchitectures, an operation will be scheduled to
execute when an L1 cache hit would be expected to make an operand
available. I.e., the instruction "starts" before the operand is
actually available.

With branch prediction, a branch instruction is "started" before
the condition has been evaluated. Your statement implies that
My 66000 MM implementations will not do such prediction.

In the case of a memory copy, rolling back misspeculation is
potentially much easier than in the general case of a loop with
store operations. Memory copy also facilitates deeper speculation.
The source data can be preserved in memory more readily than
arbitrary sequences of register contents. If both source and
destination start points are known, destination reads can be
translated into source reads within a speculation domain. (The
source could also be prefetched before the destination is known.)

It does seem that My 66000's MM does not completely eliminate the
potential for faster special-case software even if every
implementation is perfect. Software might know that the tail part
of a cache block that is not overwritten is dead data. This
knowledge can avoid a read for ownership of the last destination
block: software could do a cache block zero for the last block and
then copy the data over that. This special case might apply when
appending to a buffer.
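As a rough illustration of the append case just described, here is a
minimal C sketch. Everything in it is an assumption made for
illustration: the 64-byte BLOCK size, the names block_zero() and
append_dead_tail(), and the use of memset() as a portable stand-in for
a real cache-block-zero instruction (e.g., AArch64 DC ZVA or PowerPC
dcbz, whose block sizes are implementation-specific). It only models
the ordering "zero the last block, then copy over it"; it does not
demonstrate any actual saving of a read for ownership.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 64u /* assumed cache block size, for illustration only */

/* Hypothetical stand-in for a cache-block-zero instruction. A real one
   would establish the block in the cache without reading memory. */
static void block_zero(unsigned char *block_start)
{
    memset(block_start, 0, BLOCK);
}

/* Append n bytes at offset 'used' of a buffer whose capacity is a
   multiple of BLOCK (and which is assumed BLOCK-aligned on real
   hardware). Bytes beyond the new end are treated as dead data.
   Returns the new fill level, or 'used' unchanged if there is no room. */
static size_t append_dead_tail(unsigned char *buf, size_t cap, size_t used,
                               const unsigned char *src, size_t n)
{
    assert(cap % BLOCK == 0);
    if (used > cap || n > cap - used)
        return used;

    size_t end = used + n;
    size_t last_block = end - (end % BLOCK); /* offset of block holding 'end' */

    /* Zero the last block only if it is partially written and holds no
       live data, i.e., it starts at or after the old fill level. */
    if (end % BLOCK != 0 && last_block >= used)
        block_zero(buf + last_block);

    memcpy(buf + used, src, n);
    return end;
}
```

The guard on last_block matters: if the append begins and ends inside
the same block, that block still holds live bytes and must not be
zeroed, so the sketch falls back to a plain copy.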
I do not know that adding an MM instruction variant to handle that
special case would be worthwhile.

I am skeptical that all implementations of MM would be perfect,
i.e., perform at least as well as software more specifically
controlling hardware if such control had been provided by the ISA.

E.g., ISA support for byte-masks for stores might not only allow
non-contiguous stores (such as updating more than one field in a
structure while leaving other intermediately placed fields
unchanged) but might have higher performance than a general MM if
the source happened to be replicated in a register.

"Hard cases make bad law" may be generalized to "special cases
make bad (general) interfaces". Clean interfaces that can be
implemented almost optimally have advantages over complicated
interfaces that can theoretically handle more cases optimally if
one uses the proper (highly specific) incantation.
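To make the byte-mask store idea mentioned above concrete, here is a
small software model in C. The function name masked_store8(), the
8-byte width, and the convention that mask bit i selects byte i in
memory order are all assumptions for illustration; this is not the
interface of My 66000 or of any other ISA.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a byte-masked 8-byte store: bit i of 'mask'
   set means byte i of 'value' (in memory order) is written to dst[i];
   bytes whose mask bit is clear are left untouched. */
static void masked_store8(unsigned char *dst, uint64_t value, uint8_t mask)
{
    unsigned char bytes[8];
    memcpy(bytes, &value, sizeof bytes); /* bytes as a plain store would lay them out */
    for (unsigned i = 0; i < 8; i++) {
        if (mask & (1u << i))
            dst[i] = bytes[i];
    }
}
```

With a fill byte replicated across the register (say 0x5A in every
byte), a single call with mask 0x83 would update bytes 0, 1, and 7 of
a record and leave bytes 2 through 6 alone, which is the "more than
one field" case described in the text.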