Path: news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.arch
Subject: Re: rep movsb vs. simpler instructions for memcpy/memmove
Date: Fri, 14 Mar 2025 16:20:09 +0200
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <20250314162009.000078cf@yahoo.com>
References: <vpufbv$4qc5$1@dont-email.me> <20250312140915.000010a8@yahoo.com>
 <2025Mar12.174636@mips.complang.tuwien.ac.at>
 <a296144c60c9774898235f505bc4c370@www.novabbs.org>
 <jwvy0x93vb5.fsf-monnier+comp.arch@gnu.org>
 <61cab9791f342672dcbd5dfd539cc5cc@www.novabbs.org>
 <jwv7c4s3n4d.fsf-monnier+comp.arch@gnu.org>
 <3104c2ab707086659d698a0377450527@www.novabbs.org>
 <20250313225516.00004206@yahoo.com> <5jIAP.61128$Xq5f.38323@fx38.iad>
 <20250314001619.000004fa@yahoo.com>
 <2025Mar14.141837@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 14 Mar 2025 15:20:12 +0100 (CET)
Injection-Info: dont-email.me; posting-host="943fe49e92ae1a26c80cdfc20df6ad74";
 logging-data="1515637"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX18ZKhG19443V1Kr/+RYz9H9Jf5wLC/9Jik="
Cancel-Lock: sha1:6VPQSMUXSa6tZeA8xTwF5jX55LU=
X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)

On Fri, 14 Mar 2025 13:18:37 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

>
> As for the "transfer level speed", I would not know why delivering to
> DRAM should be faster than delivering to L3, L2, or L1. On the
> contrary, it seems to me that delivering to DRAM is at least as slow
> as the other variants.
>

Transfer level speed would be faster with DMA, because CPU typically has
no way to issue Read requests for chunks of data that are bigger than 64
bytes. OTOH, DMA resides on device itself and uses as big transfer unit
as appropriate, up to maximum of 4 KB.

In theory, "rep movsb" can generate bigger (than 64B) read transfers, but
I don't believe the state of the art is that advanced yet.

Besides, on all PCI buses, but especially so on PCIe, write transfers
(DMA is doing Write transfers in this case) utilize the bus significantly
better than read transfers. The difference is most pronounced for small
transfers, but on something like 4-lane PCIe Gen4 the difference can be
quite big even when Read transactions use the maximal transfer size.

> In any case, that's not what most uses of memcpy() or memmove(), or
> rep movsb with their synchronous interfaces are about.
>

Agreed.
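
To put rough numbers on the transfer-size argument, here is a
back-of-the-envelope sketch in Python. It is a toy model, not a
description of any real link: it assumes every packet pays a fixed
per-packet overhead (OVERHEAD_BYTES = 24 is an illustrative guess, not a
figure from the post or from the PCIe spec), and it ignores the
request/completion round trip that makes reads worse than writes. The
function names are made up for this example. It only shows why 64-byte
transfers use a link less efficiently than 4 KB ones.

```python
# Toy model: fraction of bytes on the wire that are payload when data is
# moved in fixed-size packets, each paying a fixed per-packet overhead.
# OVERHEAD_BYTES is an assumed, illustrative value (header + framing + CRC),
# not a number taken from the post or from the PCIe specification.
OVERHEAD_BYTES = 24


def payload_efficiency(payload_bytes, overhead_bytes=OVERHEAD_BYTES):
    """Return payload / (payload + overhead) for a single packet."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return payload_bytes / (payload_bytes + overhead_bytes)


def packets_needed(total_bytes, payload_bytes):
    """Return how many packets are needed to move total_bytes (ceiling division)."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return -(-total_bytes // payload_bytes)


if __name__ == "__main__":
    total = 4096  # move one 4 KB block
    for size in (64, 256, 4096):
        n = packets_needed(total, size)
        eff = payload_efficiency(size)
        print(f"{size:5d}-byte transfers: {n:3d} packets, "
              f"payload efficiency {eff:.3f}")
```

Under these assumptions, moving 4 KB in 64-byte pieces takes 64 packets
and pays the per-packet overhead 64 times, while a single 4 KB transfer
pays it once. The real gap between CPU reads and device-initiated writes
also depends on latency and outstanding-request limits, which this
sketch does not model.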