From: Michael S
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Mon, 9 Sep 2024 16:30:50 +0300
Message-ID: <20240909163050.00004ae8@yahoo.com>

On Mon, 09 Sep 2024 12:28:13 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

>
> But when changing the length to 63:
>
> 0000000000000000 :
>    0:   c5 fe 6f 06             vmovdqu (%rsi),%ymm0
>    4:   c5 fe 7f 07             vmovdqu %ymm0,(%rdi)
>    8:   c5 fe 6f 4e 1f          vmovdqu 0x1f(%rsi),%ymm1
>    d:   c5 fe 7f 4f 1f          vmovdqu %ymm1,0x1f(%rdi)
>   12:   c5 f8 77                vzeroupper
>   15:   c3                      ret
>   16:   66 2e 0f 1f 84 00 00    cs nopw 0x0(%rax,%rax,1)
>   1d:   00 00 00
>
> 0000000000000020 :
>   20:   ba 3f 00 00 00          mov    $0x3f,%edx
>   25:   e9 00 00 00 00          jmp    2a
>
> - anton

An interesting question is which code I want in this case.
In the absence of -march options and with -O1|2|3 I want something like
this:

foo2:
    movups (%rsi), %xmm0
    movups 16(%rsi), %xmm1
    movups 32(%rsi), %xmm2
    movups 47(%rsi), %xmm3
    movups %xmm0, (%rdi)
    movups %xmm1, 16(%rdi)
    movups %xmm2, 32(%rdi)
    movups %xmm3, 47(%rdi)
    ret

Without deep thinking I don't see why I would want anything different
for foo1().
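
For readers who prefer C to assembly, here is a rough sketch of the same idea: four 16-byte loads at offsets 0, 16, 32 and 47, all done before the first store, so the copy stays correct when source and destination overlap. The name move63 and the chunk16 type are mine, purely for illustration, and whether a given compiler turns this into the movups sequence above is not something I have checked.

```c
#include <stddef.h>
#include <string.h>

/* 16-byte value type standing in for one XMM register (illustrative). */
typedef struct { unsigned char b[16]; } chunk16;

/* Hypothetical helper: copy exactly 63 bytes from src to dst.
 * All four loads finish before the first store, so the result is
 * correct even when the two 63-byte ranges overlap. */
void move63(void *dst, const void *src)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    chunk16 c0, c1, c2, c3;

    /* Load phase: bytes 0..15, 16..31, 32..47, 47..62. */
    memcpy(&c0, s,      16);
    memcpy(&c1, s + 16, 16);
    memcpy(&c2, s + 32, 16);
    memcpy(&c3, s + 47, 16);   /* overlaps c2 by one byte (byte 47) */

    /* Store phase: same offsets, now relative to the destination. */
    memcpy(d,      &c0, 16);
    memcpy(d + 16, &c1, 16);
    memcpy(d + 32, &c2, 16);
    memcpy(d + 47, &c3, 16);
}
```

The last chunk starts at offset 47 rather than 48 because 63 is not a multiple of 16; re-copying byte 47 is harmless and avoids any byte-granular tail handling.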