Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Mon, 9 Sep 2024 14:58:54 +0300
Organization: A noiseless patient Spider
Lines: 82
Message-ID: <20240909145854.00001e4e@yahoo.com>
References: <2024Aug30.161204@mips.complang.tuwien.ac.at>
	<vb3k0m$1rth7$1@dont-email.me>
	<17d615c6a9e70e9fabe1721c55cfa176@www.novabbs.org>
	<86v7zep35n.fsf@linuxsc.com>
	<20240902180903.000035ee@yahoo.com>
	<vb7ank$3d0c5$1@dont-email.me>
	<20240903190928.00002f92@yahoo.com>
	<vb7idh$3e2af$1@dont-email.me>
	<86seufo11j.fsf@linuxsc.com>
	<vba6qa$3u4jc$1@dont-email.me>
	<1246395e530759ac79805e45b3830d8f@www.novabbs.org>
	<8634m9lga1.fsf@linuxsc.com>
	<vbmb3h$2bfqh$1@dont-email.me>
	<20240909122219.00007f81@yahoo.com>
	<2024Sep9.123034@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 09 Sep 2024 13:58:32 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="45fff2496b15112b5e4e03cadfa28742";
	logging-data="2041887"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19oaQfBKEy1NW4RJOZ4c8n8yY7jner73EU="
Cancel-Lock: sha1:1lP2t5+M8Rq8+WNNFPLlixGKNQ4=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
Bytes: 4046

On Mon, 09 Sep 2024 10:30:34 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

> Michael S <already5chosen@yahoo.com> writes:
> >On Mon, 9 Sep 2024 10:20:00 +0200
> >Terje Mathisen <terje.mathisen@tmsw.no> wrote:  
> >> float invsqrt(float x)
> >> {
> >>    ...
> >>    int32_t ix = *(int32_t *) &x;  
> [...]
> >>    int32_t ix;
> >>    memcpy(&ix, &x, sizeof(ix));  
> ...
> >I don't know if it is always true in more complex cases, where
> >absence of aliasing is less obvious to compiler.  
> 
> Something like
> 
> memmove(*p, *q, 8)
> 
> can be translated to something like
> 
>    0:   48 8b 06                mov    (%rsi),%rax
>    3:   48 89 07                mov    %rax,(%rdi)
> 
> without any aliasing worries, and indeed, gcc-9, gcc-10, and gcc-12
> do that.
> 
> >However, I'd expect that as
> >long as a copied item fits in register, the magic will work equally
> >with both memcpy and memmove.  
> 
> One would hope so, but here's what happens with gcc-12:
> 
> #include <string.h>
> 
> void foo1(char *p, char* q)
> {
>   memcpy(p,q,32);
> }
> 
> void foo2(char *p, char* q)
> {
>   memmove(p,q,32);
> }
> 
> gcc -O3 -mavx2 -c -Wall xxx-memmove.c ; objdump -d xxx-memmove.o:
> 
> 0000000000000000 <foo1>:
>    0:   c5 fa 6f 06             vmovdqu (%rsi),%xmm0
>    4:   c5 fa 7f 07             vmovdqu %xmm0,(%rdi)
>    8:   c5 fa 6f 4e 10          vmovdqu 0x10(%rsi),%xmm1
>    d:   c5 fa 7f 4f 10          vmovdqu %xmm1,0x10(%rdi)
>   12:   c3                      ret
>   13:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
>   1a:   00 00 00 00 
>   1e:   66 90                   xchg   %ax,%ax
> 
> 0000000000000020 <foo2>:
>   20:   ba 20 00 00 00          mov    $0x20,%edx
>   25:   e9 00 00 00 00          jmp    2a <foo2+0xa>
> 
> The jmp in line 25 is probably a tail-call to memmove().
> 
> My guess is that xmm registers and unrolling are used here rather than
> ymm registers because waking up the second 128 bits takes time.  But
> even with that, the code uses two different registers, and if
> scheduled differently, could be used for implementing foo2():
> 
>    0:   c5 fa 6f 06             vmovdqu (%rsi),%xmm0
>    8:   c5 fa 6f 4e 10          vmovdqu 0x10(%rsi),%xmm1
>    4:   c5 fa 7f 07             vmovdqu %xmm0,(%rdi)
>    d:   c5 fa 7f 4f 10          vmovdqu %xmm1,0x10(%rdi)
>   12:   c3                      ret
> 
> - anton

Try -march instead of -mavx2, e.g. -march=haswell.
Sometimes gcc is beyond logic.
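
As for the reordered sequence quoted above (both loads first, then both
stores): here is a minimal C sketch of why that ordering is enough for
memmove semantics on a fixed 32-byte copy. The names
copy32_load_then_store and agrees_with_memmove are made up for this
illustration, and I have not checked what code any particular gcc
version generates for it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy 32 bytes by reading
   the whole source into temporaries before writing anything.  Because
   no store happens before the last load, dst and src may overlap. */
void copy32_load_then_store(char *dst, const char *src)
{
    uint64_t t[4];
    memcpy(t, src, sizeof t);   /* all loads; t is a local, so no overlap */
    memcpy(dst, t, sizeof t);   /* all stores, only after the loads */
}

/* Hypothetical self-check: returns 1 if the helper and memmove leave
   identical buffers for the given offsets (each offset must be <= 64). */
int agrees_with_memmove(size_t dst_off, size_t src_off)
{
    char a[96], b[96];
    for (size_t i = 0; i < sizeof a; i++)
        a[i] = b[i] = (char)i;
    copy32_load_then_store(a + dst_off, a + src_off);
    memmove(b + dst_off, b + src_off, 32);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling agrees_with_memmove(8, 0) or agrees_with_memmove(0, 8) exercises
the two overlapping directions. In principle a compiler could keep the
temporaries in two xmm registers or one ymm register, which is the
scheduling suggested in the quoted text.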