From: Michael S
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Tue, 17 Sep 2024 13:27:16 +0300

On Tue, 17 Sep 2024 11:48:12 +0200
Terje Mathisen wrote:

> David Brown wrote:
> > On 17/09/2024 08:07, Terje Mathisen wrote:
> >> David Brown wrote:
> >>> On 16/09/2024 10:37, Terje Mathisen wrote:
> >>>> This becomes much simpler in Rust where usize is the only legal
> >>>> index type:
> >>>>
> >>>> Yeah, you have to actually write it as
> >>>>
> >>>>     y = p[x];
> >>>>     x += 1;
> >>>>
> >>>> instead of a single line, but this makes zero difference to the
> >>>> compiler, right?
> >>>>
> >>>
> >>> I don't care much about the compiler - but I don't think this is
> >>> an improvement for the programmer.  (In general, I dislike
> >>> trying to do too much in a single expression or statement, but
> >>> some C constructs are common enough that I am happy with them.
> >>> It would be hard to formulate concrete rules here.)
> >>>
> >>> And the resulting object code is less efficient than you get with
> >>> signed int and "y = p[x++];" (or "y = p[x]; x++;") in C.
> >>
> >> Is that true? I'll have to check godbolt myself if that is really
> >> the case!
> >>
> >
> > It is not true - or at least, it shouldn't be true.  I had thought
> > the Rust code was using the equivalent of a C "unsigned int" here,
> > which would require extra code for wrapping semantics.  But that
> > was just my misunderstanding of Rust and its types - with a 64-bit
> > unsigned type, it should give the same results as C.  However,
> > there's no harm in checking it and letting us know.
>
> No need to check this particular point, Rust's usize was obviously
> designed to be an unsigned type large enough to index into the entire
> addressable memory range, so on a 64-bit platform it has to be 64
> bits.
> >
> > (I've previously shown how "y = p[x++];" in C is less efficient on
> > x86-64 if x is "unsigned int", compared to "int" or 64-bit types
> > for x.)
>
> That's actually surprising to me, I would have guessed any 32-bit
> index would be less efficient than a full-width type, but if the
> idiom is very, very common in C code, then it makes sense to make it
> fast.
>
> Doing so would typically require either sign- or zero-extending all
> 32-bit variables when loaded into a 64-bit register, right?
>
> Terje
>

Taken in isolation, on something like x86-64 or aarch64, where the
result of a 32-bit addition is zero-extended by default, there is no
difference between 32-bit and 64-bit unsigned x.
However, when the statement shown above is part of a sequence, even a
short one, a 64-bit x allows compiler optimizations that are impossible
with a 32-bit x, because a 32-bit unsigned index may wrap around to 0
between the two accesses. E.g.

  y1 = p[x++]
  y2 = p[x++]

On x86-64 with 64-bit x the second load can be implemented as
  mov dstreg, [rcx+rdx*4+4]
On aarch64 with 64-bit x both loads can be folded into a single
'load pair' instruction.
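
For anyone who wants to try this on godbolt, here is a minimal C sketch
of the two-load sequence. It is not code from the thread: the function
names are made up for illustration, and p is assumed to point at int
elements (matching the *4 scale in the x86-64 addressing mode above).
The two second_index helpers only show which element the second load
would touch, so the wraparound case can be checked without a huge array.

```c
#include <assert.h>
#include <stdint.h>

/* Index used by the second p[x++] when x is 32-bit unsigned.
   Unsigned arithmetic wraps modulo 2^32, so UINT32_MAX + 1 is 0. */
uint64_t second_index_u32(uint32_t x)
{
    x++;        /* side effect of the first p[x++] */
    return x;   /* index that the second load uses */
}

/* Same thing with a 64-bit x: no wrap at 2^32, so the second
   element sits directly after the first one in memory. */
uint64_t second_index_u64(uint64_t x)
{
    x++;
    return x;
}

/* The sequence from the post with a 32-bit unsigned index. */
int sum_pair_u32(const int *p, uint32_t x)
{
    int y1 = p[x++];
    int y2 = p[x++];
    return y1 + y2;
}

/* The same sequence with a 64-bit unsigned index. */
int sum_pair_u64(const int *p, uint64_t x)
{
    int y1 = p[x++];
    int y2 = p[x++];
    return y1 + y2;
}

/* Small self-check: both versions agree for ordinary indices and
   differ only in what happens at the 32-bit wrap point. */
void demo(void)
{
    static const int a[4] = { 10, 20, 30, 40 };
    assert(sum_pair_u32(a, 1) == 50);
    assert(sum_pair_u64(a, 1) == 50);
    assert(second_index_u32(UINT32_MAX) == 0);
    assert(second_index_u64(UINT32_MAX) == (uint64_t)UINT32_MAX + 1);
}
```

Compiling sum_pair_u32 and sum_pair_u64 with optimization enabled and
comparing the output should show whether a given compiler folds the
second access into a fixed +4 displacement or a load pair. The exact
instructions will depend on compiler and version.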