From: Terje Mathisen
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Tue, 17 Sep 2024 08:20:15 +0200

EricP wrote:
> These double-width bit-field straddle operations show up at 32-bits.
> Various FP64 formats (DEC's middle-endian FP being the worst example),
> Intel page table entries and segment/gate descriptors, come to mind.

Lots of them in 32-bit code!

>
> It's just going to take a while for double-width things to show up
> at the 64-bit level. But if FP128 becomes a reality...

If???

> Codecs likely have to deal with double-width straddles a lot, whatever
> the register word size. So for them it likely happens at 64-bits already.

Nothing likely about it: LZ4 is pretty much the only compression
algorithm/lossless codec that never straddles. All the rest tend to treat
the source data as a single bitstream of arbitrary length, except for some
built-in chunking mechanism which simplifies faster scanning.

The core of the algorithm always starts with knowing the endianness, then
picking up 32- or 64-bit chunks of input data (byte-flipping if needed)
and then extracting the next N bits from either the top or the bottom of
the buffer register.

Almost by definition, this is not code that a compiler is set up to help
you get correct.

>
> I added a bunch of instructions for dealing with double-width operations.
> The main ISA design decision is whether to have register pair specifiers,
> R0, R2, R4,... or two separate {r_high,r_low} registers.
> In either case the main uArch issue is that now instructions have an extra
> source register and two dest registers, which has a number of consequences.
> But once you bite the bullet on that it simplifies a lot of things,
> like how to deal with carry or overflow without flags,
> full width multiplies, divide producing both quotient and remainder.

Very nice! This means that you can do integer IMAC(), right?

   (hi, lo) = imac(a, b, c); // == a*b+c

The only thing even nicer from the perspective of writing arbitrary
precision library code would be IMAA, i.e. a*b+c+d, since that is the
largest combination which is guaranteed to never overflow the double
register target field.

Terje

-- 
- "almost all programming can be viewed as an exercise in caching"
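
For anyone who wants to check the IMAA no-overflow claim: with n-bit operands the worst case is (2^n - 1)^2 + 2*(2^n - 1) = 2^(2n) - 1, which exactly fills a 2n-bit result. A minimal sketch in portable C follows, using 32-bit limbs with uint64_t standing in for the register pair. The names pair32, imaa32 and mul_add_row are made up for illustration and do not come from any real ISA or library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a (hi, lo) register pair, using 32-bit halves. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} pair32;

/* Illustrative IMAA: returns a*b + c + d as a double-width (hi, lo) pair.
 * The product is at most 2^64 - 2^33 + 1, so adding two 32-bit values
 * (at most 2^33 - 2 in total) cannot wrap the 64-bit intermediate. */
static pair32 imaa32(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint64_t wide = (uint64_t)a * b;
    wide += c;
    wide += d;
    pair32 r = { (uint32_t)(wide >> 32), (uint32_t)wide };
    return r;
}

/* One row of a schoolbook multiply: acc[0..n-1] += x[0..n-1] * y.
 * The two extra addends of IMAA are the existing accumulator limb and
 * the carry from the previous limb. Returns the carry out of the top limb. */
static uint32_t mul_add_row(uint32_t *acc, const uint32_t *x, size_t n,
                            uint32_t y)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        pair32 r = imaa32(x[i], y, acc[i], carry);
        acc[i] = r.lo;   /* low half stays in this limb */
        carry = r.hi;    /* high half feeds the next limb */
    }
    return carry;
}

/* Worst case: all four inputs at maximum give exactly 2^64 - 1. */
static void imaa32_worst_case_check(void)
{
    pair32 r = imaa32(UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX);
    assert(r.hi == UINT32_MAX && r.lo == UINT32_MAX);
}
```

The inner loop of mul_add_row is the reason this shape is attractive for bignum code: each limb needs product + accumulator + carry-in, and a single a*b+c+d with a double-width result covers that without flags or a separate add-with-carry chain.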