Path: ...!news.nobody.at!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Terje Mathisen <terje.mathisen@tmsw.no>
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Tue, 17 Sep 2024 08:20:15 +0200
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <vcb730$3ci7o$1@dont-email.me>
References: <vaqgtl$3526$1@dont-email.me>
 <p1cvdjpqjg65e6e3rtt4ua6hgm79cdfm2n@4ax.com>
 <2024Sep10.101932@mips.complang.tuwien.ac.at> <ygn8qvztf16.fsf@y.z>
 <2024Sep11.123824@mips.complang.tuwien.ac.at> <vbsoro$3ol1a$1@dont-email.me>
 <867cbhgozo.fsf@linuxsc.com> <20240912142948.00002757@yahoo.com>
 <vbuu5n$9tue$1@dont-email.me> <20240915001153.000029bf@yahoo.com>
 <vc6jbk$5v9f$1@paganini.bofh.team> <20240915154038.0000016e@yahoo.com>
 <vc70sl$285g2$4@dont-email.me> <vc73bl$28v0v$1@dont-email.me>
 <OvEFO.70694$EEm7.38286@fx16.iad>
 <32a15246310ea544570564a6ea100cab@www.novabbs.org>
 <vc7a6h$2afrl$2@dont-email.me>
 <50cd3ba7c0cbb587a55dd67ae46fc9ce@www.novabbs.org>
 <vc8qic$2od19$1@dont-email.me> <fCXFO.4617$9Rk4.4393@fx37.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 17 Sep 2024 08:20:16 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="854a9c90e9fcaa895923e39b84a6c872";
	logging-data="3557624"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+n3u4Mx675fRAopy0yvEucYXWPn6AXkLmbgknY3DH+2w=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
 Firefox/91.0 SeaMonkey/2.53.19
Cancel-Lock: sha1:Ua1yc4wTtNKgmvU5yFH48rfxSZo=
In-Reply-To: <fCXFO.4617$9Rk4.4393@fx37.iad>
Bytes: 3997

EricP wrote:
> These double-width bit-field straddle operations show up at 32-bits.
> Various FP64 formats (DEC's middle-endian FP being the worst example),
> Intel page table entries and segment/gate descriptors, come to mind.

Lots of them in 32-bit code!
> 
> It's just going to take a while for double-width things to show up
> at the 64-bit level. But if FP128 becomes a reality...

If???

> Codecs likely have to deal with double-width straddles a lot, whatever
> the register word size. So for them it likely happens at 64-bits already.

Nothing likely about it: LZ4 is pretty much the only compression 
algorithm/lossless codec that never straddles, all the rest tend to 
treat the source data as a single bitstream of arbitrary length, except 
for some built-in chunking mechanism which allows faster scanning.

The core of the algorithm always starts with knowing the endianness, 
then picking up 32 or 64-bit chunks of input data (byte-flipping if 
needed) and then extracting the next N bits either from the top or bottom 
of the buffer register.
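
As a rough illustration of that loop, here is a minimal C sketch of an MSB-first bit reader that extracts from the top of a 64-bit buffer. The names (bitreader, br_init, br_refill, br_get) are made up for this example and do not come from any particular codec. It also refills one byte at a time, which sidesteps host endianness; a tuned decoder would more likely load a whole 32 or 64-bit word and byte-swap it as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; all names are illustrative only. */
typedef struct {
    const uint8_t *src;  /* next unread input byte */
    const uint8_t *end;  /* one past the last input byte */
    uint64_t buf;        /* unread bits, left-aligned: next bit is bit 63 */
    unsigned count;      /* number of valid bits currently in buf */
} bitreader;

static void br_init(bitreader *br, const uint8_t *data, size_t len)
{
    br->src = data;
    br->end = data + len;
    br->buf = 0;
    br->count = 0;
}

/* Top up the buffer byte by byte, placing each new byte just below
   the bits that are already waiting. */
static void br_refill(bitreader *br)
{
    while (br->count <= 56 && br->src < br->end) {
        br->buf |= (uint64_t)*br->src++ << (56 - br->count);
        br->count += 8;
    }
}

/* Return the next n bits (1 <= n <= 32) taken from the top of the buffer.
   A field may straddle any byte or word boundary of the input.
   Reads past the end of the input return zero bits. */
static uint32_t br_get(bitreader *br, unsigned n)
{
    assert(n >= 1 && n <= 32);
    if (br->count < n)
        br_refill(br);
    uint32_t v = (uint32_t)(br->buf >> (64 - n));
    br->buf <<= n;
    br->count = (br->count >= n) ? br->count - n : 0;
    return v;
}
```

With the input bytes A5 C3 FF, successive reads of 3, 7, 6 and 8 bits return 5, 23, 3 and 255; the 7-bit field is the one that straddles the first byte boundary.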

Almost by definition, this is not code that a compiler is set up to help 
you get correct.

> 
> I added a bunch of instructions for dealing with double-width operations.
> The main ISA design decision is whether to have register pair specifiers,
> R0, R2, R4,... or two separate {r_high,r_low} registers.
> In either case the main uArch issue is that now instructions have an extra
> source register and two dest registers, which has a number of consequences.
> But once you bite the bullet on that it simplifies a lot of things,
> like how to deal with carry or overflow without flags,
> full width multiplies, divide producing both quotient and remainder.

Very nice!

This means that you can do integer IMAC(), right?

(hi, lo) = imac(a, b, c); // == a*b+c

The only thing even nicer from the perspective of writing arbitrary 
precision library code would be IMAA, i.e. a*b+c+d since that is the 
largest combination which is guaranteed to never overflow the double 
register target field.
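
The no-overflow claim is easy to check for 64-bit operands: the largest possible value is (2^64-1)^2 + 2*(2^64-1) = 2^128 - 1, which exactly fills the double-width result. Below is a small C sketch of what such an operation buys a bignum inner loop. It assumes the GCC/Clang `unsigned __int128` extension as a stand-in for a register pair, and the names u128pair, imaa and mul_add_1 are hypothetical, not taken from any ISA or library.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a {hi, lo} register pair result. */
typedef struct { uint64_t hi, lo; } u128pair;

/* Hypothetical IMAA: (hi, lo) = a*b + c + d.
   Emulated with unsigned __int128 (a GCC/Clang extension, assumed here).
   The sum cannot wrap: its maximum is exactly 2^128 - 1. */
static u128pair imaa(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
    unsigned __int128 t = (unsigned __int128)a * b + c + d;
    u128pair r = { (uint64_t)(t >> 64), (uint64_t)t };
    return r;
}

/* acc[0..n-1] += x[0..n-1] * y, returning the carry-out limb.
   Each step is a single IMAA: product + accumulator limb + incoming carry,
   with no flags and no separate carry handling. */
static uint64_t mul_add_1(uint64_t *acc, const uint64_t *x, size_t n,
                          uint64_t y)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        u128pair p = imaa(x[i], y, acc[i], carry);
        acc[i] = p.lo;
        carry = p.hi;
    }
    return carry;
}
```

The point of the fourth operand is visible in the loop: with plain IMAC the incoming carry would need a separate add and its own overflow check.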

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"