From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: My 66000 and High word facility
Date: Tue, 20 Aug 2024 00:12:44 +0000
Organization: Rocksolid Light
Message-ID: <4b79296ecb02ea68e4d9b4291ce9867e@www.novabbs.org>
References: <38055f09c5d32ab77b9e3f1c7b979fb4@www.novabbs.org>

On Mon, 19 Aug 2024 23:35:54 +0000, Brett wrote:

> MitchAlsup1 wrote:
>> On Mon, 19 Aug 2024 18:52:39 +0000, Brett wrote:
>>
>>> MitchAlsup1 wrote:
>>>> On Sun, 11 Aug 2024 0:46:09 +0000, Brett wrote:
>>>>
>>>>
>>>> The thing is that once you go down the GBOoO route, your lack of
>>>> registers "namable in ASM" ceases to be a performance degrader.
>>>> With renaming one can have R7 in use 40 times in a 100 instruction
>>>> deep execution window.
>>>
>>> If this was true we would have 16 or even 8 visible registers, and all
>>> would be fine. x86 does mostly fine with 16, of course x86 had fab and
>>> cubic dollar advantages that dwarfed the register limit.
>>
>> Careful, here::
>>
>> x86 has LD-OPs and LD-OP-STs which makes the 16 register file feel more
>> like it has 20-22 registers. Do not underestimate this phenomenon. The
>> gain from 16 to 32 registers is only 3%-ish so one would estimate that
>> 22 registers would have already gained 1/2 of all of what is possible.
>>
>>> 64 separate registers was a bridge too far, but it was an interesting
>>> exercise before it crashed and burned due to the bits being not quite
>>> available. So close, yet so far. I could not make it work.
>>
>> We remain hobbled by the definition of Byte containing exactly 8-bits.
>> It is this which drives the 16-bit and 32-bit instruction sizes; and
>> it is this which drives the sizes of constants used by the instruction
>> stream.
>>
>> 64 registers makes PERFECT sense in a 36-bit (or 72-bit) architecture.
>> But we must all face facts::
>> a) Little Endian Won
>> b) 8-bit Bytes Won
>> c) longer operands are composed of multiple bytes, mostly powers of 2.
>> d) otherwise it is merely an academic exercise.
>>
>
> If you pack 7 instructions in 8 long words that gives you an extra
> nibble, 4 bits.
> You can do lots of four operand dual operations, which may get you back
> the code density lost, while improving performance.

Given 36-bit containers--how do you add 32 or 64-bit constants ??
throw 36-bits at the 32-bit needs case and 72-bits at the 64-bit needs
case ?!?

> 3 instructions packed in 4 longs gives 64 registers plus four operand
> dual instructions.

{{ note 3 instructions in 4 longs is 85.3-bits per instruction::
I suspect you mean 3 instructions in 4 words which is 42.7-bits per
instruction, far more than is needed. You get 14 instructions of 36-bits
in 512-bits (a cache line)}}

Why don't you give it a try !?!

But notice, you are starting out with a much larger instruction--
how are you going to "profitably" utilize all those bits from source
code of typical imperative languages ??
whereas with 32-bit instructions that don't violate the RISC tenets,
I end up needing only 72% of the number of instructions RISC-V needs
(a near 40% pipelined instruction advantage).
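
The packing arithmetic in this exchange is easy to check by hand, but a
small sketch makes the assumptions explicit. It assumes a "long" is
64 bits and a "word" is 32 bits when reproducing the 85.3 and 42.7
figures, and that the "8 long words" in the 7-instruction bundle are
32-bit containers (that reading is what yields exactly one spare
nibble). The function names are hypothetical, chosen only for
illustration.

```python
from fractions import Fraction


def bits_per_instruction(containers, container_bits, instructions):
    """Average bits available per instruction in a packed bundle."""
    return Fraction(containers * container_bits, instructions)


def pack(total_bits, instruction_bits):
    """Return (whole instructions that fit, leftover bits)."""
    return divmod(total_bits, instruction_bits)


def relative_advantage(instruction_ratio):
    """If ISA A needs `instruction_ratio` times the instructions of ISA B,
    return how many more instructions B executes, relative to A."""
    return 1 / instruction_ratio - 1


if __name__ == "__main__":
    # 3 instructions in 4 containers, under both readings of "long".
    print(f"{float(bits_per_instruction(4, 64, 3)):.1f}")  # 64-bit longs
    print(f"{float(bits_per_instruction(4, 32, 3)):.1f}")  # 32-bit words

    # 36-bit instructions in a 512-bit cache line.
    print(pack(512, 36))     # (14, 8): 14 instructions, 8 bits spare

    # 7 instructions of 36 bits in 8 containers of 32 bits (assumed size).
    print(pack(8 * 32, 36))  # (7, 4): the "extra nibble"

    # 72% of the instruction count corresponds to roughly 39% more
    # instructions on the other side.
    print(f"{float(relative_advantage(Fraction(72, 100))):.3f}")
```

Note that the "near 40%" figure is 1/0.72 - 1, i.e. it is measured
relative to the smaller instruction count, not the 28% reduction
measured relative to the larger one.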