From: Terje Mathisen <terje.mathisen@tmsw.no>
Newsgroups: comp.arch
Subject: Re: My 66000 and High word facility
Date: Mon, 12 Aug 2024 08:22:11 +0200
Organization: A noiseless patient Spider
Message-ID: <v9c9mk$3615s$1@dont-email.me>
References: <v98asi$rulo$1@dont-email.me> <38055f09c5d32ab77b9e3f1c7b979fb4@www.novabbs.org> <v991kh$vu8g$1@dont-email.me> <2024Aug11.163333@mips.complang.tuwien.ac.at> <v9b57p$2rkrq$1@dont-email.me> <v9brm4$33kmd$1@dont-email.me>
In-Reply-To: <v9brm4$33kmd$1@dont-email.me>

Brett wrote:
> BGB <cr88192@gmail.com> wrote:
>> On 8/11/2024 9:33 AM, Anton Ertl wrote:
>>> Brett <ggtgp@yahoo.com> writes:
>>>> The lack of CPU's with 64 registers is what makes for a market, that 4%
>>>> that could benefit have no options to pick from.
>>>
>>> They had:
>>>
>>> SPARC: Ok, only 32 GPRs available at a time, but more in hardware
>>> through the Window mechanism.
>>>
>>> AMD29K: IIRC a 128-register stack and 64 additional registers
>>>
>>> IA-64: 128 GPRs and 128 FPRs with register stack and rotating register
>>> files to make good use of them.
>>>
>>> The additional registers obviously did not give these architectures a
>>> decisive advantage.
>>>
>>> When ARM designed A64, when the RISC-V people designed RISC-V, and
>>> when Intel designed APX, each of them had the opportunity to go for 64
>>> GPRs, but they decided not to.  Apparently the benefits do not
>>> outweigh the disadvantages.
>>>
>>
>> In my experience:
>> For most normal code, the advantage of 64 GPRs is minimal;
>> But, there is some code, where it does have an advantage.
>> Mostly involving big loops with lots of variables.
>>
>>
>> Sometimes, it is preferable to be able to map functions entirely to
>> registers, and 64 does increase the probability of being able to do so
>> (though, neither achieves 100% of functions; and functions which map
>> entirely to GPRs with 32 will not see an advantage with 64).
>>
>> Well, and to some extent the compiler needs to be selective about which
>> functions it allows to use all of the registers, since in some cases a
>> situation can come up where the saving/restoring more registers in the
>> prolog/epilog can cost more than the associated register spills.
>
>
> Another benefit of 64 registers is more inlining removing calls.
>
> A call can cause a significant amount of garbage code all around that call,
> as it splits your function and burns registers that would otherwise get
> used.
>
> I can understand the reluctance to go to 6 bit register specifiers, it
> burns up your opcode space and makes encoding everything more difficult.
> But today that is an unserviced market which will get customers to give you
> a look. Put out some vapor ware and see what customers say.

The solution (?) has always looked obvious to me: Some form of Huffman
encoding of register specifiers, so that the most common ones (bottom 16
or 32) require just a small amount of space (as today), and then either
a prefix or a suffix to provide extra bits when you want to use those
higher register numbers.
Mitch's CARRY sets up a single extra register
for a set of operations; a WIDE prefix could contain two extra register
bits for four registers over the next 2 or 3 instructions.

As long as this doesn't make the decoder a speed limiter, it would be
zero cost for regular code and still quite cheap, except for increasing
code size by 33-50% for the inner loops of algorithms that need 64 or
even 128 regs.

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
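
A rough sketch of how such a WIDE prefix might work, to make the idea
and the 33-50% figure concrete. Everything here is assumed for
illustration: the 32-bit word layout, the WIDE_OPCODE value, the field
positions, and the reading that the prefix carries two high bits for
each of four register fields in each of the next three instructions
(2 x 4 x 3 = 24 payload bits). It is not My 66000's CARRY or any real
encoding.

```python
# Hypothetical 32-bit encoding, invented for illustration only.
# Normal word:  bits 31..24 opcode, bits 19..0 four 5-bit register fields.
# WIDE prefix:  bits 31..24 = WIDE_OPCODE, bits 23..0 = 3 slots x 4 fields x 2 bits.
WIDE_OPCODE = 0xFF
BASE_BITS = 5      # register bits in a normal instruction word
EXTRA_BITS = 2     # high register bits supplied by the prefix
FIELDS = 4         # register fields per instruction
SPAN = 3           # instructions covered by one prefix


def make_wide(exts):
    """Pack up to SPAN lists of FIELDS two-bit extensions into a prefix word."""
    word = WIDE_OPCODE << 24
    for i, ext in enumerate(exts):
        for j, e in enumerate(ext):
            word |= e << (EXTRA_BITS * (FIELDS * i + j))
    return word


def encode(program):
    """Encode (opcode, [r0, r1, r2, r3]) tuples, registers in 0..127.

    A prefix is emitted only in front of an instruction that needs it,
    so code that stays within r0..r31 pays nothing.
    """
    words = []
    i = 0
    while i < len(program):
        needs_prefix = any(r >> BASE_BITS for r in program[i][1])
        group = program[i:i + SPAN] if needs_prefix else program[i:i + 1]
        if needs_prefix:
            words.append(make_wide([[r >> BASE_BITS for r in regs]
                                    for _, regs in group]))
        for opcode, regs in group:
            assert opcode != WIDE_OPCODE and all(0 <= r < 128 for r in regs)
            word = opcode << 24
            for j, r in enumerate(regs):
                word |= (r & 0x1F) << (BASE_BITS * j)
            words.append(word)
        i += len(group)
    return words


def decode(words):
    """Inverse of encode: rebuild full 7-bit register numbers."""
    pending = []          # extensions queued by the most recent prefix
    program = []
    for w in words:
        opcode = w >> 24
        if opcode == WIDE_OPCODE:
            pending = [[(w >> (EXTRA_BITS * (FIELDS * i + j))) & 0b11
                        for j in range(FIELDS)] for i in range(SPAN)]
            continue
        ext = pending.pop(0) if pending else [0] * FIELDS
        regs = [(ext[j] << BASE_BITS) | ((w >> (BASE_BITS * j)) & 0x1F)
                for j in range(FIELDS)]
        program.append((opcode, regs))
    return program


# Three instructions that all touch registers above r31.
inner_loop = [(1, [40, 2, 3, 0]), (2, [4, 100, 6, 0]), (3, [7, 8, 127, 0])]
words = encode(inner_loop)
print(len(words), "words for", len(inner_loop), "instructions")
```

With one prefix per three instructions the inner loop grows from three
words to four, the 33% end of the range; a prefix that only covered two
instructions (SPAN = 2 with a correspondingly smaller payload) would
give the 50% end. Code that never leaves the low 32 registers encodes
exactly as before.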