From: BGB-Alt
Newsgroups: comp.arch
Subject: Re: "Mini" tags to reduce the number of op codes
Date: Fri, 5 Apr 2024 14:46:35 -0500
Organization: A noiseless patient Spider
References: <6mqu0j1jf5uabmm6r2cb2tqn6ng90mruvd@4ax.com>

On 4/4/2024 10:13 PM, John Savard wrote:
> On some older CPUs, there might be one set of integer opcodes and one
> set of floating-point opcodes, with a status register containing the
> integer precision, and the floating-point precision, currently in use.
>
> The idea was that this would be efficient because most programs only
> use one size of each type of number, so the number of opcodes would be
> the most appropriate, and that status register wouldn't need to be
> reloaded too often.
>
> It's considered dangerous, though, to have a mechanism for changing
> what instructions mean, since this could let malware alter what
> programs do in a useful and sneaky fashion. Memory bandwidth is no
> longer a crippling constraint the way it was back in the days of core
> memory and discrete transistors - at least not for program code, even
> if memory bandwidth for _data_ often limits the processing speed of
> computers.
>
> This is basically because any program that does any real work, taking
> any real length of time to do its job, is going to mostly consist of
> loops that fit in cache. So letting program code be verbose if there
> are other benefits obtained thereby is the current conventional
> wisdom.
>

This was how the FPU worked in SH-4. Reloading some bits in FPSCR would
effectively bank out the current set of FPU instructions (say, between
Single and Double, etc).

It was also how 64-bit operations worked in early 64-bit versions of
BJX1.

Say, there were DQ and JQ bits added to the control register:
  DQ=0: 32-bit for variable-sized operations (like SH-4)
  DQ=1: 64-bit for variable-sized operations.
  JQ=0: 32-bit addressing (SH-4 memory map)
  JQ=1: 48-bit addressing (like the later BJX2 memory map).

The DQ bit would also affect whether one had MOV.W or MOV.Q operations
available.
  DQ=0: Interpret ops as MOV.W (16-bit)
  DQ=1: Interpret ops as MOV.Q (64-bit)

In the DQ=JQ=0 case, it would have been mostly equivalent to SH-4 (and
could still run GCC's compiler output). This was a similar situation to
switching the FPU mode.

Though, a later version of the BJX1 ISA had dropped and repurposed some
encodings, allowing MOV.W and MOV.Q to coexist (and avoiding the need
for the compiler to endlessly toggle this bit), albeit with fewer
addressing modes for the latter.

All this was an issue mostly because SH-4 had used fixed-length 16-bit
instructions, and the encoding space was almost entirely full when I
started (so new instructions required either sacrificing existing
instructions, or using mode bits).

Though, BJX1 did end up with some 32-bit ops, some borrowed from SH-2A
and similar. These were mostly stuck into awkward ad-hoc places in the
16-bit map, so decoding was kind of a pain.

....

When I later rebooted things as my BJX2 project, I effectively dropped
this whole mess and started over (with the caveat that it lost SH-4
compatibility).
However, it has since gained RISC-V compatibility, for better or worse;
RISC-V is at least likely to get slightly better performance than SH-4
(and both ISAs can be 64-bit).

....

> John Savard
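
For illustration, the DQ/JQ mode-bit scheme described above for early
64-bit BJX1 could be modeled roughly as below. This is only a sketch
under stated assumptions: the bit positions (CR_DQ, CR_JQ), the enum,
and the function names are all hypothetical and are not taken from the
actual BJX1 encoding or any real decoder. It only shows the general
idea that one shared opcode slot decodes differently depending on a
control-register bit.

```c
#include <stdint.h>

/* Hypothetical bit positions; the real control-register layout is not
 * given in the post. */
#define CR_DQ (1u << 0)  /* data size mode: 0 = 32-bit, 1 = 64-bit */
#define CR_JQ (1u << 1)  /* address mode:   0 = 32-bit, 1 = 48-bit */

/* What the shared MOV.W/MOV.Q encoding slot currently means. */
typedef enum { OP_MOV_W, OP_MOV_Q } MovKind;

/* The same opcode bits decode as MOV.W or MOV.Q depending on DQ. */
MovKind decode_shared_mov(uint32_t ctrl)
{
    return (ctrl & CR_DQ) ? OP_MOV_Q : OP_MOV_W;
}

/* Memory access width, in bits, of the decoded MOV. */
unsigned mov_width_bits(MovKind kind)
{
    return (kind == OP_MOV_Q) ? 64u : 16u;
}

/* Width, in bits, used by "variable-sized" operations. */
unsigned var_op_width_bits(uint32_t ctrl)
{
    return (ctrl & CR_DQ) ? 64u : 32u;
}

/* Effective address width, in bits, selected by JQ. */
unsigned addr_width_bits(uint32_t ctrl)
{
    return (ctrl & CR_JQ) ? 48u : 32u;
}
```

With ctrl = 0 (DQ=JQ=0) every function returns the SH-4-like answer,
which corresponds to the "mostly equivalent to SH-4" case. A compiler
targeting such a scheme has to track the current mode and emit a
control-register update whenever it needs the other interpretation,
which is the toggling cost mentioned above.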