Path: ...!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: x86S Specification
Date: Thu, 24 Oct 2024 23:31:35 -0500
Organization: A noiseless patient Spider
Lines: 265
Message-ID: <vff6vd$31kl9$1@dont-email.me>
References: <dqfQO.411015$WOde.295848@fx09.iad> <vf6j1l$144cr$1@dont-email.me>
 <3c6510cc947a1b59b62753de4cf98293@www.novabbs.org> <vf6ucr$g6j$1@gal.iecc.com>
 <2024Oct22.172620@mips.complang.tuwien.ac.at> <vf8rov$1jsqv$1@dont-email.me>
 <5d79e4ceda7bf46346a80da098645adc@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 25 Oct 2024 06:31:42 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="73a54a87420ae18ff9fa03b74f73f488";
 logging-data="3199657"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1+aBNlv7Ix+OKAsxP8/Ihodxg2wBDmdrWs="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:bZ97h0j9IqFuB6yZExGx78cQExg=
In-Reply-To: <5d79e4ceda7bf46346a80da098645adc@www.novabbs.org>
Content-Language: en-US
Bytes: 11059

On 10/22/2024 4:13 PM, MitchAlsup1 wrote:
> On Tue, 22 Oct 2024 18:43:40 +0000, BGB wrote:
>
>> On 10/22/2024 10:26 AM, Anton Ertl wrote:
>>>
>>> Several things in this paragraph makes no sense.
>>>
>>> In particular, x86S is a proposal for a reduced version of the stuff
>>> that current Intel and AMD CPUs support: There is full 64-bit support,
>>> and 32-bit user-level support. x86S eliminates a part of the
>>> compatibility path from systems of yesteryear, but not that many
>>> people use these parts nowadays anyway. It's unclear to me what
>>> benefits these changes are supposed to buy (unlike the elimination of
>>> A32/T32 from some ARM chips, which obviously eliminates the whole
>>> A32/T32 decoding path). It seems to me that most of the complexity of
>>> current CPUs would still be there.
>>>
>>> And I certainly prefer a CPU that has more capabilities to one that
>>> has less capabilities. Sometimes I want to run old binaries.
>>>
>>> So what would be my incentive as a user to buy an x86S CPU? Will they
>>> sell them for less? I doubt it.
>>>
>>
>> Yeah, basically my thoughts as well.
>> Business as usual...
>>
>> Main effect it achieves is breaking legacy boot, doesn't seem like it
>> would either save all that much nor "solve" x86's longstanding issues.
>
> Intel needs a better way to exit reset--and that means the MMU/TLBs
> are already up and working at the time reset is exited. This cannot
> be made backwards compatible.
> -------------------------------

I am not sure how this would have much effect on cost either way.

A physical address mode could just be some edge case logic in the MMU (say, whenever there is a TLB miss with MMU disabled, it merely loads an identity mapped address into the TLB).

>>
>> *1: Probably, say (if I were designing the encoding):
>> {Rb+Disp10s] //32-bit encoding
>> {Rb+Ri*FixSc] //32-bit encoding
>> {Rb+Ri*Sc] //64-bit encoding
>> [Rb+Disp33s] //64-bit encoding
>> [Rb+Ri*Sc+Disp11s] //64-bit encoding
>> [Rb+Ri*Sc+Disp33s] //96-bit encoding
>
> [Rb+DISP16] // 32-bit 16 > 10
> [Rb+Ri<<sc] // 32-bit
> [Rb+Ri<<sc+DISP32] // 64-bit 32 > 11
> [Rb+Ri<<sc+DISP64] // 96-bit 64 > 33

One doesn't want to burn too much encoding space...

If the goal is to redesign x86 as a RISC-like ISA, one is likely going to need a lot of space for opcode bits.

This is partly why I was thinking 32 registers rather than 64, along with the smaller immediate fields.

Say, one possible encoding scheme would be to use a similar base format to RISC-V:
  ZZZZZZZ-ttttt-mmmmm-ZZZ-nnnnn-YY-YYYY1 //32-bit op
  ZZZZZZZ-ttttt-mmmmm-ZZZ-nnnnn-YY-YYYY0 //64/96-bit op

Then, say:
  1/2 the 32-bit encoding space is 3R ops:
  1/4 the 32-bit encoding space is 3RI ops:
  Remaining 1/4 for Imm16 and JMP/JCC and similar.
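
As a rough illustration of the base format above, the fields could be split out of (and packed back into) a 32-bit word roughly like this. This is only a sketch, assuming the pattern is written MSB-first (as RISC-V formats usually are), so the trailing 1/0 is bit 0; the function names and field names (z_hi, rt, rm, z_lo, rn, y, is_32bit) are made up for the example and are not from any spec.

```python
# Hypothetical field layout, MSB-first (assumption):
#   ZZZZZZZ-ttttt-mmmmm-ZZZ-nnnnn-YY-YYYY-L
#   31..25  24..20 19..15 14..12 11..7  6..1   0
# (name, low bit position, width)
FIELDS = [
    ("is_32bit", 0, 1),   # trailing bit: 1 = 32-bit op, 0 = 64/96-bit op
    ("y",        1, 6),   # YY-YYYY opcode bits
    ("rn",       7, 5),   # nnnnn
    ("z_lo",    12, 3),   # ZZZ opcode bits
    ("rm",      15, 5),   # mmmmm
    ("rt",      20, 5),   # ttttt
    ("z_hi",    25, 7),   # ZZZZZZZ opcode bits
]

def split_fields(word):
    """Return a dict of the fields of one 32-bit instruction word."""
    word &= 0xFFFFFFFF
    return {name: (word >> lo) & ((1 << width) - 1)
            for name, lo, width in FIELDS}

def pack_fields(fields):
    """Inverse of split_fields: build the 32-bit word from a field dict."""
    word = 0
    for name, lo, width in FIELDS:
        word |= (fields[name] & ((1 << width) - 1)) << lo
    return word

if __name__ == "__main__":
    example = dict(z_hi=0x55, rt=3, rm=17, z_lo=5, rn=31, y=0x2A, is_32bit=1)
    w = pack_fields(example)
    print(hex(w), split_fields(w))
```

With 5-bit register fields this leaves 7+3+6 = 16 opcode bits in the 32-bit form, which is the budget the 1/2, 1/4, 1/4 split above would be carved out of.
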

Say, could burn a 24/25-bit chunk of encoding space on JMP/CALL/JCC
  iiiiiii-iiiii-iiiii-iii-Zcccc-YY-YYYY1
Where:
  cccc is like x86 Jcc condition code, but maybe reuse P and NP for JMP and CALL.

Though, might make sense to do CALL/RET using a link-register rather than the stack, even if x86 traditionally used the stack.

For 64-bit:
  LD/ST/OPLD/OPST: [Rb+Disp10] expands to [Rb+Disp33s]
  LD/ST/OPLD/OPST: [Rb+Ri*Sc] expands to [Rb+Ri*Sc+Disp11s] or Disp17s.
  Remaining bits go to opcode.

Say:
  ZZZZZZZ-ttttt-mmmmm-dss-nnnnn-YY-YYYY1 //MEM [Rm+Rt*Sc]
And:
  iiiiiii-iiiii-iiiii-xxx-xxxxx-xx-xxxx0 -
  ZZZZZZZ-ttttt-mmmmm-dss-nnnnn-YY-YYYY1 //MEM [Rm+Rt*Sc+Disp17s]
And:
  iiiiiii-iiiii-iiiii-iii-iiiii-ii-iiii0 -
  kkkkkkk-kkkkk-kkkkk-xxx-xxxxx-ii-xxxx0 -
  ZZZZZZZ-ttttt-mmmmm-dss-nnnnn-YY-YYYY1 //MEM [Rm+Rt*Sc+Disp33s]

Could maybe use some of the extra bits encoding things like:
  ADD.Q [Rb+Ri*Sc+Disp33s], Imm17s.
Or:
  ADD.Q [Rb+Ri*Sc+Disp17s], Imm33s.

Say, by having a Rn/Imm bit, and a bit to specify which immediate is used as the constant and the other as the displacement.

But, with Disp10 base-forms, might expand to Disp33:
  iiiiiii-iiiii-iiiii-xxx-iiiii-xx-xxxx0 -
  iiiiiZZ-iiiii-mmmmm-dZi-nnnnn-YY-YYYY1 //MEM [Rm+Rt*Sc+Disp17s]

Where the 'd' flag could select between, say:
  "ADD Rn, [Rm+Disp]" or "ADD [Rm+Disp], Rn"
32-bit encodings only allowing a register, whereas 64-bit encodings could allow an immediate.

But, not really sure...


In other news, went and wrote up a spec and threw together Verilog code for a reworked BSR4K/XG3 ISA design:
https://pastebin.com/yfrh50bk

There are still some holes (the spec is missing pretty much all the 2R ops for now), but alas.

A few parts I have decided would not necessarily be carried over, as some newer instructions and the addition of a Zero Register made some amount of the former 2R and 2RI instructions no longer necessary (though, some could still be useful for efficiency; or have other useful roles like format conversion).

To make implementation cheaper/easier for me, it is essentially XG2RV with the bits shuffled around, a few inverted, and some special case changes (changes branch mechanics and some edge cases involving decoding immediate values).

Initially I tried putting the repacking logic at the front end of the ID stage, but (unsurprisingly), synthesis and timing weren't too happy about this... Ended up instead putting the repack logic at the end of the IF stage.

There was another possible idea that I could call BSR4J:
  Would have done a simpler repacking scheme:
  First 16 bits are repacked:
    NMOP-YwYY-nnnn-mmmm => NMOY-mmmm-nnnn-YYPw
  High 16 bits copied unmodified.

So, overall instruction format, seen as 32 bits, could have been:
  ZZZZ-qnmo-oooo-XXXX-NMOY-mmmm-nnnn-YYPw

But, it was admittedly more tempting, if I am going to be repacking anyways, to make an attempt to "un-dog-chew" the instruction format (in an attempt to make it look nicer).

It is not fully settled yet, could jump over to the BSR4J strategy instead if the more aggressive repacking scheme is in fact a bad idea. One arguable merit it does have is that all of the original 4-bit fields remain 4-bit aligned (and converting between XG2 and BSR4J would be significantly less bit-twiddling vs BSR4K; while still achieving the goal of being able to fit it into the same encoding space as RISC-V).

I have yet to decide on some specifics for the mapping of 2R instructions:
  Simpler/cheaper: Use the same repacking as 3R ops for 2R ops;

========== REMAINDER OF ARTICLE TRUNCATED ==========
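
Going back to the BSR4J idea from earlier in the post (NMOP-YwYY-nnnn-mmmm => NMOY-mmmm-nnnn-YYPw), the repack could be modeled as a plain bit permutation, something like the sketch below. Several things here are assumptions rather than anything settled: both patterns are read MSB-first, the "first 16 bits" are taken to be the low half of the 32-bit word (matching the 32-bit view given in the post), the three Y bits keep their relative order, and the multi-bit nnnn/mmmm fields keep their internal bit order. The names DST_FROM_SRC and repack_bsr4j are made up for the example.

```python
# Destination bit -> source bit, for the low 16 bits only (assumed layout).
#   Source: N M O P  Y w Y Y  n n n n  m m m m   (bit 15 .. bit 0)
#   Result: N M O Y  m m m m  n n n n  Y Y P w   (bit 15 .. bit 0)
DST_FROM_SRC = {
    15: 15, 14: 14, 13: 13,      # N, M, O stay in place
    12: 11,                      # first Y moves up next to NMO
    11: 3, 10: 2, 9: 1, 8: 0,    # mmmm moves to bits 11..8
    7: 7, 6: 6, 5: 5, 4: 4,      # nnnn stays in place
    3: 9, 2: 8,                  # remaining two Y bits, order kept
    1: 12,                       # P
    0: 10,                       # w
}

def repack_bsr4j(word):
    """Permute the low 16 bits per DST_FROM_SRC; copy the high 16 bits as-is."""
    low = word & 0xFFFF
    out = 0
    for dst, src in DST_FROM_SRC.items():
        out |= ((low >> src) & 1) << dst
    return (word & 0xFFFF0000) | out

if __name__ == "__main__":
    for w in (0x0000000F, 0x00001000, 0xABCD0400):
        print(f"{w:08X} -> {repack_bsr4j(w):08X}")
```

Since nnnn does not move and mmmm only shifts by two nibbles, the 4-bit fields stay nibble-aligned in this model, which is the property mentioned above.
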