From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Arguments for a sane ISA 6-years later
Date: Sun, 28 Jul 2024 01:27:49 +0000
Message-ID: <34c644c1d46281921163b589b3f5e2ae@www.novabbs.org>
References: <034bc00e088a2cb40307e73ce30dcb2f@www.novabbs.org>

On Sun, 28 Jul 2024 1:01:59 +0000, Paul A. Clayton wrote:

> On 7/25/24 6:07 PM, MitchAlsup1 wrote:
>> On Thu, 25 Jul 2024 20:09:06 +0000, BGB wrote:
>>
>>> On 7/24/2024 3:37 PM, MitchAlsup1 wrote:
> [snip]
>>>> D) exception and interrupt control transfer should take no more
>>>> ..than 1 cache line read followed by 4 cache line reads to the
>>>> ..same page in DRAM/L3/L2 that are dependent on the first cache
>>>> ..line read. Control transfer back to the suspended thread should
>>>> ..be no longer than the control transfer to the exception handler.
> [snip]
>>> A fast, but more expensive, option would be to have multiple
>>> copies of the register file which is then bank-switched on an
>>> interrupt.
>>
>> Under My 66000 a low end implementation can choose the write back
>> cache version, while the GBOoO implementation can choose the bank
>> switcher. In both cases, the same model is presented to executing SW.
>
> I do not know at what port count a "3D register file" (temporal
> banking where extra storage "hides" under the wires) makes sense.
> I suspect the 3-read, 1-write register file of a low end My 66000
> implementation would have the overhead be too great unless lower
> overhead context switching was extremely important.

The low end implementation has a single 4-ported register file. When
running code it is accessed as 3R-1W, but when context switching it is
accessed as 4R or 4W depending on the cycle. The sequencer operates it
like a write back cache, so if the code has not used R16-R23 since
receiving control, those registers are consistent with the registers
already saved in memory, and no writes are necessary.

As to the higher end machine, there would be an SRAM organized as
4 contexts of 32 registers each, where each port can read or write
8×64 bits per cycle, so to bank switch, one does 4 writes and then
4 reads.

In both cases, all the fancy stuff is hidden from SW. In neither case
are there more than 32 actual registers in the file nor are there more
ports than decoders.
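
For anyone who wants to poke at the two schemes, here is a rough behavioral sketch in Python. It is not the My 66000 sequencer, only a toy model built on assumptions of mine: dirtiness is tracked per group of 8 registers (one 8×64-bit line), "used" is simplified to "written", and the cycle count only covers register file port time. The names LowEndRegFile, BankedRegFile, save and switch are made up for illustration.

```python
from dataclasses import dataclass, field

NUM_REGS = 32   # architectural registers (from the post)
GROUP = 8       # assumed tracking granularity: 8 x 64 bits, one line
PORTS = 4       # low end file: 4 ports, used as 4R or 4W while switching
CONTEXTS = 4    # high end SRAM: 4 contexts of 32 registers


@dataclass
class LowEndRegFile:
    """Toy model: one 4-ported file operated like a write back cache."""
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)
    dirty: list = field(default_factory=lambda: [False] * (NUM_REGS // GROUP))

    def write(self, r, value):
        self.regs[r] = value
        self.dirty[r // GROUP] = True      # group no longer matches memory

    def save(self, memory):
        """Write back dirty groups only; return register file cycles used."""
        cycles = 0
        for g, is_dirty in enumerate(self.dirty):
            if not is_dirty:
                continue                   # saved copy is still consistent
            lo = g * GROUP
            memory[lo:lo + GROUP] = self.regs[lo:lo + GROUP]
            cycles += GROUP // PORTS       # 4 registers read per cycle (4R)
            self.dirty[g] = False
        return cycles


@dataclass
class BankedRegFile:
    """Toy model: 32-entry file backed by an SRAM of 4 contexts."""
    sram: list = field(
        default_factory=lambda: [[0] * NUM_REGS for _ in range(CONTEXTS)])
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)
    current: int = 0

    def switch(self, target):
        """Bank switch: 4 wide writes, then 4 wide reads; return the count."""
        accesses = 0
        for g in range(NUM_REGS // GROUP):     # park the outgoing context
            lo = g * GROUP
            self.sram[self.current][lo:lo + GROUP] = self.regs[lo:lo + GROUP]
            accesses += 1
        for g in range(NUM_REGS // GROUP):     # fetch the incoming context
            lo = g * GROUP
            self.regs[lo:lo + GROUP] = self.sram[target][lo:lo + GROUP]
            accesses += 1
        self.current = target
        return accesses
```

In this model, a thread that wrote only R3 and R30 dirties two of the four groups, so save() spends 4 cycles instead of 8, and a second save() with no intervening writes spends none. The banked version always does 8 wide SRAM accesses per switch, and in both models the visible file never holds more than 32 registers.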