From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: 88xxx or PPC
Date: Sat, 9 Mar 2024 04:17:04 +0000
Organization: Rocksolid Light
Message-ID: <f40fa64b4d719b47fb3ab79ca334ebc3@www.novabbs.org>

Paul A. Clayton wrote:

> On 3/6/24 3:00 PM, MitchAlsup1 wrote:
>> Paul A. Clayton wrote:
> [snip]
>>> It seems that 64-bit
>>> stack-pointer-relative accesses could be roughly as fast by using
>>> the offset as the index (each stack frame would be comparable to a
>>> different thread register context; the tradeoffs of extra storage
>>> for multiple stack frames ("multithreading" — alternating between
>>> indexing up and indexing down would provide some utilization
>>> flexibility with low indexing overhead) relative to pushing out
>>> early frames (normal "context switch"); such a cache would
>>> probably be limited in frame size cached.
>>
>> Smells too much like register windows which never outperformed
>> the flat RF from MIPS. In any event, 50% of subroutines need no
>> stack <accesses> and those that do typically only store 3 registers
>> (for restore later).

> Register windows were intended to avoid save/restore overhead by
> retaining values in registers with renaming. A stack cache is
> meant to reduce the overhead of loads and stores to the stack —
> not just preserving and restoring registers. A direct-mapped stack
> cache is not entirely insane. A partial stack frame cache might
> cache up to 256 bytes (e.g.) with alternating frames indexing with
> inverted bits (to reduce interference) — one could even reserve a
> chunk (e.g., 64 bytes) of a frame and not overlapped by limiting
> offset cached to be smaller than the cache.

> Such might be more useful than register windows, but that does
> not mean that it is actually a good option.

If it is such a good option why has it not reached production ??

>>> An L2 register set that can only be accessed for one operand
>>> might be somewhat similar to LD-OP.
>>
>> In high speed designs, there are at least 2 cycles of delay from AGEN
>> to the L2 and 2 cycles of delay back. Even zero cycle access sees at
>> least 4 cycles of latency, 5 if you count AGEN.

> Presumably this is related to the storage technology used as well
> as the capacity.

Purely wire delay due to the size of the L2 cache.
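
For readers trying to picture the quoted stack-cache idea, here is a minimal C sketch of one possible reading of it. It is not from either poster and not a description of any shipped design. Assumptions: 8-byte slots, a 256-byte direct-mapped cache (32 slots), and a cached-offset limit of 192 bytes, chosen so that each of two adjacent frames keeps 64 bytes that the other cannot evict. The names stack_cache_index, CACHE_BYTES, WORD_BYTES and LIMIT_BYTES are hypothetical.

```c
#include <stdio.h>

/* All sizes below are illustrative assumptions, not figures from a real design. */
enum {
    CACHE_BYTES = 256,                      /* "up to 256 bytes (e.g.)"          */
    WORD_BYTES  = 8,                        /* assumed 64-bit slots               */
    CACHE_WORDS = CACHE_BYTES / WORD_BYTES, /* 32 direct-mapped slots             */
    LIMIT_BYTES = 192                       /* assumed cached-offset limit        */
};

/*
 * Map a stack-pointer-relative byte offset to a cache slot.
 * Even call depths index upward from slot 0; odd call depths use the
 * bit-inverted index, so they index downward from slot 31.
 * Returns -1 when the offset lies beyond the cached part of the frame.
 */
static int stack_cache_index(unsigned frame_depth, unsigned sp_offset)
{
    if (sp_offset >= LIMIT_BYTES)
        return -1;                          /* not cached: goes to normal D-cache */

    unsigned word = sp_offset / WORD_BYTES;
    if (frame_depth & 1u)
        word = ~word & (CACHE_WORDS - 1u);  /* inverted bits for alternate frames */
    return (int)word;
}
```

Under these assumptions an even-depth frame occupies slots 0 to 23 and an odd-depth frame occupies slots 31 down to 8. Slots 0 to 7 (the first 64 bytes of the even frame) and slots 24 to 31 (the first 64 bytes of the odd frame) are never shared, which is one way to read "reserve a chunk (e.g., 64 bytes) of a frame and not overlapped". The sketch covers indexing only; tags, frames deeper than two, and spill policy are left out.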