From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Microarch Club
Date: Mon, 25 Mar 2024 22:17:03 +0000
Organization: Rocksolid Light
Message-ID: <80b47109a4c8c658ca495b97b9b10a54@www.novabbs.org>
References: <uti24p$28fg$1@nnrp.usenet.blueworldhosting.com> <utsrft$1b76a$1@dont-email.me>

BGB-Alt wrote:

> On 3/21/2024 2:34 PM, George Musk wrote:
>> Thought this may be interesting:
>> https://microarch.club/
>> https://www.youtube.com/@MicroarchClub/videos

> At least sort of interesting...

> I guess one of the guys on there did a manycore VLIW architecture with
> the memory local to each of the cores. Seems like an interesting
> approach, though not sure how well it would work on a general purpose
> workload. This is also closer to what I had imagined when I first
> started working on this stuff, but it had drifted more towards a
> slightly more conventional design.

> But, admittedly, this is for small-N cores, 16/32K of L1 with a shared
> L2, seemed like a better option than cores with a very large shared L1
> cache.

You appear to be "starting to get it"; congratulations.

> I am not sure that abandoning a global address space is such a great
> idea, as a lot of the "merits" can be gained instead by using weak
> coherence models (possibly with a shared 256K or 512K or so for each
> group of 4 cores, at which point it goes out to a higher latency global
> bus).
> In this case, the division into independent memory regions could
> be done in software.

Most of the last 50 years has been towards a single global address space.

> It is unclear if my approach is "sufficiently minimal". There is more
> complexity than I would like in my ISA (and effectively turning it into
> the common superset of both my original design and RV64G, doesn't really
> help matters here).

> If going for a more minimal core optimized for perf/area, some stuff
> might be dropped. Would likely drop integer and floating-point divide

I think this is pound foolish even if penny wise.

> again. Might also make sense to add an architectural zero register, and
> eliminate some number of encodings which exist merely because of the
> lack of a zero register (though, encodings are comparably cheap, as the

I got an effective zero register without having to waste a register
name to "get it". My 66000 gives you 32 registers of 64-bits each and
you can put any bit pattern in any register and treat it as you like.
Accessing #0 takes 1/16 of a 5-bit encoding space, and is universally
available.

> internal uArch has a zero register, and effectively treats immediate
> values as a special register as well, ...). Some of the debate is more
> related to the logic cost of dealing with some things in the decoder.

The problem is universal constants. RISCs being notably poor in their
support--however this is better than addressing modes which require
µCode.

> Though, would likely still make a few decisions differently from those
> in RISC-V. Things like indexed load/store,

Absolutely

> predicated ops (with a
> designated flag bit),

Predicated then and else clauses which are branch free.
{{Also good for constant time crypto in need of flow control...}}

> and large-immediate encodings,

Nothing else is so poorly served in typical ISAs.
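
As a rough illustration of the large-immediate point (not taken from the post): in base RISC-V a 32-bit constant is normally built with two instructions, LUI for the upper 20 bits and ADDI for a sign-extended 12-bit remainder, so the upper part has to be rounded to compensate. The sketch below mimics that split in C; the names `imm_split`, `split_imm32` and `rebuild_imm32` are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper type: the two pieces a LUI+ADDI pair would carry. */
typedef struct {
    uint32_t hi20;  /* value for LUI (upper 20 bits, after rounding) */
    int32_t  lo12;  /* value for ADDI (signed, -2048..2047)          */
} imm_split;

/* Split a 32-bit constant into LUI/ADDI-style parts.
   ADDI sign-extends its immediate, so when the low 12 bits are
   >= 0x800 the upper part is bumped by one to compensate. */
static imm_split split_imm32(uint32_t value)
{
    imm_split s;
    s.hi20 = (value + 0x800u) >> 12;
    s.lo12 = (int32_t)(value & 0xFFFu);
    if (s.lo12 >= 0x800)
        s.lo12 -= 0x1000;   /* reinterpret low 12 bits as signed */
    return s;
}

/* Recombine the parts the way the two instructions would (mod 2^32). */
static uint32_t rebuild_imm32(imm_split s)
{
    return (s.hi20 << 12) + (uint32_t)s.lo12;
}
```

A 64-bit constant needs a longer sequence or a load from memory under the same scheme, which is the kind of overhead a large-immediate encoding avoids.
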
> help enough with
> performance (relative to cost)

+40%

> to be worth keeping (though, mostly
> because the alternatives are not so good in terms of performance).

Damage to pipeline ability less than -5%.
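
The remark above about branch-free then/else clauses and constant-time crypto can be approximated in software when the ISA has no predication. A minimal sketch, assuming plain C and a hypothetical helper named `ct_select`: both values are computed and a mask chooses between them, so there is no data-dependent branch in the source. A compiler is still free to lower this to a branch, so real constant-time code has to check the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper: returns a when cond is nonzero, otherwise b.
   The mask is all ones or all zeros, so the selection is done with
   bitwise operations rather than an if/else. */
static uint64_t ct_select(uint64_t cond, uint64_t a, uint64_t b)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}
```

For example, `ct_select(1, 5, 9)` yields 5 and `ct_select(0, 5, 9)` yields 9. Hardware predication gets the same effect without evaluating a mask explicitly.
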