From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Making Lemonade (Floating-point format changes)
Date: Mon, 13 May 2024 23:25:25 +0000
Organization: Rocksolid Light
Message-ID: <bcbda29c4c23543d1ed6de8290d1dc3b@www.novabbs.org>
References: <abe04jhkngt2uun1e7ict8vmf1fq8p7rnm@4ax.com>
 <memo.20240512203459.16164W@jgd.cix.co.uk>
 <v1rab7$2vt3u$1@dont-email.me>
 <20240513151647.0000403f@yahoo.com>
 <v1tre1$3leqn$1@dont-email.me>
 <9c79fb24a0cf92c5fac633a409712691@www.novabbs.org>
 <v1u6oi$3o53t$1@dont-email.me>

BGB wrote:

> On 5/13/2024 4:16 PM, MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> Emulation via traps is very slow, but typical for many ISA's is to
>>> just quietly turn the soft-float operations into runtime calls.
>>
>> I recall that MIPS could emulate a TLB table walk in something like
>> 19 cycles. That is:: a few cycles to get there, a hash table access,
>> a check, a TLB install, and a few cycles to get back.
>>
>> On an x86 this would be at least 200 cycles just getting there and back.
>>
> I guess there are different possibilities here...
>
> Trap cost can be reduced, say, by having banked registers.
> But, not so good with explicit save/restore and a large register file.
>
> For example, I can note that a MSP430 at 16MHz can service a 32kHz
> timer... (with a budget of 488 cycles per interrupt).
> But, my BJX2 core (at 50MHz) would have a harder time here, with around
> a 1.5k cycle budget...
>
> Then again, it is possible the per-interrupt overhead would go down
> slightly, since most likely the ISR stack will still be in the L1 cache
> between interrupts (and save/restore overhead should drop to ~ 100
> cycles in the absence of L1 misses).
>
> MSP430 had a slight advantage here (besides fewer registers) in that L1
> misses are not a thing (so, memory access has constant latency).
>
>> So, to revisit your statement::
>>
>> Emulation is slow when trap overhead is large and not-slow when trap
>> overhead is small.
>
> Possible, but I would not expect trap overhead to be lower than runtime
> call overhead...

Yes, of course, trapping can never be quite as inexpensive as a CALL/RET
sequence. But it does not have to be much larger--just a little bit
larger.

> Also (in my case):
> Debugging is rather annoying when dealing with bugs that
> appear/disappear/move around at random or with the slightest
> perturbation...

You need better verification--Oh Wait ...

> But, given for the most part behavior is consistently buggy (and
> manifesting in seemingly the same ways) between both the emulator and
> Verilog implementation, this implies the causal factors are in software.
>
> I guess in this case, either I figure it out, or will need to again go
> back to cooperative scheduling. Seemingly, using preemptive scheduling
> and virtual memory at the same time is particularly unstable (programs
> tend to crash on startup or soon after).
>
> Also I may need to rework how page-in/page-out is handled (and/or how IO
> is handled in general) since if a page swap needs to happen while IO is
> already in progress (such as a page-miss in the system-call process), at
> present, the OS is dead in the water (one can't access the SDcard in the
> middle of a different access to the SDcard).

Having a HyperVisor helps a lot here, with HV taking the page faults of
the OS page fault handler.
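
For reference, the per-interrupt cycle budgets quoted above (488 for the MSP430, roughly 1.5k for BJX2) are just clock rate divided by interrupt rate. A minimal sketch of that arithmetic follows. It assumes "32kHz" means a 32.768 kHz watch-crystal timer, which the post does not state outright, and the function names are made up for illustration. The 100-cycle save/restore figure is the estimate from the quoted text, not a measurement.

```python
def cycles_per_interrupt(clock_hz: float, interrupt_hz: float) -> float:
    """Clock cycles available between two consecutive timer interrupts."""
    return clock_hz / interrupt_hz


def overhead_fraction(budget_cycles: float, overhead_cycles: float) -> float:
    """Share of the per-interrupt budget consumed by fixed trap overhead."""
    return overhead_cycles / budget_cycles


# Assumption: "32kHz" is read as 32,768 Hz (a watch-crystal rate).
TIMER_HZ = 32_768

msp430_budget = cycles_per_interrupt(16_000_000, TIMER_HZ)  # MSP430 at 16 MHz
bjx2_budget = cycles_per_interrupt(50_000_000, TIMER_HZ)    # BJX2 at 50 MHz

# ~100 cycles of save/restore with a warm L1 is the estimate from the post.
bjx2_overhead = overhead_fraction(bjx2_budget, 100)

print(f"MSP430 budget: {msp430_budget:.1f} cycles")
print(f"BJX2 budget:   {bjx2_budget:.1f} cycles")
print(f"BJX2 save/restore share: {bjx2_overhead:.1%}")
```

Under these assumptions the save/restore cost is a single-digit percentage of the BJX2 budget, so the concern is less the steady-state cost than what happens when the ISR stack misses in L1.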