Path: ...!npeer.as286.net!npeer-ng0.as286.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Tonights Tradeoff - Background Execution Buffers
Date: Fri, 27 Sep 2024 08:58:21 -0400
Organization: A noiseless patient Spider
Lines: 77
Message-ID: <vd6a5e$o0aj$2@dont-email.me>
References: <vbgdms$152jq$1@dont-email.me>
 <17537125c53e616e22f772e5bcd61943@www.novabbs.org>
 <vbj5af$1puhu$1@dont-email.me>
 <a37e9bd652d7674493750ccc04674759@www.novabbs.org>
 <vbog6d$2p2rc$1@dont-email.me>
 <f2d99c60ba76af28c8b63b9628fb56fa@www.novabbs.org>
 <vc61e6$21skv$1@dont-email.me>
 <vc8gl4$2m5tp$1@dont-email.me>
 <vcv5uj$3arh6$1@dont-email.me>
 <37067f65c5982e4d03825b997b23c128@www.novabbs.org>
 <vd352q$3s1e$1@dont-email.me>
 <5f8ee3d3b2321ffa7e6c570882686b57@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 27 Sep 2024 14:58:23 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7c5098c7b1f41ab4d55ddf5c27ceca77"; logging-data="786771"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18R3ME+Wlz7zYNogGA7f63xA6rCqZ15T/Q="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:ZuG++tUbzfKz8wyZ3m4qksCyykg=
Content-Language: en-US
In-Reply-To: <5f8ee3d3b2321ffa7e6c570882686b57@www.novabbs.org>
Bytes: 5291

On 2024-09-26 10:11 a.m., MitchAlsup1 wrote:
> On Thu, 26 Sep 2024 8:13:12 +0000, Robert Finch wrote:
>
>> On 2024-09-24 4:38 p.m., MitchAlsup1 wrote:
>>> On Tue, 24 Sep 2024 20:03:29 +0000, Robert Finch wrote:
>>>
>>>> Under construction: Q+ background execution buffers for the block
>>>> memory operations. For instance, a block store operation can be
>>>> executed in the background while other instructions are executing.
>>>> Store operations are issued when the MEM unit is not busy.
>>>> Background instructions continue to execute even when interrupts
>>>> occur. The background operations may be useful for initializing
>>>> blocks of memory that are not needed right-away.
>>>> When the operation is issued a handle for the buffer is returned in
>>>> the destination register so that the status of the operation may be
>>>> queried, or the operation cancelled.
>>>
>>> This is how My 66000 performs:: LDM, STM, ENTER, EXIT, MM, and MS.
>>> Addresses are AGENED and then a state machine over in the memory
>>> unit performs the required steps. {{Not usefully different than the
>>> divider performing the individual steps of division.}} While the
>>> unit performs its duties, other units can be fed and complete
>>> other instructions.
>>>
>>> You just have to mark the affected registers to prevent hazards.
>>
>> Q+ releases the registers right away, so things can continue on.
>> Q+ captures the register values at issue then does not modify the
>> registers. Did not want an instruction with three updates happening. It
>> keeps track of its own values. In theory anyway. Have not got to testing
>> it yet. A status operation might be used to query the final operation
>> results.
>>
>> Altering Q+ to use 64-bit instructions and 256 registers instead of
>> supporting a vector instruction set. Two pipeline stages can be removed
>> then and it is a simpler design. Code density will decrease <200%.
>> Relying on software to assign registers for vectors.
>>
>> Also adding a predicate field to instructions. Branches are horrendously
>> slow in this simple implementation. It may be faster to predicate a
>> dozen instructions.
>
> The depth of predication should be such that if FETCH will "get there"
> by the time the branch "resolves" that number of instructions should
> be predicated.

******

The circular list namer used to supply register tags turns out to impact
performance more than anticipated.
It stalled the machine 200+ times in 3,000 instructions, costing almost
10% in performance. It looked great on short runs of instructions, but
on longer runs, not so good. I had guessed that stalls would be a
fraction of a percent.

So a fifo-based name supplier was written; it stalled the machine 22
times in 1,600 instructions (roughly one stall per 73 instructions,
versus one per 15 or fewer for the circular list). Much better than the
circular list. But there may yet be a bug in the name supplier: as far
as I know, it should not stall the machine at all, since only available
registers should be in the fifo. There is a check to ensure that the
register tag from the fifo is in fact an available one, and that check
seems to be failing occasionally. Still tons of bugs in the CPU.

There was a trick to the fifo-based renamer. Four fifos are each loaded
with ¼ of the register tags at reset. But when registers are freed up,
the tags could be placed on any fifo, so there was the potential that
one fifo would get all the free registers. So a rotator was placed on
the fifo inputs to try to distribute the free registers evenly amongst
the fifos. Fifo sizes had to be a power of two, so each fifo can hold
all the register tags. The circular list (CL) renamer has a much
simpler structure.
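
The rotator idea can be sketched as a small software model. This is not the Q+ hardware, only an illustration under stated assumptions: the class name FifoRenamer, the methods alloc/free/occupancy, the 16-tag size, and the policy that each rename slot pulls from its own fifo are all made up for the example. It shows only how rotating the free-port-to-fifo mapping spreads freed tags when frees arrive mostly on one port.

```python
from collections import deque


class FifoRenamer:
    """Toy model of a multi-fifo free-tag supplier (illustrative only)."""

    def __init__(self, num_tags=16, num_fifos=4, use_rotator=True):
        assert num_tags % num_fifos == 0
        self.num_fifos = num_fifos
        self.use_rotator = use_rotator
        per_fifo = num_tags // num_fifos
        # At reset each fifo is loaded with 1/num_fifos of the tags.
        self.fifos = [deque(range(i * per_fifo, (i + 1) * per_fifo))
                      for i in range(num_fifos)]
        self.available = [True] * num_tags
        self.rotate = 0   # current rotation of free ports onto fifos
        self.stalls = 0

    def alloc(self, slot):
        """Take a tag for rename slot `slot`; None models a stall."""
        fifo = self.fifos[slot]
        if not fifo:
            self.stalls += 1
            return None
        tag = fifo.popleft()
        # Sanity check: a tag in a fifo must be an available one.
        assert self.available[tag], "fifo handed out a tag that is in use"
        self.available[tag] = False
        return tag

    def free(self, tags):
        """Return up to num_fifos tags in one cycle, one per free port."""
        assert len(tags) <= self.num_fifos
        for port, tag in enumerate(tags):
            assert not self.available[tag], "double free"
            self.available[tag] = True
            self.fifos[(port + self.rotate) % self.num_fifos].append(tag)
        if self.use_rotator:
            # Advance the rotation so port 0 feeds a different fifo next time.
            self.rotate = (self.rotate + 1) % self.num_fifos

    def occupancy(self):
        return [len(f) for f in self.fifos]


def demo(use_rotator):
    r = FifoRenamer(num_tags=16, num_fifos=4, use_rotator=use_rotator)
    # Drain every fifo: four cycles, four rename slots per cycle.
    tags = [r.alloc(slot) for _ in range(4) for slot in range(4)]
    # Free eight tags, one per cycle, always arriving on free port 0.
    for tag in tags[:8]:
        r.free([tag])
    return r.occupancy()
```

In this model, demo(False) piles all eight freed tags onto the first fifo, while demo(True) leaves two in each. A stall here means the fifo feeding a given slot is empty even though other fifos may still hold tags; whether that is the cause of the remaining stalls in the real design is not established.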