Path: ...!npeer.as286.net!npeer-ng0.as286.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Tonights Tradeoff - Background Execution Buffers
Date: Fri, 27 Sep 2024 08:58:21 -0400
Organization: A noiseless patient Spider
Lines: 77
Message-ID: <vd6a5e$o0aj$2@dont-email.me>
References: <vbgdms$152jq$1@dont-email.me>
 <17537125c53e616e22f772e5bcd61943@www.novabbs.org>
 <vbj5af$1puhu$1@dont-email.me>
 <a37e9bd652d7674493750ccc04674759@www.novabbs.org>
 <vbog6d$2p2rc$1@dont-email.me>
 <f2d99c60ba76af28c8b63b9628fb56fa@www.novabbs.org>
 <vc61e6$21skv$1@dont-email.me> <vc8gl4$2m5tp$1@dont-email.me>
 <vcv5uj$3arh6$1@dont-email.me>
 <37067f65c5982e4d03825b997b23c128@www.novabbs.org>
 <vd352q$3s1e$1@dont-email.me>
 <5f8ee3d3b2321ffa7e6c570882686b57@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 27 Sep 2024 14:58:23 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7c5098c7b1f41ab4d55ddf5c27ceca77";
	logging-data="786771"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18R3ME+Wlz7zYNogGA7f63xA6rCqZ15T/Q="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:ZuG++tUbzfKz8wyZ3m4qksCyykg=
Content-Language: en-US
In-Reply-To: <5f8ee3d3b2321ffa7e6c570882686b57@www.novabbs.org>
Bytes: 5291

On 2024-09-26 10:11 a.m., MitchAlsup1 wrote:
> On Thu, 26 Sep 2024 8:13:12 +0000, Robert Finch wrote:
> 
>> On 2024-09-24 4:38 p.m., MitchAlsup1 wrote:
>>> On Tue, 24 Sep 2024 20:03:29 +0000, Robert Finch wrote:
>>>
>>>> Under construction: Q+ background execution buffers for the block
>>>> memory operations. For instance, a block store operation can be
>>>> executed in the background while other instructions are executing.
>>>> Store operations are issued when the MEM unit is not busy.
>>>> Background instructions continue to execute even when interrupts
>>>> occur. The background operations may be useful for initializing
>>>> blocks of memory that are not needed right away. When the operation
>>>> is issued a handle for the buffer is returned in the destination
>>>> register so that the status of the operation may be queried, or the
>>>> operation cancelled.
>>>
>>> This is how My 66000 performs:: LDM, STM, ENTER, EXIT, MM, and MS.
>>> Addresses are AGENED and then a state machine over in the memory
>>> unit performs the required steps. {{Not usefully different than the
>>> divider performing the individual steps of division.}} While the
>>> unit performs its duties, other units can be fed and complete
>>> other instructions.
>>>
>>> You just have to mark the affected registers to prevent hazards.
>>
>> Q+ releases the registers right away, so things can continue on.
>> Q+ captures the register values at issue then does not modify the
>> registers. Did not want an instruction with three updates happening. It
>> keeps track of its own values. In theory anyway. Have not got to testing
>> it yet. A status operation might be used to query the final operation
>> results.
>>
>> Altering Q+ to use 64-bit instructions and 256 registers instead of
>> supporting a vector instruction set. Two pipeline stages can be removed
>> then and it is a simpler design. Code density will decrease <200%.
>> Relying on software to assign registers for vectors.
>>
>> Also adding a predicate field to instructions. Branches are horrendously
>> slow in this simple implementation. It may be faster to predicate a
>> dozen instructions.
> 
> The depth of predication should be such that if FETCH will "get there"
> by the time the branch "resolves" that number of instructions should
> be predicated.

******

The circular list namer used to supply register tags turns out to impact
performance more than anticipated. It stalled the machine 200+ times in
3,000 instructions, costing almost 10% in performance. It looked great
on short runs of instructions, but not so good once longer runs could be
executed. I had guessed that stalls would be a fraction of a percent.

So a fifo-based name supplier was written. It stalled the machine 22
times in 1,600 instructions, much better than the circular list. But
there may yet be a bug in the name supplier: as far as I know it should
not stall the machine at all, since only available registers should be
in the fifo. There is a check to ensure that the register tag from the
fifo is in fact an available one, and that check seems to be failing
occasionally. Still tons of bugs in the CPU.
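
The two runs are different lengths, so the raw stall counts compare more
easily once normalized per 100 instructions. A quick sketch in Python
using only the counts quoted above; "200+" is treated as exactly 200, so
the circular-list figure is a lower bound, and the function name is just
for illustration:

```python
def stalls_per_100(stalls, instructions):
    """Stall events per 100 instructions executed."""
    return 100.0 * stalls / instructions


# Counts taken from the two runs described above.
circular = stalls_per_100(200, 3000)  # "200+" in the post: lower bound
fifo = stalls_per_100(22, 1600)

print(f"circular list: {circular:.2f} stalls per 100 instructions")
print(f"fifo:          {fifo:.2f} stalls per 100 instructions")
print(f"ratio:         {circular / fifo:.1f}x")
```

That works out to roughly 6.7 versus 1.4 stall events per 100
instructions. It counts stall events only, not how many cycles each
stall costs.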

There was a trick to the fifo-based renamer. Four fifos are each loaded
with ¼ of the register tags at reset. But when registers are freed up,
the tags could be placed on any fifo, so there was the potential that
one fifo would get all the free registers. So a rotator was placed on
the fifo inputs to try to distribute the free registers evenly amongst
the fifos. Fifo sizes had to be a power of two, so each fifo can hold
all the register tags. The circular list (CL) renamer has a much simpler
structure.
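
A rough software model of the four-fifo scheme, as a sketch only and not
the actual Q+ logic. The tag count (64), the one-fifo-per-rename-port
arrangement, the rotator advancing by one per freed tag, and all the
names (FifoRenamer, allocate, free) are assumptions made for
illustration. The assert in allocate() stands in for the availability
check mentioned above.

```python
from collections import deque

NUM_FIFOS = 4
NUM_TAGS = 64  # assumed physical register count, not from the post


class FifoRenamer:
    def __init__(self, num_tags=NUM_TAGS):
        assert num_tags % NUM_FIFOS == 0
        self.fifos = [deque() for _ in range(NUM_FIFOS)]
        self.available = [True] * num_tags
        self.rot = 0  # rotator: which fifo receives the next freed tag
        per_fifo = num_tags // NUM_FIFOS
        # At reset each fifo is loaded with a quarter of the tags.
        for tag in range(num_tags):
            self.fifos[tag // per_fifo].append(tag)

    def allocate(self, port):
        """Pop a tag for rename port `port`; None models a stall."""
        fifo = self.fifos[port]
        if not fifo:
            return None
        tag = fifo.popleft()
        # Availability check: a tag in a fifo should always be free.
        assert self.available[tag], f"tag {tag} in fifo but not free"
        self.available[tag] = False
        return tag

    def free(self, tags):
        """Return freed tags, rotating across the fifos."""
        for tag in tags:
            assert not self.available[tag], f"tag {tag} freed twice"
            self.available[tag] = True
            self.fifos[self.rot].append(tag)
            self.rot = (self.rot + 1) % NUM_FIFOS


renamer = FifoRenamer()
tags = [renamer.allocate(port) for port in range(NUM_FIFOS)]
renamer.free(tags)
print(tags, [len(f) for f in renamer.fifos])
```

Without the rotation in free(), every freed tag would go to whichever
fifo the freeing port is wired to, and a port whose fifo ran dry would
stall even though free tags exist elsewhere. In this model that shows up
as allocate() returning None.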