Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Tonights Tradeoff
Date: Wed, 11 Sep 2024 23:37:22 -0400
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <vbtnlj$22nu$1@dont-email.me>
References: <vbgdms$152jq$1@dont-email.me>
 <17537125c53e616e22f772e5bcd61943@www.novabbs.org>
 <vbj5af$1puhu$1@dont-email.me>
 <a37e9bd652d7674493750ccc04674759@www.novabbs.org>
 <vbog6d$2p2rc$1@dont-email.me> <vboqpp$2r5v4$1@dont-email.me>
 <vbpmqr$30vto$1@dont-email.me> <vbqcds$35l1q$2@dont-email.me>
 <vbs7ff$3koub$1@dont-email.me> <vbse3j$f01n$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 12 Sep 2024 05:37:24 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="cda5ef7a46dacac920df66c1f90c09de";
	logging-data="68350"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19BKTmvxgeCGHYqKbuuGnN4K2DKZDHQ89Q="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:J7ouAptdvBZ0PeHhXa0afM9L19Y=
Content-Language: en-US
In-Reply-To: <vbse3j$f01n$2@dont-email.me>

On 2024-09-11 11:48 a.m., Stephen Fuld wrote:
> On 9/11/2024 6:54 AM, Robert Finch wrote:
> 
> snip
> 
> 
>> I have found that there can be a lot of registers available if they 
>> are implemented in BRAMs. BRAMs have lots of depth compared to LUT 
>> RAMs. BRAMs have a one cycle latency but that is just part of the 
>> pipeline. In Q+ about 40k LUTs are being used just to keep track of 
>> registers (rename mappings and checkpoints).
>>
>> Given a lot of available registers I keep considering trying a VLIW 
>> design similar to the Itanium, rotating registers and all. But I have a 
>> lot invested in OoO.
>>
>>
>> Q+ has seven in-order pipeline stages before things get to the 
>> reorder buffer. 
> 
> Does each of these take a clock cycle?  If so, that seems excessive. 
> What is your cost for a mis-predicted branch?
> 
> 
> 
> 
Each stage takes one clock cycle. Unconditional branches are detected at 
the second stage and taken there, so they do not consume as many clocks. 
There are two extra stages to handle vector instructions. Those two 
stages could be removed if vectors are not needed.

Mis-predicted branches are really expensive. Handling the miss takes about 
six clocks, plus seven clocks to refill the pipeline, so the total is about 
13 clocks. It seems like it should be possible to reduce the number of 
clocks spent processing the miss, but I have not got around to it yet. 
There is a branch miss state machine that restores the checkpoint. Branches 
still need a lot of work.
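
As a rough sketch of that arithmetic, the snippet below adds the two 
figures quoted above (about six clocks of miss handling, seven clocks of 
refill) and estimates the resulting CPI overhead. The branch fraction and 
mispredict rate are illustrative assumptions only, not measurements of Q+, 
and the function names are hypothetical.

```python
# Figures taken from the post: about 6 clocks of branch-miss handling
# plus 7 clocks to refill the in-order front end.
MISS_HANDLING_CLOCKS = 6
REFILL_CLOCKS = 7


def mispredict_penalty(handling=MISS_HANDLING_CLOCKS, refill=REFILL_CLOCKS):
    """Total clocks lost per mispredicted branch."""
    return handling + refill


def cpi_overhead(branch_fraction, mispredict_rate, penalty):
    """Extra clocks per instruction caused by mispredicted branches."""
    return branch_fraction * mispredict_rate * penalty


penalty = mispredict_penalty()

# ASSUMPTION: 20% of instructions are branches and 5% of those are
# mispredicted. These are placeholder workload numbers, not Q+ data.
overhead = cpi_overhead(branch_fraction=0.20, mispredict_rate=0.05,
                        penalty=penalty)

print(f"penalty = {penalty} clocks, added CPI = {overhead:.2f}")
```

Under those assumed rates, shaving clocks off the miss handling reduces the 
overhead in direct proportion, since the penalty enters the estimate as a 
simple multiplier.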

I am not sure how well the branch prediction works. Instruction runs in 
simulation are not long enough yet to tell. Something in the AGEN/TLB/LSQ 
is not working correctly yet, leading to bad memory cycles.