Article <uscf92$12ifq$1@dont-email.me>

Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 7 Mar 2024 08:25:53 -0500
Organization: A noiseless patient Spider
Lines: 323
Message-ID: <uscf92$12ifq$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me> <us3l9r$2vtrd$1@dont-email.me>
 <CxkFN.164321$JLvf.86786@fx44.iad> <us6dvv$3kp3g$1@dont-email.me>
 <95f07d18ea021f53af50c0bf2064ccdf@www.novabbs.org>
 <us7hu4$3qpum$1@dont-email.me> <us7neb$3sv8c$1@dont-email.me>
 <us81je$3v8p5$1@dont-email.me> <us8g78$1rva$1@dont-email.me>
 <dec95c54e6adf32bdcd478f079745e86@www.novabbs.org>
 <us9ffo$argb$1@dont-email.me> <us9vc5$ffl5$1@dont-email.me>
 <usaciv$ibv9$1@dont-email.me>
 <e7e6d152876f385e78404b06eef87121@www.novabbs.org>
 <usbngu$tq9s$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 7 Mar 2024 13:25:55 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="62bf69830fd11e8a0b21b52a939d9fa6";
	logging-data="1133050"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+VkEjdZ8cNTYzzFd9DnHkO7doMip8ct/I="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:XQn+qu6eIJuIb2IEMQXn07sNbPI=
In-Reply-To: <usbngu$tq9s$1@dont-email.me>
Content-Language: en-US
Bytes: 14316

On 2024-03-07 1:39 a.m., BGB wrote:
> On 3/6/2024 7:28 PM, MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 3/6/2024 8:42 AM, Robert Finch wrote:
>>>>
>>
>>
>>> In my case, access is figured out on cache-line fetch, and is precooked:
>>>    NR, NW, NX, NU, NC: NoRead/NoWrite/NoExecute/NoUser/NoCache.
>>> Though, some combinations of these flags are special.
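
A minimal sketch of how "deny"-style flags like the ones quoted above might be tested. The bit positions and the `access_ok` helper are assumptions made for illustration, and the "special" flag combinations mentioned are not modelled.

```python
from enum import IntFlag

class Deny(IntFlag):
    """Hypothetical encoding: a set bit denies the access (bit positions assumed)."""
    NR = 1 << 0  # NoRead
    NW = 1 << 1  # NoWrite
    NX = 1 << 2  # NoExecute
    NU = 1 << 3  # NoUser
    NC = 1 << 4  # NoCache (not consulted by access_ok)

def access_ok(flags, kind, user_mode):
    """Return True if an access of `kind` ('r', 'w' or 'x') is permitted."""
    if user_mode and (flags & Deny.NU):
        return False  # page is closed to user mode
    deny_bit = {"r": Deny.NR, "w": Deny.NW, "x": Deny.NX}[kind]
    return not (flags & deny_bit)
```

One consequence of the inverted sense is that an all-zero flag field permits everything, e.g. `access_ok(Deny(0), "w", True)` is true.
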
>>
>> Is there a reason these flags (other than user) are inverted ??
>> {{And even noUser can be changed into Super.}}
>>
> 
> Historical quirk...
> 
> Off-hand, I don't remember why it is this way.
> Seems this was one of the parts I designed, but as for why the bits were 
> logically inverted, dunno.
> 
> In terms of the main page flags, they are also inverted. But, in terms 
> of VUGID and ACL checks, they are not inverted.
> 
> 
>> In addition, I think you will want to be able to specify which level of
>> cache {L1, L2, LLC} this line is stored at, prefetched to, and pushed out
>> to.
>>
> 
> Possibly, but not really a thing ATM.
> It mostly affects the L1 cache, and (indirectly) the newer 
> set-associative V$ thing.
> 
> 
>> My 66000 is using ASID instead of something like Super/Global because I
>> don't want to have to flush the TLB on a hypervisor context switch --
>> where one GuestOS's Super/Global is not the same as another GuestOS's. 
>> When a GuestOS is accessing one of its user applications, AGEN
>> automagically uses the application ASID instead of the GuestOS ASID.
>> {Similar for HV accessing GuestOS -- while switching from 1-level
>> translation to 2-level.}
>>
> 
> This is why I have "ASID Groups"...
> 
> If normal processes are in ASID Groups 00..1F, and VMs are in groups 
> 38..3F, then global pages in the normal process groups will not be 
> visible in the VM groups (avoiding the need for a TLB flush).
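
A rough sketch of the ASID-group idea described above. The field layout (16-bit ASID with the top 6 bits as the group) and the rule that a global page is visible to every ASID in the same group are assumptions for illustration; the actual grouping rule may be coarser.

```python
GROUP_SHIFT = 10   # assumption: 16-bit ASID, top 6 bits select the group
GROUP_MASK = 0x3F  # groups 0x00..0x3F

def asid_group(asid):
    """Extract the (assumed) group field from an ASID."""
    return (asid >> GROUP_SHIFT) & GROUP_MASK

def tlbe_matches(entry_asid, entry_global, current_asid):
    """A TLB entry matches on exact ASID, or, if global, on matching group."""
    if entry_asid == current_asid:
        return True
    return bool(entry_global) and asid_group(entry_asid) == asid_group(current_asid)
```

With this rule, a global entry loaded by a process in group 0x01 never matches an ASID in group 0x38, so no TLB flush is needed when switching to the VM.
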
> 
> But, yeah, I had debated whether or not to not have global pages at all.
> 
> 
>> <snip>
>>
>>> The L1 cache only hits if the current mode matches the mode that was 
>>> in effect at the time the cache-line was fetched, and if KRR has not 
>>> changed (as determined by a hash value), ...
>>
>> s/mode/ASID/
>>
> 
> Both will affect hit/miss in my case.
>    User/Supervisor/ISR;
>    What KRR contains;
>    Which ISA mode is running;
>    ASID;
>    ...
> All these may cause the L1 caches to miss.
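
The hit condition listed above can be sketched as a tag compare plus a compare of the context captured at fill time. The `Context` and `Line` types and their field names are hypothetical; only the list of inputs comes from the post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    priv: str      # "user", "super" or "isr"
    krr_hash: int  # hash of the keyring register contents
    isa_mode: int  # which ISA mode is running
    asid: int

@dataclass
class Line:
    tag: int       # address tag of the cached line
    ctx: Context   # context in effect when the line was fetched

def l1_hit(line, addr_tag, now):
    """Hit only if the address tag matches and the fill-time context still holds."""
    return line is not None and line.tag == addr_tag and line.ctx == now
```

Any change to mode, KRR hash, ISA mode, or ASID turns a would-be hit into a miss, which forces the access check to be redone on the refetch.
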
> 
> 
>>>> For my system the ACL is not part of the PTE, it is part of the 
>>>> software managed page information, along with share counts. I do not 
>>>> see the ACL for a page being different depending on the page table.
>>>>
>>
>>> In my case, ACL handling is done via a combination of keyring 
>>> register (KRR), and a small fully-associative cache (4 entries at 
>>> present, 6 could be better in theory; luckily each entry is 
>>> comparatively small).
>>
>>> The ACLID is tied to the TLBE, so the intersection of the ACLID and 
>>> KRR entry are used to figure out access in the ACL cache (or, 
>>> ignored/disabled if the low 16 bits of KRR are 0).
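
A sketch of the ACL check as described: a small fully-associative cache keyed on the (ACLID, KRR key) pair, bypassed when the low 16 bits of KRR are zero. Consulting only the low 16-bit key, the FIFO replacement, and the `AclCache`/`AclMiss` names are all assumptions for illustration.

```python
from collections import OrderedDict

class AclMiss(Exception):
    """No cached entry for the (ACLID, key) pair; software would refill here."""

class AclCache:
    def __init__(self, entries=4):
        self.entries = entries
        self.slots = OrderedDict()  # (aclid, key) -> allowed access kinds

    def load(self, aclid, key, allowed):
        """Install an entry, evicting the oldest when full (FIFO is assumed)."""
        if (aclid, key) not in self.slots and len(self.slots) >= self.entries:
            self.slots.popitem(last=False)
        self.slots[(aclid, key)] = frozenset(allowed)

    def check(self, krr, aclid, kind):
        """Return True if `kind` ('r', 'w' or 'x') is allowed for this ACLID."""
        key = krr & 0xFFFF  # assumption: only the low key is consulted
        if key == 0:
            return True     # ACL checking disabled
        try:
            return kind in self.slots[(aclid, key)]
        except KeyError:
            raise AclMiss((aclid, key)) from None
```
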
>>
>>
>>>> I have dedicated some of the block RAMs for the page management 
>>>> information, so they may be read out in parallel with a memory 
>>>> access. So shifted the block RAM usage from the TLB to the PMT. This 
>>>> makes the TLB smaller. It also reduces the memory usage. The page 
>>>> management information only needs one copy for each page of memory. 
>>>> If the information were in the TLBE / PTEs there would be multiple 
>>>> copies of the information in the page tables. How do you keep things 
>>>> coherent if there are multiple copies in page tables?
>>>>
>>
>>
>>> The access ID for pages is kept in sync with the memory address, 
>>> since both are uploaded to the TLB at the same time.
>>
>>> However, as for ACL checks themselves, these are handled with a 
>>> separate cache. So, say, changing the access to an ACLID, and 
>>> flushing the corresponding entry from the ACL cache, will 
>>> automatically apply to any pages previously loaded into the TLB.
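
The indirection described above can be illustrated with a toy model: TLB entries carry only an ACLID, so changing the ACL and flushing one cached entry affects every page that uses it. The table contents and function names are hypothetical, and the KRR side of the check is left out to keep the sketch short.

```python
tlb = {0x1000: 0x10, 0x2000: 0x10}  # virtual page -> ACLID (made-up values)
acl_table = {0x10: {"r", "w"}}      # software-managed ACLs: ACLID -> allowed kinds
acl_cache = {}                      # stands in for the small hardware ACL cache

def allowed(vpage, kind):
    """Check an access using the ACLID held in the TLB entry."""
    aclid = tlb[vpage]
    if aclid not in acl_cache:
        acl_cache[aclid] = set(acl_table[aclid])  # refill on miss
    return kind in acl_cache[aclid]

def change_acl(aclid, kinds):
    """Update the ACL and flush the stale cached entry; the TLB is untouched."""
    acl_table[aclid] = set(kinds)
    acl_cache.pop(aclid, None)
```

After `change_acl(0x10, {"r"})`, writes to both 0x1000 and 0x2000 are refused even though neither TLB entry was reloaded.
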
>>
>>> There was also the older VUGID system, which used traditional 
>>> Unix-style permissions. If I were designing it now, would likely 
>>> design things around using exclusively ACL checking, which 
>>> (ironically) also needs fewer bits to encode.
>>
>>
>>
>>> Generally, software TLB miss handling is used in my case.
>>
>>> There is no automatic way to keep the TLB in sync with the page table 
>>> (if the page table entry is modified).
>>
>> My 66000 has a coherent TLB.
>>
>>> Usual thing is that if the current page table is updated, then one 
>>> needs to forge a special dummy entry, and then upload this entry to 
>>> the TLB multiple times (via the LDTLB instruction) to knock the prior 
>>> contents out of the TLB (or use the INVTLB instruction, but this 
>>> currently invalidates the entire TLB; which is a bad situation for 
>>> software-managed TLB...).
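
A toy model of the "upload a dummy entry several times" trick, assuming a set-associative TLB with FIFO replacement per set. The geometry (4 ways, 64 sets), the replacement policy, and the use of `None` as the forged invalid entry are assumptions; `ldtlb` here only mimics the effect of the LDTLB instruction.

```python
from collections import deque

WAYS, SETS = 4, 64  # assumed geometry

class Tlb:
    def __init__(self):
        # Each set is a FIFO; a full deque drops its oldest entry on append.
        self.sets = [deque(maxlen=WAYS) for _ in range(SETS)]

    def ldtlb(self, vpn, pte):
        """Upload one entry into the set selected by the low VPN bits."""
        self.sets[vpn % SETS].append((vpn, pte))

    def lookup(self, vpn):
        """Return the PTE for vpn, or None on a miss."""
        for tag, pte in self.sets[vpn % SETS]:
            if tag == vpn and pte is not None:
                return pte
        return None

def knock_out(tlb, vpn):
    """Load a dummy entry once per way so every prior entry in the set is evicted."""
    for _ in range(WAYS):
        tlb.ldtlb(vpn, None)  # None stands in for the forged invalid entry
```

Note the collateral damage: other translations that share the set are evicted too, though this is still far cheaper than invalidating the whole TLB.
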
>>
>> See how much easier a coherent TLB is ??
>>
> 
> Possible, but generally only the kernel is going to be updating the page 
> tables, and the kernel can know that it needs to invoke a special ritual 
> whenever updating the page table to avoid stale page-table entries being 
> used...
> 
> 
> Meanwhile, like with coherent caches, coherent TLB would require some 
> sort of "spooky action at a distance" (like, somehow, the TLB needs to 
> know that memory corresponding to a particular part of the page-table 
> was updated).
> 
> This is possibly even harder to implement, than something like TSO would 
> be (since, at least with TSO, there is a more obvious correlation 
> between writing to a cache line and needing to have every other copy at 
> the same address first written back to main memory).
> 
> 
> Easier from the hardware design front to throw up one's hands and be 
> like "Yeah, the OS can deal with it somehow...".
> 
> Never mind that apparently my coherence model is even weaker than the 
> RISC-V model, as they are like "well, there is a FENCE instruction", and 
> I am left not having any good idea of how to deal with it other than 
> "trap and let the trap-handler sort it out..." (presumably by flushing 
> the L1 caches...).
> 
> Granted this is a crap solution...
> 
> Granted, it appears that "Trap and flush the L1 caches" is still a valid 
> implementation strategy for "Zifencei".
> 
> 
>>> Generally, the assumption is that all pages in a mapping will have 
>>> the same ACLID (generally corresponding to the "owner" of the mapping).
>>
>> An unsupported assumption if one wants to keep TLB flushes minimized.
>>
> 
> Possible, but this is more for the OS to care about.
> The hardware doesn't care either way.
========== REMAINDER OF ARTICLE TRUNCATED ==========