From: David Brown
Newsgroups: comp.lang.c
Subject: Re: Top 10 most common hard skills listed on resumes...
Date: Fri, 13 Sep 2024 16:25:44 +0200
Organization: A noiseless patient Spider
References: <20240828134956.00006aa3@yahoo.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
Content-Language: en-GB

On 13/09/2024 04:16, Waldek Hebisch wrote:
> David Brown wrote:
>> On 10/09/2024 01:58, Waldek Hebisch wrote:
>>> David Brown wrote:
>>>> On 09/09/2024 16:36, Waldek Hebisch wrote:

(I'm snipping bits, because these posts are getting a bit long!)

>> Context is everything.  That is why I have been using the term
>> "cpu cache" for the cache tied tightly to the cpu itself, which comes
>> as part of the core that ARM designs and delivers, along with parts
>> such as the NVIC.
>
> Logically, given that there was "tightly attached memory", this should
> be called "tightly attached cache" :)
>
> Logically, a "cpu cache" is a cache sitting on the path between the CPU
> and a memory device.  It does not need to be tightly attached to the
> CPU; the L2 and L3 caches in PCs are not "tightly attached".
>

That is all true.  But in the case of ARM Cortex-M microcontrollers, the
cpu cache is part of the "black box" delivered by ARM.
Manufacturers get some choices when they order the box, including some
influence over the cache sizes, but it is very much integrated into the
core complex (along with the NVIC and a number of other parts).  It is
completely irrelevant that on a Pentium PC the cache was a separate
chip, or that on a PowerPC microcontroller the interrupt controller is
made by the microcontroller manufacturer rather than by the cpu core
designers.

On microcontrollers built around ARM Cortex-M cores, ARM provides the
cpu core, the cpu caches (depending on the core model and the options
chosen), the NVIC interrupt controller, the MPU, and a few other bits
and pieces.  The caches are called "cpu caches" - "cpu data cache" and
"cpu instruction cache" - because they are attached to the cpu.  The
microcontroller manufacturer can put whatever else they like on the
chip.

I don't disagree that other types of buffer can fall under a generic
concept of "cache".  And in some cases they may even have the same
logical build-up as a tiny and limited version of the cpu caches.  But
I don't think it helps to use exactly the same terms for things that
are in significantly different places on the chip, with very different
balances in their designs, and with different effects in their detailed
working.

It is fair enough to talk about a "flash cache" for buffers that are
designed somewhat like a cache, with at least two entries indexed by an
address hash (usually just some of the lower address bits), and with
tags holding the rest of the address bits.  It is misleading for
systems that just have a read-ahead buffer or two, or a queue system.
Unlike the cpu caches, the architecture of such flash accelerators
varies wildly across manufacturers and their different microcontroller
models.
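To make the "indexed by low address bits, tagged by the rest" structure concrete, here is a minimal sketch in C.  It is purely my own illustration, not any vendor's accelerator design - the line size and entry count are made-up values:

```c
#include <stdint.h>
#include <string.h>

#define LINE_SIZE   16u                 /* bytes per line (made up)      */
#define NUM_LINES   4u                  /* accelerator entries (made up) */

typedef struct {
    uint32_t tag;                       /* upper address bits            */
    int      valid;
    uint8_t  data[LINE_SIZE];
} flash_line_t;

static flash_line_t lines[NUM_LINES];
static unsigned hits, misses;

/* Backing "flash" - here just an array standing in for the device. */
static uint8_t flash_mem[1024];

uint8_t flash_read(uint32_t addr)
{
    uint32_t line_base = addr & ~(LINE_SIZE - 1u);
    uint32_t index = (addr / LINE_SIZE) % NUM_LINES;  /* low address bits */
    uint32_t tag   = addr / (LINE_SIZE * NUM_LINES);  /* remaining bits   */
    flash_line_t *l = &lines[index];

    if (l->valid && l->tag == tag) {
        hits++;                          /* line already buffered         */
    } else {
        misses++;                        /* fetch a whole line from flash */
        memcpy(l->data, &flash_mem[line_base], LINE_SIZE);
        l->tag = tag;
        l->valid = 1;
    }
    return l->data[addr % LINE_SIZE];
}
```

The point of the structure is that sequential code fetches within a line hit after the first miss, while two entries (or more) avoid thrashing when code and constant data interleave - exactly what distinguishes this from a single read-ahead buffer.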
>> And I have tried to use terms such as "buffer" or "flash
>> controller cache" for the memory buffers often provided as part of
>> flash controllers and memory interfaces on microcontrollers, because
>> those are the terms used by the microcontroller manufacturers.
>
> "flash cache" looks reasonable.  Concerning the difference between a
> buffer and a cache, there is indeed some fuzziness here.  AFAICS the
> word "buffer" is used for logically very simple devices; once the
> operation becomes a bit more interesting, it is usually called a
> cache.  Anyway, given the fuzziness, saying that something called a
> buffer is not a cache is risky - it may have all the features normally
> associated with caches, and in such a case it deserves to be called a
> cache.
>

OK, but it is not a "cpu cache".

>>> But if you look at devices where the bus matrix runs at the same
>>> clock as the core, then it makes sense to put the cache on the other
>>> side.
>>
>> No.
>>
>> You put caches as close as possible to the prime user of the cache.
>> If the prime user is the cpu and you want to cache data from flash,
>> external memory, and other sources, you put the cache tight up
>> against the cpu - then you can have dedicated, wide, fast buses to
>> the cpu.
>
> I would say that there is a tradeoff between cost and effect.  And
> there is the question of technical possibility.  For example, the 386
> was sold as a chip, and all that a system designer could do was to put
> a cache on the motherboard.  An on-chip cache would be better, but was
> not possible.

There can certainly be such trade-offs.  I don't remember the details
of the 386, but I /think/ the cache was connected separately on a
dedicated bus, rather than on the bus that went to the memory
controller (which was also off-chip, in the chipset).  So it was
logically close to the cpu even though it was physically on a
different chip.

I think if these sorts of details are of interest, a thread in
comp.arch might make more sense than comp.lang.c.
> IIUC, in the case of the Cortex-M0 or, say, the M4, manufacturers get
> an ARM core with buses intended to be connected to the bus matrix.

Yes.

> Manufacturers could add an extra bus matrix or crossbar just to access
> a cache, but the bus width is specified by the ARM design.

I believe the bus standard is from ARM, but the implementation is by
the manufacturers (unlike the cpu core and the immediately surrounding
parts, including the cpu caches for devices that support them).

> If the main bus matrix and RAM are clocked at the CPU frequency, the
> extra bus matrix and cache would only add extra latency for no gain
> (of course, this changes when the main bus matrix runs at a lower
> clock).

Correct, at least for static RAM (DRAM can have more latency than a
cache even if it is at the same base frequency).  cpu caches are useful
when the onboard ram is slower than the cpu, and particularly when
slower memories such as flash or external ram are used.

> So putting a cache only at the flash interface makes sense: it helps
> there, and on lower end chips it is not needed elsewhere.

Yes.

> Also, concerning caches in MCUs, note that for writable memory there
> is the problem of cache coherency.  In particular, several small MCUs
> have DMA channels.  A non-coherent design would violate user
> expectations and would be hard to use.

That is correct.  There are three main solutions to this in any system
with caches.  One is to have cache snooping for the DMA controller, so
that the cpu and the DMA have the same picture of real memory.  Another
is to mark some parts of the ram as uncached (this is usually
controlled by the MMU), so that memory accessed by the DMA is never in
the cache.  And the third method is to use cache flush and invalidate
instructions appropriately, so that the software makes sure it has
up-to-date data.  I've seen all three - and on some microcontrollers I
have seen a mixture in use.  Obviously they have their advantages and
disadvantages in terms of hardware or software complexity.
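The third method - explicit invalidation - is easy to model on a host.  The toy C sketch below (my own model, not real hardware) has a cpu that reads through a cached copy of a buffer while "DMA" writes straight to the underlying memory; until software invalidates the cached copy, the cpu keeps seeing stale data:

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8

static uint8_t memory[BUF_SIZE];        /* the real RAM                 */
static uint8_t cache[BUF_SIZE];         /* cpu-side cached copy         */
static int     cache_valid;

uint8_t cpu_read(unsigned i)
{
    if (!cache_valid) {                 /* miss: fill from RAM          */
        memcpy(cache, memory, BUF_SIZE);
        cache_valid = 1;
    }
    return cache[i];                    /* hit: served from the cache   */
}

void dma_write(unsigned i, uint8_t v)
{
    memory[i] = v;                      /* DMA bypasses the cpu cache   */
}

void cache_invalidate(void)
{
    cache_valid = 0;                    /* next cpu read refetches RAM  */
}
```

This is why real drivers on cached microcontrollers bracket a DMA receive with an invalidate of the buffer's address range (and a DMA transmit with a clean), or simply place DMA buffers in an uncached region.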
> OTOH, putting a coherent cache on the memory side means extra
> complication in the bus matrix (I do not know what ARM did with their
> bigger cores).  Flash, being mainly read-only, does not have this
> problem.
>

Flash still has such issues during updates.  I've seen badly made
systems where things like the flash status register got cached.
Needless to say, that did not work well!  And if you have a bigger
instruction cache, you have to take care to flush things appropriately
during software updates.

>> But it can also make sense to put small buffers as part of memory
>> interface controllers.  These are not organized like data or
>> instruction caches, but are specific to the type of memory and its
>> characteristics.
>
> The point is that in many cases they are organized like classic
> caches.  They cover only flash, but how is that different from caches
> in PCs that covered only part of the possible RAM?
>

The main differences are the dimensions of the caches, their physical
and logical location, and the purpose for which they are optimised.

>> How this is done depends on details of the interface, details of
>> the internal buses, and how the manufacturer wants to implement it.  For

========== REMAINDER OF ARTICLE TRUNCATED ==========