<vc078u$3i3kh$1@paganini.bofh.team>

Path: ...!feeds.phibee-telecom.net!2.eu.feeder.erje.net!feeder.erje.net!newsfeed.bofh.team!paganini.bofh.team!not-for-mail
From: Waldek Hebisch <antispam@fricas.org>
Newsgroups: comp.lang.c
Subject: Re: Top 10 most common hard skills listed on resumes...
Date: Fri, 13 Sep 2024 02:16:00 -0000 (UTC)
Organization: To protect and to server
Message-ID: <vc078u$3i3kh$1@paganini.bofh.team>
References: <vab101$3er$1@reader1.panix.com>   <vakjff$30c4f$1@raubtier-asyl.eternal-september.org> <val7d6$33e83$1@dont-email.me> <vamb0t$3btll$2@raubtier-asyl.eternal-september.org> <vamqfc$3e42u$1@dont-email.me> <20240828134956.00006aa3@yahoo.com> <van4v1$3fgjj$1@dont-email.me> <vbl591$286j0$1@paganini.bofh.team> <vbmfee$2bn2v$3@dont-email.me> <vbn15a$2fvl0$1@paganini.bofh.team> <vbn376$2empp$1@dont-email.me> <vbo23j$2hibc$1@paganini.bofh.team> <vbp31o$2sqpd$1@dont-email.me>
Injection-Date: Fri, 13 Sep 2024 02:16:00 -0000 (UTC)
Injection-Info: paganini.bofh.team; logging-data="3739281"; posting-host="WwiNTD3IIceGeoS5hCc4+A.user.paganini.bofh.team"; mail-complaints-to="usenet@bofh.team"; posting-account="9dIQLXBM7WM9KzA+yjdR4A";
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (Linux/6.1.0-9-amd64 (x86_64))
X-Notice: Filtered by postfilter v. 0.9.3
Bytes: 20635
Lines: 390

David Brown <david.brown@hesbynett.no> wrote:
> On 10/09/2024 01:58, Waldek Hebisch wrote:
>> David Brown <david.brown@hesbynett.no> wrote:
>>> On 09/09/2024 16:36, Waldek Hebisch wrote:
>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>> On 08/09/2024 23:34, Waldek Hebisch wrote:
>>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>>>
>>>>>>> And while microcontrollers sometimes have a limited form of branch
>>>>>>> prediction (such as prefetching the target from cache), the more
>>>>>>> numerous and smaller devices don't even have instruction caches.
>>>>>>> Certainly none of them have register renaming or speculative execution.
>>>>>>
>>>>>> IIUC STM4 series has cache, and some of them are not so big.  There
>>>>>> are now several Chinese variants of STM32F103 and some of them have
>>>>>> caches (some very small like 32 words, IIRC one has 8 words and it
>>>>>> is hard to decide if this is a very small cache or a big prefetch
>>>>>> buffer).
>>>>>
>>>>> There are different kinds of cache here.  Some of the Cortex-M cores
>>>>> have optional caches (i.e., the microcontroller manufacturer can choose
>>>>> to have them or not).
>>>>>
>>>>> <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>
>>>>
>>>> I do not see relevant information at that link.
>>>
>>> There is a table of the Cortex-M cores, with the sizes of the optional
>>> caches.
>>>
>>>>    
>>>>> Flash memory, flash controller peripherals, external memory interfaces
>>>>> (including things like QSPI) are all specific to the manufacturer,
>>>>> rather than part of the Cortex M cores from ARM.  Manufacturers can do
>>>>> whatever they want there.
>>>>
>>>> AFAIK typical Cortex-M design has core connected to "bus matrix".
>>>> It is up to chip vendor to decide what else is connected to bus matrix.
>>>
>>> Yes.
>>>
>>> However, there are other things connected before these crossbar
>>> switches, such as tightly-coupled memory (if any).
>> 
>> TCM is _not_ a cache.
>> 
> 
> Correct.  (I did not suggest or imply that it was.)
> 
>>>   And the cpu caches
>>> (if any) are on the cpu side of the switches.
>> 
>> Caches are attached where the system designer thinks they are useful
>> (and possible).  The word "cache" has a well-established meaning and
>> ARM (or you) has no right to redefine it.
>> 
> 
> I am using it in the manner ARM uses it when talking about ARM 
> processors and microcontroller cores.  I think that is the most relevant 
> way to use the term here.  The term "cache" has many meanings in many 
> contexts - there is no single precise "well-established" or "official" 
> meaning.

It is "well-established" that the meaning is very inclusive, like the
definition here:

<https://en.wikipedia.org/wiki/Cache_(computing)>


> Context is everything.  That is why I have been using the term 
> "cpu cache" for the cache tied tightly to the cpu itself, which comes as 
> part of the core that ARM designs and delivers, along with parts such as 
> the NVIC.

Logically, given that there was "tightly attached memory", this should
be called "tightly attached cache" :)

Logically, a "cpu cache" is a cache sitting on the path between the CPU
and a memory device.  It does not need to be tightly attached to the
CPU; L2 and L3 caches in PC-s are not "tightly attached".

>  And I have tried to use terms such as "buffer" or "flash 
> controller cache" for the memory buffers often provided as part of flash 
> controllers and memory interfaces on microcontrollers, because those are 
> terms used by the microcontroller manufacturers.

"flash cache" looks reasonable.  Concerning the difference between a
buffer and a cache, there is indeed some fuzziness here.  AFAICS the word
"buffer" is used for logically very simple devices; once operation
becomes a bit more interesting it is usually called a cache.  Anyway,
given the fuzziness, saying that something called a buffer is not a cache
is risky: it may have all the features normally associated with caches,
and in such a case it deserves to be called a cache.

>>>   Manufacturers also have a
>>> certain amount of freedom of the TCMs and caches, depending on which
>>> core they are using and which licenses they have.
>>>
>>> There is a convenient diagram here:
>>>
>>> <https://www.electronicdesign.com/technologies/embedded/digital-ics/processors/microcontrollers/article/21800516/cortex-m7-contains-configurable-tightly-coupled-memory>
>>>
>>>> For me it does not matter if it is ARM design or vendor specific.
>>>> Normal internal RAM is accessed via bus matrix, and in MCU-s that
>>>> I know about is fast enough so that cache is not needed.  So caches
>>>> come into play only for flash (and possibly external memory, but
>>>> design with external memory probably will be rather large).
>>>>
>>>
>>> Typically you see data caches on faster Cortex-M4 microcontrollers with
>>> external DRAM, and it is also standard on Cortex-M7 devices.  For the
>>> faster chips, internal SRAM on the AXI bus is not fast enough.  For
>>> example, the NXP i.mx RT106x family typically run at 528 MHz core clock,
>>> but the AXI bus and cross-switch are at 133 MHz (a quarter of the
>>> speed).  The tightly-coupled memories and the caches run at full core speed.
>> 
>> OK, if you run the core at a faster clock than the bus matrix, then a
>> cache attached on the core side makes a lot of sense.  And since the
>> cache has to compensate for lower bus speed it must be reasonably large.
> 
> Yes.
> 
>> But
>> if you look at devices where bus matrix runs at the same clock
>> as the core, then it makes sense to put cache on the other side.
> 
> No.
> 
> You put caches as close as possible to the prime user of the cache.  If 
> the prime user is the cpu and you want to cache data from flash, 
> external memory, and other sources, you put the cache tight up against 
> the cpu - then you can have dedicated, wide, fast buses to the cpu.

I would say that there is a tradeoff between cost and effect.  And
there is the question of technical possibility.  For example, the 386
was sold as a chip, and all that a system designer could do was to put
a cache on the motherboard.  An on-chip cache would be better, but was
not possible.  IIUC in the case of Cortex-M0 or, say, M4, manufacturers
get an ARM core with busses intended to be connected to the bus matrix.
Manufacturers could add an extra bus matrix or crossbar just to access
a cache, but the bus width is specified by the ARM design.  If the main
bus matrix and RAM are clocked at the CPU frequency, the extra bus
matrix and cache would only add extra latency for no gain (of course,
this changes when the main bus matrix runs at a lower clock).  So
putting a cache only at the flash interface makes sense: it helps there,
and on lower-end chips it is not needed elsewhere.  Also, concerning
caches in MCU-s, note that for writable memory there is the problem of
cache coherency.  In particular, several small MCU-s have DMA channels.
A non-coherent design would violate user expectations and would be hard
to use.  OTOH putting a coherent cache on the memory side means extra
complication in the bus matrix (I do not know what ARM did with their
bigger cores).  Flash, being mainly read-only, does not have this
problem.
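
To put rough numbers on the latency argument, here is a minimal sketch
of the usual average-access-cost formula.  The function name and all
cycle counts are assumptions chosen for illustration, not figures from
any datasheet.  It uses integer arithmetic in thousandths of a cycle to
avoid floating point.

```c
#include <assert.h>
#include <stdint.h>

/* Average cost of one memory access, in thousandths of a CPU cycle.
 *   hit_permille: assumed cache hit rate, 0..1000
 *   hit_cycles:   cycles when the cache answers
 *   miss_cycles:  cycles when the request goes on to the backing memory
 * With hit_permille == 0 this is simply the cost of the uncached path. */
static uint32_t avg_access_millicycles(uint32_t hit_permille,
                                       uint32_t hit_cycles,
                                       uint32_t miss_cycles)
{
    assert(hit_permille <= 1000u);
    return hit_permille * hit_cycles
         + (1000u - hit_permille) * miss_cycles;
}
```

With an assumed flash needing 6 cycles per access and a cache that hits
90% of the time in 1 cycle, avg_access_millicycles(900, 1, 6) gives 1500,
i.e. 1.5 cycles instead of 6.  For RAM that already answers in 1 cycle
the uncached cost is 1000; if an extra cache stage added one cycle to
every miss, avg_access_millicycles(900, 1, 2) gives 1100, which is worse
than having no cache at all.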

> But it can also make sense to put small buffers as part of memory 
> interface controllers.  These are not organized like data or instruction 
> caches, but are specific for the type of memory and the characteristics 
> of it.

The point is that in many cases they are organized like classic caches.
They cover only flash, but how is that different from caches in PC-s
that covered only part of the possible RAM?
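
As an illustration of what "organized like a classic cache" means, here
is a small software model of a direct-mapped, read-only cache that
covers only a flash address window.  It is a sketch, not a description
of any vendor's flash controller: the base address, sizes, line count
and all names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All of these values are assumptions for illustration only. */
#define FLASH_BASE  0x08000000u  /* assumed start of the flash window */
#define FLASH_SIZE  0x00010000u  /* assumed 64 KiB of flash           */
#define LINE_BYTES  16u          /* bytes per cache line              */
#define NUM_LINES   8u           /* number of lines (direct-mapped)   */

enum access { ACCESS_BYPASS, ACCESS_HIT, ACCESS_MISS };

struct flash_cache {
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    uint8_t  data[NUM_LINES][LINE_BYTES];
};

/* Unsigned wrap-around makes this false for addr < FLASH_BASE too. */
static bool in_flash(uint32_t addr)
{
    return addr - FLASH_BASE < FLASH_SIZE;
}

/* Read one byte.  `flash` simulates the flash array (FLASH_SIZE bytes).
 * Addresses outside the flash window are not cached at all; the caller
 * would send those straight to the bus, and *out is left untouched. */
static enum access flash_cache_read(struct flash_cache *c,
                                    const uint8_t *flash,
                                    uint32_t addr, uint8_t *out)
{
    assert(c && flash && out);
    if (!in_flash(addr))
        return ACCESS_BYPASS;

    uint32_t off   = addr - FLASH_BASE;
    uint32_t line  = off / LINE_BYTES;   /* line number within flash */
    uint32_t index = line % NUM_LINES;   /* which cache slot         */
    uint32_t tag   = line / NUM_LINES;   /* identifies the line      */
    enum access result = ACCESS_HIT;

    if (!c->valid[index] || c->tag[index] != tag) {
        /* Miss: fill the whole line from flash, evicting the old one. */
        memcpy(c->data[index], flash + line * LINE_BYTES, LINE_BYTES);
        c->valid[index] = true;
        c->tag[index]   = tag;
        result = ACCESS_MISS;
    }
    *out = c->data[index][off % LINE_BYTES];
    return result;
}
```

It has tags, valid bits, line fills and evictions, which are the
features normally associated with a cache; the only restriction is the
in_flash() check, much like a PC cache that covered only part of RAM.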

>  How this is done depends on details of the interface, details of 
> the internal buses, and how the manufacturer wants to implement it.  For 
> example, on one microcontroller I am using there are queues to let it 
> accept multiple flash read/write commands from the AHB bus and the IPS 
> bus, but read-ahead is controlled by the burst length of read requests 
> from the cross-switch (which in turn will come from cache line fill 
> requests from the cpu caches).  On a different microcontroller, the 
> read-ahead logic is in the flash controller itself as that chip has a 
> simpler internal bus where all read requests will be for 32 bits (it has 
> no cpu caches).  An external DRAM controller, on the other hand, will 
> have queues and buffers optimised for multiple smaller transactions and 
> be able to hold writes in queues that get lower priority than read requests.
> 
> These sorts of queues and buffers are not generally referred to as 
> "caches", because they are specialised queues and buffers.  Sometimes 
> you might have something that is in effect perhaps a two-way 
> single-entry 16 byte wide read-only cache, but using the term "cache" 
> here is often confusing.  At best it is a "flash controller cache", and 
> very distinct from a "cpu cache".

From STM32F400 reference manual:

: Instruction cache memory
:
========== REMAINDER OF ARTICLE TRUNCATED ==========