From: Waldek Hebisch
Newsgroups: comp.lang.c
Subject: Re: Top 10 most common hard skills listed on resumes...
Date: Mon, 9 Sep 2024 14:36:28 -0000 (UTC)
Organization: To protect and to server
References: <20240828134956.00006aa3@yahoo.com>

David Brown wrote:
> On 08/09/2024 23:34, Waldek Hebisch wrote:
>> David Brown wrote:
>>>
>>> And while microcontrollers sometimes have a limited form of branch
>>> prediction (such as prefetching the target from cache), the more
>>> numerous and smaller devices don't even have instruction caches.
>>> Certainly none of them have register renaming or speculative execution.
>>
>> IIUC STM4 series has cache, and some of them are not so big.  There
>> are now several chinese variants of STM32F103 and some of them have
>> caches (some very small like 32 words, IIRC one has 8 words and it
>> is hard to decide if this very small cache or big prefetch buffer).
>
> There are different kinds of cache here.  Some of the Cortex-M cores
> have optional caches (i.e., the microcontroller manufacturer can choose
> to have them or not).
>
> I do not see relevant information at that link.
>
> Flash memory, flash controller peripherals, external memory interfaces
> (including things like QSPI) are all specific to the manufacturer,
> rather than part of the Cortex M cores from ARM.  Manufacturers can do
> whatever they want there.

AFAIK a typical Cortex-M design has the core connected to a "bus matrix".
It is up to the chip vendor to decide what else is connected to the bus
matrix.  For me it does not matter if it is an ARM design or vendor
specific.  Normal internal RAM is accessed via the bus matrix, and in
the MCUs that I know about it is fast enough that a cache is not needed.
So caches come into play only for flash (and possibly external memory,
but a design with external memory will probably be rather large).

It seems that vendors do not like to say that they use a cache; instead
they use misleading terms like "flash accelerator".

> So a "cache" of 32 words is going to be part of the flash interface, not
> a cpu cache

Well, caches never were part of the CPU proper, they were part of the
memory interface.  They could act for the whole memory or only for the
part that needs it (like flash).  So I do not understand what "not a cpu
cache" is supposed to mean.  More relevant is whether such a thing acts
as a cache: a 32-word thing almost surely will act as a cache, an 8-word
thing may be a simple FIFO buffer (or may act smarter, showing behaviour
typical of caches).

> (which are typically 16KB - 64KB,

I wonder where you found this figure.  Such a size is typical for systems
bigger than MCUs.  It could be useful for MCUs with flash on a separate
die, but with flash on the same die as the CPU a much smaller cache is
adequate.

> and only found on bigger
> microcontrollers with speeds of perhaps 120 MHz or above).  And yes, it
> is often fair to call these flash caches "prefetch buffers" or
> read-ahead buffers.

Typical code has enough branches that simple read-ahead beyond 8 words
is unlikely to give good results.  OTOH delivering things that were
accessed in the past and are still present in the cache gives good
results even with very small caches.

> (You also sometimes see small caches for external
> ram or dram interfaces.)
>
>
>> A notable example is MH32F103.  Base model officially has 64kB RAM and
>> 256KB flash.  AFAIK this flash is rather slow SPI flash.
>> It also has 16kB cache which probably is 4-way set associative with few
>> extra lines (probably 4) to increase apparent associativity.
>> I write probably because this is result of reasoning based on
>> several time measurements.  If you hit cache it runs nicely at
>> 216 MHz.  Cache miss costs around 100 clocks (varies depending on
>> exact setting of timing parameters and form of access).
>>
>> Similar technology seems to be popular among Chinese chip makers,
>> especially for "bigger" chips.  But IIUC GD use is for chips
>> of size of STM32F103C8T6.
>>
>

-- 
Waldek Hebisch