From: Niklas Holsti <niklas.holsti@tidorum.invalid>
Newsgroups: comp.arch
Subject: Re: WCET analysis
Date: Wed, 12 Jun 2024 10:08:14 +0300
Organization: Tidorum Ltd
Message-ID: <lcsvquFjr4aU1@mid.individual.net>
In-Reply-To: <jwvsexjfasn.fsf-monnier+comp.arch@gnu.org>

On 2024-06-12 0:14, Stefan Monnier wrote:

>> Not always. If the mistakenly speculated cache-fetch /evicted/ some
>> other data from the (finite-sized) cache, and the evicted data are
>> needed later on the /true/ execution path, the mistakenly speculated
>> fetch has a /negative/ effect on performance. (This kind of "timing
>> anomaly" is very bothersome in static WCET analysis.)
>
> BTW, I heard that nowadays "static WCET analysis" is typically based
> on statistical models of the performance of various parts of the
> code. So they don't give an ironclad worst case any more (contrary to
> what you'd get back in the day where you could count cycles and there
> were no caches), but rather one that is "very unlikely" to be wrong.
>
> I remember papers doing "real WCET" by analyzing the code to detect
> those memory accesses which were guaranteed to be cache hits. Do you
> know if this is still done nowadays?

I haven't followed the WCET-analysis field closely for a few years, so
I may not be up to date. But here is my understanding, with apologies
for the long text:

Fully static, non-probabilistic WCET analysis is no longer possible,
practical, or profitable for most "big" processors because of their
complex microarchitectures, with out-of-order execution and
speculation being the most bothersome parts. The difficulty of getting
accurate information about the microarchitectures is a large factor
too, as is the cost of creating a model of such a microarchitecture
relative to the small market for a WCET tool targeting it.

For in-order processors, static analysis of cache hits and misses is
quite feasible and has a great deal of good theory behind it. It works
well for I-caches, but for D-caches it is much harder and more
pessimistic, because predicting the addresses of data accesses is
difficult and depends a lot on the nature of the application being
analyzed. Cache analysis also becomes harder in the presence of
preemption, multiple cores, and multiple cache levels. However, any
static WCET-analysis tool with ambitions to give safe WCET bounds will
certainly include static cache analysis, and can then give a safe
upper bound on the WCET, though in some cases that bound may be quite
pessimistic (over-estimated).
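To make the "guaranteed hit" analysis Stefan asked about concrete:
below is a minimal Python sketch of the classical
abstract-interpretation "must" analysis for a single fully associative
LRU set. The 4-way associativity and all the names are mine, purely
for illustration; real tools handle set-associative caches,
may-analysis, persistence analysis, and much more.

    # "Must" cache analysis sketch for ONE fully associative LRU set.
    # The abstract state maps a memory block to an UPPER BOUND on its
    # LRU age; a bound of ASSOC means "possibly not cached at all".

    ASSOC = 4

    def access(state, block):
        # Abstract update for an access to 'block'.
        old = state.get(block, ASSOC)
        new = {}
        for b, age in state.items():
            if b == block:
                continue
            # Blocks with a smaller age bound than the accessed block
            # are pushed down one position; older blocks keep theirs.
            a = age + 1 if age < old else age
            if a < ASSOC:
                new[b] = a
        new[block] = 0          # the accessed block is now youngest
        return new

    def join(s1, s2):
        # CFG merge point: a block is surely cached only if it is
        # surely cached on BOTH paths; take the worse age bound.
        return {b: max(s1[b], s2[b]) for b in s1.keys() & s2.keys()}

    def classify(state, block):
        return ("always hit" if state.get(block, ASSOC) < ASSOC
                else "unknown")

    # Two branches of an if/else both touch A but differ otherwise:
    then_side = access(access({}, "A"), "B")
    else_side = access(access({}, "A"), "C")
    merged = join(then_side, else_side)
    print(classify(merged, "A"))   # "always hit" on all paths
    print(classify(merged, "B"))   # "unknown": B is on one path only

The essential point is the join: a block is classified "always hit"
only if every path into a merge guarantees it is cached, which is why
merges and loop entries are where the analysis gets pessimistic.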
Some time ago, even before the complex microarchitectures became an
acute problem, a class of "WCET" tools was developed that relies on a
combination of three things: execution-time measurements of the basic
blocks of the code; static analysis of the control-flow graph and the
possible execution paths; and a statistical method, extreme-value
statistics, that estimates the probability of outlier values (extreme
and unlikely values) for the sum of the observed execution-time
distributions of the basic blocks. These methods do not need a
microarchitecture model of the processor, but they do need a way to
measure basic-block execution times (which is doubtful in an OoO,
speculative processor) and a "good enough" test suite. AIUI, these
tools are successful and much used. However, it is not clear that the
probability they compute for exceeding their WCET "bound" is accurate,
because the statistical theory requires the execution-time
distributions of the individual basic blocks to be independent and
uncorrelated, which is false for ordinary microarchitectures. (A toy
illustration of the extreme-value step is the first sketch at the end
of this post.)

To overcome that problem, there has been work to create "randomized"
HW components, for example randomized caches, for which one can build
a statistical model of each component such that the statistics of
different components are, by design and construction, independent and
uncorrelated. (The second sketch below shows why random replacement
makes a cache statistically tractable.) However, I don't know how far
this work has progressed, nor how completely a processor can be
randomized in this sense. A couple of years ago I saw that one
practical application of this work simply used ordinary hardware but
proposed to pick a new random memory layout (link map) for the
application every time it was loaded and started, which seems a
cop-out.

Recently a new type of "WCET" tool has appeared that does not claim to
do more than "estimate" or "explore" the execution time of an
application on a processor described only by its visible architecture
(instruction set) and some capacity and performance numbers. I don't
know exactly how these tools work, but it seems likely that they use a
simplified, in-order, non-speculative model of the processor, to which
the classical static-WCET analysis methods can then be applied. They
don't claim to provide an accurate or even probabilistic WCET bound,
but rather help the user understand which parts of the application are
likely to consume most of the time, and give some confidence that the
application will be fast enough on the real processor. This may be
what most developers want and are content with.
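First sketch, the extreme-value step: fit a Gumbel distribution, by
the method of moments, to block maxima of measured execution times and
read off a "probabilistic WCET" at a chosen exceedance probability.
The synthetic data, the block size, and the fitting method are all my
choices for the illustration, not how any particular tool works.

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    # Pretend these are 10,000 measured end-to-end times (cycles):
    times = 1000 + rng.gamma(shape=4.0, scale=25.0, size=10_000)

    # Block maxima: split the trace into blocks, keep each maximum.
    maxima = times.reshape(100, 100).max(axis=1)

    # Method-of-moments fit of Gumbel(mu, beta) to the maxima:
    #   beta = s * sqrt(6) / pi,  mu = mean - gamma * beta
    gamma = 0.5772156649        # Euler-Mascheroni constant
    beta = maxima.std(ddof=1) * math.sqrt(6) / math.pi
    mu = maxima.mean() - gamma * beta

    # Gumbel quantile at exceedance probability p (per block):
    #   x_p = mu - beta * ln(-ln(1 - p))
    for p in (1e-3, 1e-6, 1e-9):
        x = mu - beta * math.log(-math.log(1.0 - p))
        print(f"pWCET at exceedance {p:g}: {x:.0f} cycles")

    # The catch discussed above: this is trustworthy only if the
    # block maxima are independent and identically distributed, which
    # ordinary microarchitectures do not guarantee.

Note how the quantile keeps growing as the exceedance probability
shrinks; with correlated blocks the true tail can be much heavier than
the fitted Gumbel suggests.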
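Second sketch, why randomization makes the statistics tractable: with
random replacement, a cached block survives each intervening miss with
probability exactly 1 - 1/N, independent of history, so hit
probabilities compose per access. A small simulation (all parameters
invented) checks this against the analytic value:

    import random

    class RandomReplacementCache:
        # Fully associative cache that evicts a uniformly random line.
        def __init__(self, lines, seed=None):
            self.lines = lines
            self.contents = []
            self.rand = random.Random(seed)

        def access(self, block):
            if block in self.contents:
                return True                    # hit
            if len(self.contents) == self.lines:
                victim = self.rand.randrange(self.lines)
                self.contents[victim] = block  # random eviction
            else:
                self.contents.append(block)
            return False                       # miss

    # Reuse after k intervening misses, N lines:
    # P(hit) = (1 - 1/N)**k, checkable by simulation.
    N, k, trials = 8, 4, 200_000
    hits = 0
    for t in range(trials):
        c = RandomReplacementCache(N, seed=t)
        for j in range(N):
            c.access(f"fill{j}")   # warm up: cache is now full
        c.access("X")
        for i in range(k):
            c.access(f"m{i}")      # k distinct misses in between
        hits += c.access("X")
    print(hits / trials, "vs analytic", (1 - 1/N) ** k)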
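Third sketch, my guess at the core of an "explore"-style tool: cost
each basic block with a naive in-order model (a fixed latency per
instruction class plus a flat penalty for loads that might miss), then
take the longest path through a loop-free control-flow graph. The
latencies, the assumed miss rate, and the example CFG are all
invented.

    from graphlib import TopologicalSorter

    LATENCY = {"alu": 1, "mul": 3, "load": 2, "store": 2, "branch": 1}
    MISS_PENALTY = 20    # assumed flat penalty per possibly-missing load

    def block_cost(instrs, assumed_miss_rate=0.1):
        cost = sum(LATENCY[op] for op in instrs)
        loads = instrs.count("load")
        return cost + loads * assumed_miss_rate * MISS_PENALTY

    # CFG: entry -> a -> {b, c} -> exit, instruction mix per block.
    blocks = {
        "entry": [],
        "a": ["load", "alu", "alu", "branch"],
        "b": ["mul", "mul", "store"],
        "c": ["alu", "store"],
        "exit": [],
    }
    preds = {"entry": [], "a": ["entry"], "b": ["a"], "c": ["a"],
             "exit": ["b", "c"]}

    # Longest path in topological order = whole-CFG estimate.
    longest = {}
    for n in TopologicalSorter(preds).static_order():
        longest[n] = block_cost(blocks[n]) + max(
            (longest[p] for p in preds[n]), default=0.0)

    print(f"estimated worst-case path cost: {longest['exit']:.1f} cycles")

A real tool would at least bound or unroll loops and use a better
memory model, but note that nothing here needs a pipeline model of the
microarchitecture, which fits the "described only by its visible
architecture" positioning of these tools.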