From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Performance monitoring
Date: Tue, 26 Mar 2024 18:47:38 +0000
Organization: Rocksolid Light
Message-ID: <0d50af01e9217c15ecb945e0b643b597@www.novabbs.org>
References: <2024Mar25.193535@mips.complang.tuwien.ac.at> <2024Mar26.102754@mips.complang.tuwien.ac.at> <2024Mar26.174702@mips.complang.tuwien.ac.at>

Anton Ertl wrote:

> scott@slp53.sl.home (Scott Lurndal) writes:
>>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>>scott@slp53.sl.home (Scott Lurndal) writes:
>>>>The biggest demand is from the OS vendors. Hardware folks have
>>>>simulation and emulators.
>>>
>>>You don't want to use a full-blown microarchitectural emulator for a
>>>long-running program.
>>
>>Generally hardware folks don't run 'long-running programs' when
>>analyzing performance, they use the emulator for determining latencies,
>>bandwidths and efficacy of cache coherency algorithms and
>>cache prefetchers.
>>
>>Their target is not application analysis.

> This sounds like hardware folks that are only concerned with
> memory-bound programs.

> I OTOH expect that designers of out-of-order (and in-order) cores
> analyse the performance of various programs to find out where the
> bottlenecks of their microarchitectures are in benchmarks and
> applications that people look at to determine which CPU to buy. And
> that's why we not only just have PMCs for memory accesses, but also
> for branch prediction accuracy, functional unit utilization, scheduler
> utilization, etc.

Quit being so CPU-centric.

You also need measurements of how many of which transactions flew across
the bus, DRAM usage analysis, and PCIe usage to fully tune the system.

> - anton