From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: On my AMD FX-8370 I don't benefit from a compact code area.
Date: Thu, 27 Feb 2025 18:18:46 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Feb27.191846@mips.complang.tuwien.ac.at>
References: <nnd$5ebf0a90$7fda5189@6e06ea9bf1a470ef>

albert@spenarnc.xs4all.nl writes:
>I test lina64 on my AMD FX-8370 8 core 4 Ghz.
>
>The genuine Byte benchmark sieve takes 1.5 ms on my unmodified lina.
>That is a indirect threaded Forth with no optimisation and all the
>machine code scattered throughout the dictionary.
>
>I build a version where there is actually a code segment and all code is
>collected there. There was no significant difference in speed.
>
>All the code of the Forth fits comfortable in the L1 cache.
>Is this to be expected?
>An L1 cache hit is an L1 cache hit?

Not at all.  Since the Pentium and the K5 (I think), there have been
separate instruction and data caches (and later uop caches, which can
be seen as a kind of instruction cache).  However, apart from the
early CPUs (Pentium, K6, and probably K5), the same cache line
(typically 64 bytes these days) can reside in both the I-cache and
the D-cache, as long as that line is not written to.  So if your
complete Forth system, including the primitives and the sieve
program, fits into the D-cache and into the I-cache, and you have no
writes close to code, you will indeed see only compulsory misses.
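To make "writes close to code" a bit more concrete, here is a minimal
sketch in C of a check that a Forth implementor could run over a
dictionary layout.  It assumes 64-byte cache lines and treats "close"
as "touching the same cache line"; the real criterion depends on the
microarchitecture and may cover a larger region, so take this as an
illustration only.  The names CACHE_LINE_SIZE, cache_line_of and
write_shares_line_with_code are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (typical today, but not universal). */
#define CACHE_LINE_SIZE 64u

/* Index of the cache line that contains address addr. */
uintptr_t cache_line_of(uintptr_t addr)
{
    return addr / CACHE_LINE_SIZE;
}

/* True if the written data range [data, data+data_len) touches any
   cache line that also holds bytes of the code range
   [code, code+code_len).  Both lengths must be nonzero. */
bool write_shares_line_with_code(uintptr_t code, size_t code_len,
                                 uintptr_t data, size_t data_len)
{
    assert(code_len > 0 && data_len > 0);
    uintptr_t code_first = cache_line_of(code);
    uintptr_t code_last  = cache_line_of(code + code_len - 1);
    uintptr_t data_first = cache_line_of(data);
    uintptr_t data_last  = cache_line_of(data + data_len - 1);
    /* Two ranges of line indices overlap iff each starts no later
       than the other ends. */
    return data_first <= code_last && code_first <= data_last;
}
```

For example, a 16-byte primitive at 0x1000 and a variable at 0x1010
share a line under these assumptions, while the same variable at
0x1040 does not.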
I have posted here about the performance pitfalls of keeping code
close to data since 1995, and Forth system implementors typically
have taken measures only when I presented benchmark results where
their system looks bad.  But they usually did only the minimum
necessary for that particular benchmark, so over the years the issue
has come up again and again.

One interesting aspect is that small benchmarks like the sieve are
often not affected, but larger application benchmarks are.  E.g., in
my recent work [ertl24] all the small benchmarks are unaffected by
the problem, whereas several of the larger benchmarks were affected
in SwiftForth-4.0.0-RC87 and saw significant speedups from a fix in
RC89.

So I applaud that you have done the right thing and completely
separated code from data.  You may not see a benefit on Sieve, but
there may be a difference in a different program (and you may not
even notice until you measure both variants).

@InProceedings{ertl24,
  author =       {M. Anton Ertl},
  title =        {How to Implement Words (Efficiently)},
  crossref =     {euroforth24},
  pages =        {43--52},
  url =          {http://www.euroforth.org/ef24/papers/ertl.pdf},
  url-slides =   {http://www.euroforth.org/ef24/papers/ertl-slides.pdf},
  video =        {https://www.youtube.com/watch?v=bAq4760h5ZQ},
  OPTnote =      {not refereed},
  abstract =     {The implementation of Forth words has to satisfy the
                  following requirements: 1) A word must be
                  represented by a single cell (for \code{execute}).
                  2) A word may represent a combination of code and
                  data (for, e.g., \code{does>}).  In addition, on
                  some hardware, keeping executed native code and
                  (written) data close together results in slowness
                  and therefore should be avoided; moreover, failing
                  to pair up calls with returns results in (slow)
                  branch mispredictions.  The present work describes
                  how various Forth systems over the decades have
                  satisfied the requirements, and how many systems
                  run into performance pitfalls in various
                  situations.  This paper also discusses how to avoid
                  this slowness, including in native-code systems.}
}

@Proceedings{euroforth24,
  title =        {40th EuroForth Conference},
  booktitle =    {40th EuroForth Conference},
  year =         {2024},
  key =          {EuroForth'24},
  url =          {http://www.euroforth.org/ef24/papers/proceedings.pdf}
}

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/