From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Critical word first
Date: Tue, 11 Jun 2024 22:50:36 +0000
Organization: Rocksolid Light
Message-ID: <76c63bc14c6e8c2208504b5b87548b9c@www.novabbs.org>

Stefan Monnier wrote:

>>> [...] is performed which *completely bypasses* the cache; [...]
>> Yes, critical word first.
> How important is this nowadays?

With interconnect widths of ½ to 1 cache line per cycle, its utility
has been significantly reduced.

> From what I can tell, the bandwidth-time of transferring a whole cache
> line is extremely short compared to the latency to the first word, so
> I assumed that whether the critical word is send first or not wouldn't
> make much of a difference.

When one has 64 cores on a die, the interconnect BW basically has to be
about 1 cache line per cycle. If each core has a 1% miss rate in the L2
cache, then every 100 cycles we have to service 64 or more misses. So,
you need a BW that big. Then, once you have to mux out any given word
anyway, you might just as well hold the line and service outstanding
requests from the buffer.
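A rough sketch of the arithmetic above, not a model of any real interconnect. It reads "1% miss rate" as one L2 miss per core per 100 cycles, which is how the 64-misses-per-100-cycles figure works out. The 64-byte line size and the function names are assumptions made for illustration; they are not from the post.

```python
def misses_per_cycle(cores: int, miss_interval_cycles: int) -> float:
    """Aggregate line requests per cycle if every core misses once per interval."""
    return cores / miss_interval_cycles


def beats_per_line(line_bytes: int, link_bytes_per_cycle: int) -> int:
    """Cycles (beats) needed to move one cache line across the link."""
    return -(-line_bytes // link_bytes_per_cycle)  # ceiling division


def worst_case_cwf_saving(line_bytes: int, link_bytes_per_cycle: int) -> int:
    """Cycles saved by critical word first when the needed word would
    otherwise arrive in the last beat of the transfer."""
    return beats_per_line(line_bytes, link_bytes_per_cycle) - 1


if __name__ == "__main__":
    # 64 cores, one miss per core per 100 cycles (figures from the post).
    print("lines/cycle needed:", misses_per_cycle(64, 100))

    # Assumed 64-byte line; link widths from 1/8 line to a full line per cycle.
    for width in (8, 16, 32, 64):
        print(width, "B/cycle:",
              beats_per_line(64, width), "beats,",
              worst_case_cwf_saving(64, width), "cycles saved at most")
```

Under these assumptions the demand comes to 0.64 lines per cycle, and the most CWF can save shrinks from 7 cycles on an 8-byte link to 1 cycle at half a line per cycle and 0 at a full line per cycle.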
The more CPUs per die and the greater the instruction width per cycle,
the greater the BW of the cores<->memory interconnect. When the width
is lower, CWF is more important.

> Is my intuition wrong?
> Or is it simply that the difference is small but the cost is
> even smaller?

When you can put 100 GBOoO cores on a chip, you should be able to feed
them at appropriate rates.

> Stefan