From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Sat, 30 Mar 2024 01:06:23 +0000
Organization: Rocksolid Light
Message-ID: <5140da0c7db5686c4bb9948276454914@www.novabbs.org>

Scott Lurndal wrote:
> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>>On 3/29/24 10:15 AM, Scott Lurndal wrote:
>>> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>>>> On 3/28/24 3:59 PM, MitchAlsup1 wrote:
>>[snip]
>>>> However, even for a "general purpose" processor, "word"-granular
>>>> atomic operations could justify not having all data transfers be
>>>> cache line size. (Such are rare compared with cache line loads
>>>> from memory or other caches, but a design might have narrower
>>>> connections for coherence, interrupts, etc. that could be used for
>>>> small data communication.)
>>>
>>> So long as the data transfer is cacheable, the atomics can be handled
>>> at the LLC, rather than the memory controller.
>>
>> Yes, but if the width of the on-chip network — which is what Mitch
>> was referring to in transferring a cache line in one cycle — is
>> c. 72 bytes (64 bytes for the data and 8 bytes for control
>> information), it seems that short messages would either have to be
>> grouped (increasing latency) or waste a significant fraction of
>> the potential bandwidth for that transfer. Compressed cache lines
>> would also not save bandwidth. These may not be significant
>> considerations, but this is an answer to "why define anything
>> smaller than a cache line?", i.e., seemingly reasonable
>> motivations may exist.
>
> It's not uncommon for the bus/switch/mesh -structure- to be 512 bits wide,
> which indeed will support a full cache line transfer in a single transaction;

It is not a transaction; it is a single beat of the clock. One can have
narrower bus widths and simply divide the cache line size by the bus
width to get the number of required beats.

> it also supports high-volume DMA operations (either memory to memory or
> device to memory).
> Most of the interconnect (bus, switched, or point-to-point) implementations
> have one or more overlaying protocols (including the cache coherency
> protocol) and are effectively message based, with agents posting requests
> that don't need a reply and expecting a reply for the rest.

Many older busses read PTPs and PTEs from memory sizeof( PTE ) at a
time, some of them requesting write permission so that used and
modified bits can be written back immediately. {{Which skirts the
distinction between cacheable and uncacheable in several ways.}}

> That doesn't require that every transaction over that bus
> utilize the full width of the bus.

In my wide-bus situation, the line width is used to gang up multiple
responses (from different end-points) into a single beat == message.
For example, the chip-to-chip transport can carry multiple independent SNOOP responses in a single beat (saving cycles and lowering latency).