From: "Paul A. Clayton" <paaronclayton@gmail.com>
Newsgroups: comp.arch
Subject: Re: "Mini" tags to reduce the number of op codes
Date: Sat, 20 Apr 2024 20:02:07 -0400
Organization: A noiseless patient Spider
Message-ID: <v038qp$bmtm$4@dont-email.me>
References: <uuk100$inj$1@dont-email.me>
 <15d1f26c4545f1dbae450b28e96e79bd@www.novabbs.org>
 <lf441jt9i2lv7olvnm9t7bml2ib19eh552@4ax.com>
 <uuv1ir$30htt$1@dont-email.me>
 <d71c59a1e0342d0d01f8ce7c0f449f9b@www.novabbs.org>
 <uv02dn$3b6ik$1@dont-email.me> <uv415n$ck2j$1@dont-email.me>
 <uv46rg$e4nb$1@dont-email.me>
 <a81256dbd4f121a9345b151b1280162f@www.novabbs.org>
 <uv4ghh$gfsv$1@dont-email.me>
 <8e61b7c856aff15374ab3cc55956be9d@www.novabbs.org>
 <uv7h9k$1ek3q$1@dont-email.me> <7uSRN.161295$m4d.65414@fx43.iad>
In-Reply-To: <7uSRN.161295$m4d.65414@fx43.iad>

On 4/11/24 10:30 AM, Scott Lurndal wrote:
[snip]
>> On 4/9/24 8:28 PM, MitchAlsup1 wrote:
[snip]
>>> MMs and MSs that do not cross page boundaries are ATOMIC. The
>>> entire system sees only the before or only the after state and
>>> nothing in between.
>
> One might wonder how that atomicity is guaranteed in a
> SMP processor...

While Mitch Alsup's response ("The entire chunk of data traverses
the interconnect as a single transaction.") provides one mechanism,
and probably the best one, I am not certain how that would work
given reading up to a page and writing up to a page. Theoretically,
the *data* does not need to be moved atomically, only the
"ownership" (the source does not have to be owned in the
traditional sense but needs to be marked as readable by the
copier). This is somewhat similar to My 66000's Exotic
Synchronization Mechanism in that once all the addresses involved
are known (the two ranges for a memory copy), NAKs can be used for
remote requests for "owned" cache lines while the copy is made.
Only the visibility needs to be atomic.

Memory set provides optimization opportunities in that the source
is small. In theory, the set value could be sent to L3 with the
destination range, all monitoring could be done at L3, and a
requested cache line could be sent immediately from L3 (hardware
copy on access). The first and last parts of the range might be
partial cache lines requiring read-for-ownership.

For cache-line-aligned copies, a cache which used indirection
between tags and data might not even copy the data but only the
tag-related metadata. Some forms of cache compression might allow
partial cache lines to be cached such that even unaligned copies
might partially share data by having one tag indicate lossy
compression with an indication of where the stored data is not
valid, but that seems too funky to be practical.
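
To make the "only the visibility needs to be atomic" point concrete,
here is a toy software model, not a description of My 66000 or of
any real coherence protocol. It assumes a 64-byte line, a single
copy in flight, and non-overlapping ranges; ToyMemory, remote_read,
remote_write and copy_steps are names invented for this sketch.
The copy moves one line per step, yet a remote observer is NAKed
(modeled as None/False) on the destination until the whole copy
is released.

```python
LINE = 64  # assumed cache-line size in bytes (illustrative only)


class ToyMemory:
    """Toy model: data moves line by line, but visibility changes all at once."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dst_lines = set()  # destination lines of the in-flight copy: all remote requests NAKed
        self.src_lines = set()  # source lines of the in-flight copy: remote reads OK, writes NAKed

    @staticmethod
    def lines(addr, length):
        """Cache-line numbers touched by the byte range [addr, addr + length)."""
        return set(range(addr // LINE, (addr + length - 1) // LINE + 1))

    def remote_read(self, addr):
        """Return the byte at addr, or None to model a NAK (requester retries later)."""
        if addr // LINE in self.dst_lines:
            return None
        return self.data[addr]

    def remote_write(self, addr, value):
        """Store value and return True, or return False to model a NAK."""
        line = addr // LINE
        if line in self.dst_lines or line in self.src_lines:
            return False
        self.data[addr] = value
        return True

    def copy_steps(self, dst, src, length):
        """Generator: claim both ranges, move one line-sized piece per step, then release."""
        self.dst_lines = self.lines(dst, length)
        self.src_lines = self.lines(src, length)
        for off in range(0, length, LINE):
            n = min(LINE, length - off)
            self.data[dst + off:dst + off + n] = self.data[src + off:src + off + n]
            yield off  # a remote request may arrive here, mid-copy
        self.dst_lines = set()  # release: the whole result becomes visible together
        self.src_lines = set()


def demo():
    """Interleave remote requests with a two-line copy and record what they see."""
    mem = ToyMemory(8 * LINE)
    src, dst, length = 0, 4 * LINE, 2 * LINE
    mem.data[src:src + length] = b"\xab" * length
    during = []
    for _ in mem.copy_steps(dst, src, length):
        during.append((mem.remote_read(dst),
                       mem.remote_read(dst + length - 1),
                       mem.remote_write(src, 0)))
    after = (mem.remote_read(dst), mem.remote_read(dst + length - 1))
    return during, after
```

Calling demo() interleaves remote reads of the destination and a
remote write to the source with each step of the copy. In this
model every mid-copy request is NAKed, so no observer can see a
half-written destination or disturb the source, and the first
successful read after release sees the complete result. Real
hardware would also need to bound how long requesters are held
off, which this sketch ignores.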