Path: ...!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>
Newsgroups: comp.lang.c++
Subject: Re: smrproxy v2
Date: Mon, 28 Oct 2024 21:38:09 -0700
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <vfporh$1dqvu$2@dont-email.me>
References: <vequrc$2o7qc$1@dont-email.me> <verr04$2stfq$1@dont-email.me>
 <verubk$2t9bs$1@dont-email.me> <ves78h$2ugvm$2@dont-email.me>
 <vetj1f$39iuv$1@dont-email.me> <vfh4dh$3bnuq$1@dont-email.me>
 <vfh7mg$3c2hs$1@dont-email.me> <vfm4iq$ill4$1@dont-email.me>
 <vfmesn$k6mn$1@dont-email.me> <vfmf21$kavl$1@dont-email.me>
 <vfmm9a$lob3$1@dont-email.me> <vfn2di$r8ca$1@dont-email.me>
 <vfntgb$vete$1@dont-email.me> <vfp1c3$16d9f$1@dont-email.me>
 <vfpd43$186t4$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 29 Oct 2024 05:38:10 +0100 (CET)
Injection-Info: dont-email.me; posting-host="09272f6de4c78f3ebf892cfc59a706c2";
 logging-data="1502206"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1+mU+qxj4+sOd5NQOGuR86ngqy9oujXUPY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:+XF6uOU2gSOnSo981cJfTOrW8rM=
Content-Language: en-US
In-Reply-To: <vfpd43$186t4$1@dont-email.me>
Bytes: 4624

On 10/28/2024 6:17 PM, jseigh wrote:
> On 10/28/24 17:57, Chris M. Thomasson wrote:
>> On 10/28/2024 4:45 AM, jseigh wrote:
>>> On 10/28/24 00:02, Chris M. Thomasson wrote:
>>>> On 10/27/2024 5:35 PM, jseigh wrote:
>>>>> On 10/27/24 18:32, Chris M. Thomasson wrote:
>>>>>
>>>>> The membar version? That's a store/load membar so it is expensive.
>>>>
>>>> I was wondering in your c++ version if you had to use any seq_cst
>>>> barriers. I think acquire/release should be good enough. Now, when I
>>>> say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
>>>> things like that.
>>>>
>>>> I take it that your pure C++ version has no atomic RMW, right? Just
>>>> loads and stores?
>>>
>>> While a lock action has acquire memory order semantics, if the
>>> implementation has internal stores, you have to ensure those stores
>>> are complete before any access from the critical section.
>>> So you may need a store/load memory barrier.
>>
>> Wrt acquiring a lock the only class of mutex logic that comes to mind
>> that requires an explicit storeload style membar is Peterson's, and
>> some others along those lines, so to speak. This is for the store and
>> load version. Now, RMW on x86 basically implies a StoreLoad wrt the
>> LOCK prefix, XCHG aside for it has an implied LOCK prefix. For
>> instance the original SMR algo requires a storeload as is on x86/x64.
>> MFENCE or LOCK prefix.
>>
>> Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
>> fetch-add. It needs explicit membars (no #StoreLoad) on SPARC in
>> RMO mode. On x86, the LOCK prefix handles that wrt the RMW's
>> themselves. This is a lot different than using stores and loads. The
>> original SMR and Peterson's algo need that "store followed by a load
>> to a different location" action to hold true, aka, storeload...
>>
>> Now, I don't think that a data-dependent load can act like a
>> storeload. I thought that they act sort of like an acquire, aka
>> #LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors
>> data-dependencies. Now, the DEC Alpha is a different story... ;^)
>>
>
> fwiw, here's the lock and unlock logic from smrproxy rewrite
>
>     inline void lock()
>     {
>         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
>         _ref_epoch.store(_epoch, std::memory_order_relaxed);
>         std::atomic_signal_fence(std::memory_order_acquire);
>     }
>
>     inline void unlock()
>     {
>         _ref_epoch.store(0, std::memory_order_release);
>     }
>
> epoch_t is interesting. It's uint64_t but handles wrapped
> compares, ie. for an epoch_t x1 and uint64_t n

Only your single polling thread can mutate the shadow_epoch, right?

>
>     x1 < (x1 + n)
>
> for any value of x1 and any value of n from 0 to 2**63;
> eg.
>     0xfffffffffffffff0 < 0x0000000000000001
>
>
> The rewrite is almost complete except for some thread_local
> stuff. I think I might break off there. Most of the
> additional work is writing the test code. I'm considering
> rewriting it in Rust.
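
Fwiw, the wrapped compare quoted above can be sketched in a few lines. This is only a guess at the idea, not the smrproxy code: wrapped_epoch and its member value are hypothetical names, and the real epoch_t may well be built differently. The sketch relies on unsigned subtraction wrapping mod 2**64, so it needs no signed cast.

```cpp
#include <cstdint>

// Hypothetical stand-in for the epoch_t described above; not taken from smrproxy.
struct wrapped_epoch {
    std::uint64_t value;
};

// Wrapped "less than": a precedes b when the modular distance (b - a)
// lies in 1..2^63. Unsigned subtraction wraps mod 2^64, so (a - b) has
// its top bit set exactly in that case.
constexpr bool operator<(wrapped_epoch a, wrapped_epoch b) {
    return ((a.value - b.value) >> 63) != 0;
}

// Compile-time checks, including the example from the quoted text.
static_assert(wrapped_epoch{0xfffffffffffffff0ULL} < wrapped_epoch{0x0000000000000001ULL}, "");
static_assert(!(wrapped_epoch{1} < wrapped_epoch{0xfffffffffffffff0ULL}), "");
static_assert(!(wrapped_epoch{5} < wrapped_epoch{5}), "");
```

With this definition the strict compare holds for n from 1 through 2**63; n = 0 compares equal, so it is not less. At a distance of exactly 2**63 the relation is symmetric (a < b and b < a are both true), so code relying on it presumably keeps live epochs much closer together than that.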