From: "Chris M. Thomasson"
Newsgroups: comp.arch
Subject: Re: Strange asm generated by GCC...
Date: Sun, 22 Dec 2024 19:49:18 -0800

On 12/21/2024 2:37 AM, aph@littlepinkcloud.invalid wrote:
> jseigh wrote:
>> On 12/19/24 19:43, Chris M. Thomasson wrote:
>>> Why in the world would GCC use an XCHG instruction for the following
>>> code. The damn XCHG has an implied LOCK prefix! Yikes!
>>>
>>
>> Speaking of strange code
>>
>> #include <atomic>
>>
>> bool test1(std::atomic<int> var, int addend)
>> {
>>     int expected = var.load(std::memory_order_relaxed);
>>     int update = expected + addend;
>>     return var.compare_exchange_weak(expected, update,
>>         std::memory_order_acq_rel, std::memory_order_seq_cst);
>> }
>>
>> This is asm for armv8-a clang 9.0.0
>>
>> test1(std::atomic<int>, int):
>>         ldr     w8, [x0]
>>         ldaxr   w9, [x0]
>>         cmp     w9, w8
>>         b.ne    .LBB0_3
>>         add     w8, w8, w1
>>         stlxr   w9, w8, [x0]
>>         cbz     w9, .LBB0_4
>>         mov     w0, wzr
>>         ret
>> .LBB0_3:
>>         clrex
>>         mov     w0, wzr
>>         ret
>> .LBB0_4:
>>         mov     w0, #1
>>         ret
>>
>> I picked a version that just did ll/sc to avoid
>> the question of whether a failed CASAL did a store or not.
>>
>> I don't see anything that forces a store memory barrier
>> on all the fail paths. I could be missing something.
>
> Why would there be one?
> If the store does not take place, there's no
> need for a memory barrier because there's no store for anyone to
> synchronize with. The only effect of a failed weak CAS is a load. If
> you really need a store on failure because of its side effect you can
> always add one.

IIRC, having separate membars for the success and failure paths can be
"useful" when popping from a lock-free stack. WRT the C++ API, a failed
CAS hands you the updated value: it writes what it observed back into
expected. So, there is a load. Depending on what you are doing with that
value, the load might require an acquire.
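
To make the pop case concrete, here is a minimal sketch, not anyone's production code. Node, Stack, push and pop are hypothetical names made up for illustration, and ABA and memory reclamation are ignored on purpose (nodes are never freed). The point is only that a failed compare_exchange_weak reloads expected, and pop then dereferences that pointer, so the failure ordering is acquire here rather than relaxed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical types for illustration only. ABA and node reclamation
// are deliberately ignored: nothing is ever freed in this sketch.
struct Node {
    int value;
    Node* next;
};

struct Stack {
    std::atomic<Node*> head{nullptr};

    void push(Node* n) {
        n->next = head.load(std::memory_order_relaxed);
        // On failure the CAS writes the current head into n->next, so the
        // loop simply retries. Push never dereferences the old head, so a
        // relaxed failure ordering is enough. Release on success publishes
        // n->value and n->next to whoever later pops the node.
        while (!head.compare_exchange_weak(n->next, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
        }
    }

    Node* pop() {
        Node* expected = head.load(std::memory_order_acquire);
        while (expected != nullptr) {
            // This read of expected->next is why the load that produced
            // expected needs acquire semantics.
            Node* next = expected->next;
            if (head.compare_exchange_weak(expected, next,
                                           std::memory_order_acquire,
                                           std::memory_order_acquire)) {
                return expected;
            }
            // Failure: expected now holds the head the CAS observed,
            // loaded with the failure ordering (acquire), and the loop
            // dereferences it on the next iteration.
        }
        return nullptr;
    }
};
```

Single-threaded usage is just push(&node) followed by pop(), which returns nodes in LIFO order and nullptr once the stack is empty. If the failure ordering in pop were relaxed, the retry would read expected->next without having synchronized with the push that published that node.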