From: "Chris M. Thomasson"
Newsgroups: comp.arch
Subject: Re: Address space limits
Date: Mon, 6 May 2024 20:39:02 -0700
Organization: A noiseless patient Spider

On 5/6/2024 7:04 PM, MitchAlsup1 wrote:
> Chris M. Thomasson wrote:
>
>> On 5/6/2024 12:15 PM, MitchAlsup1 wrote:
>>> Terje Mathisen wrote:
>>>
>>>> MitchAlsup1 wrote:
>>>>> Chris M. Thomasson wrote:
>>>>>
>>>>>> On 5/5/2024 3:25 PM, MitchAlsup1 wrote:
>>>>>>> Chris M. Thomasson wrote:
>>>>>>>
>>>>>>>> On 5/4/2024 5:12 PM, MitchAlsup1 wrote:
>>>>>>>>> Chris M. Thomasson wrote:
>>>>>>>>>
>>>>>>>>>> On 5/4/2024 3:18 AM, Thomas Koenig wrote:
>>>>>>>>>>> Lawrence D'Oliveiro wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Intel pushed this thing called the “x32” ABI into the Linux
>>>>>>>>>>>> kernel (and possibly some other places) some years ago. This
>>>>>>>>>>>> was using the AMD64 instruction set, but with only 32-bit
>>>>>>>>>>>> pointers.
>>>>>>>>>>>> This way, you got the benefit of the extra registers,
>>>>>>>>>>>> without the overhead of the longer addresses.
>>>>>>>>>>>
>>>>>>>>>>> That was Donald Knuth's idea.
>>>>>>>>>
>>>>>>>>>> Storing metadata in actual pointers, aka aligned on a larger
>>>>>>>>>> boundary, is critical to many advanced lock/wait free
>>>>>>>>>> algorithms as well. I remember storing an actual reference
>>>>>>>>>> count in pointers before for a special type of counting.
>>>>>>>>>
>>>>>>>>> Even if one has multi-location ATOMICs ?? (as a single event ??)
>>>>>>>
>>>>>>>> This was a technique for storing data in a pointer. For
>>>>>>>> instance, for strong atomic reference counting we need to update
>>>>>>>> a pointer _and_ a reference count together atomically. This can
>>>>>>>> easily be done with DWCAS, or double width compare and swap. So,
>>>>>>>> on a 32 bit system we need 64 bit cas, for a 64 bit system we
>>>>>>>> need 128 bit cas. However, sometimes we can pack the reference
>>>>>>>> count in the pointer value itself if it's aligned on a big
>>>>>>>> enough boundary. Then we can update the pointer and the
>>>>>>>> reference count using normal word based atomic RMW's.
>>>>>>>
>>>>>>> I understand why you had to pack the pointer and a chunk of data
>>>>>>> into a single container.
>>>>>>>
>>>>>>> What I don't understand is if you had easy access to
>>>>>>> multi-container ATOMICs the packing would be unnecessary--would
>>>>>>> it not ?? That is in one ATOMIC event you could update the
>>>>>>> pointer and the chunk of data independently and not NEED to
>>>>>>> store them in a single container.
>>>>>
>>>>>> Well, actually, a pessimistic word based fetch-and-add (LOCK XADD)
>>>>>> is enough to increment the counter and load a pointer atomically
>>>>>> all in one shot, loopless. Why would I need to use multi atomics
>>>>>> with a possible loop to do that?
>>>>>
>>>>> Postulate that you have a 64-bit pointer and an 8-bit chunk,
>>>>> 72 bits total.
>>>>> Further postulate that you need to update both in a single
>>>>> non-blocking ATOMIC event. ...
>>>
>>>> "Any programming problem can be solved with an additional layer of
>>>> indirection", so in this case you create a handle to that 72-bit
>>>> item, and require all access to go via the handle?
>>>
>>> I am not trying to add an additional layer of indirection, I am
>>> trying (unsuccessfully it appears) to get Chris to think outside of
>>> the one container ATOMIC box.
>
>> LOCK XADD vs a CAS loop? I prefer the former.
>
> Those are not the only options.

Show me another one that does not need a loop and is just as fast as, or
faster than, a LOCK XADD... Did you read any of the code I posted?

>
>>>
>>>> The addendum to the rule above is of course ", except the problem of
>>>> too many layers of indirections". :-)
>>>
>>>> Terje
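
For reference, here is a minimal sketch of the packed pointer-plus-count idea discussed in this thread. It is not the code posted earlier. The names (Node, PackedRef, acquire) and the 128-byte alignment are assumptions made up for illustration. It only shows the loopless acquire side: one fetch-and-add bumps the count and hands back the pointer in the same atomic step. On x86-64, compilers typically emit LOCK XADD for a fetch_add whose result is used, but that depends on the toolchain.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption: every object is aligned on a 128-byte boundary, so the low
// 7 bits of its address are zero and are free to hold a small count.
constexpr std::uintptr_t kAlign = 128;
constexpr std::uintptr_t kCountMask = kAlign - 1;

struct alignas(128) Node {
    int value;
};

// One machine word: pointer in the high bits, count in the low bits.
struct PackedRef {
    std::atomic<std::uintptr_t> word{0};

    // Publish a new pointer with the count reset to zero.
    void store(Node* p) {
        word.store(reinterpret_cast<std::uintptr_t>(p),
                   std::memory_order_release);
    }

    // A single fetch-and-add increments the count and returns the pointer
    // that was current at that instant. There is no retry loop.
    // Assumption: fewer than kAlign acquires happen before the word is
    // replaced, otherwise the count would carry into the pointer bits.
    Node* acquire() {
        std::uintptr_t old = word.fetch_add(1, std::memory_order_acquire);
        assert((old & kCountMask) != kCountMask); // count must not overflow
        return reinterpret_cast<Node*>(old & ~kCountMask);
    }

    // Current count stored in the low bits.
    std::uintptr_t count() const {
        return word.load(std::memory_order_relaxed) & kCountMask;
    }
};
```

Calling store(&n) and then acquire() twice returns &n both times and leaves count() at 2. A real scheme also needs a release path and a way to move the accumulated count somewhere else when the pointer is swapped out. Both are left out here.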