From: Michael S
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Sun, 13 Oct 2024 11:30:52 +0300
Message-ID: <20241013113052.00001b54@yahoo.com>

On Thu, 5 Sep 2024 20:08:23 -0500
BGB wrote:

> On 9/3/2024 3:40 AM, Michael S wrote:
> > On Tue, 3 Sep 2024 05:55:14 -0000 (UTC)
> > Thomas Koenig wrote:
> >
> >> Tim Rentsch wrote:
> >>
> >>> My suggestion is not to implement a language extension, but to
> >>> implement a compiler conforming to C as it is now,
> >>
> >> Sure, that was also what I was suggesting - define things that
> >> are currently undefined behavior.
> >>
> >>> with
> >>> additional guarantees for what happens in cases that are
> >>> undefined behavior.
> >>
> >> Guarantees or specifications - no difference there.
> >>
> >>> Moreover the additional guarantees are
> >>> always in effect unless explicitly and specifically requested
> >>> otherwise (most likely by means of a #pragma or _Pragma).
> >>> Documentation needs to be written for the #pragmas, but no other
> >>> documentation is required (it might be nice to describe the
> >>> additional guarantees but that is not required by the C
> >>> standard).
> >>
> >> It's the other way around - you need to describe first what the
> >> actual behavior in absence of any pragmas is, and this needs to be
> >> a firm specification, so the programmer doesn't need to read your
> >> mind (or the source code to the compiler) to find out what you
> >> meant.  "But it is clear that..." would not be a specification;
> >> what is clear to you may absolutely not be clear to anybody else.
> >>
> >> This is also the only chance you'll have of getting this
> >> implemented in one of the current compilers (and let's face it, if
> >> you want high-quality code, you would need that; both LLVM and GCC
> >> have taken an enormous amount of effort up to now, and duplicating
> >> that is probably not going to happen).
> >>
> >>> The point is to change the behavior of the compiler but
> >>> still conform to the existing ISO C standard.
> >>
> >> I understood that - defining things that are currently undefined.
> >> But without a specification, that falls down.
> >>
> >> So, let's try something that causes some grief - what should
> >> be the default behavior (in the absence of pragmas) for integer
> >> overflow?  More specifically, can the compiler set the condition
> >> to false in
> >>
> >>   int a;
> >>
> >>   ...
> >>
> >>   if (a > a + 1) {
> >>   }
> >>
> >> and how would you specify this in an unambiguous manner?
> >
> > I'd start much earlier, by declaration of "Homogeneity and
> > Exclusion". It would state that "more defined C" does not pretend
> > to cover all targets covered by existing C language.
> > Specifically, the following target characteristics are required:
> > - byte-addressable machine with 8-bit bytes
> > - two's-complement integer types
> > - if float type is supported it has to be IEEE-754 binary32
> > - if double type is supported it has to be IEEE-754 binary64
> > - if long double type is supported it has to be IEEE-754 binary128
> > - storage order for multibyte types should be either LE or BE,
> >   consistently for all built-in types
> > - flat address space
> > That part should be specified in a more formal manner.
>
> I might add a few things.
>
> ALU:
> If integer types overflow, they wrap, with any internal sign or zero
> extension consistent with the declared type;
> If a multiply overflows, the result will contain the low-order bits
> of the product, sign or zero extended according to the declared types;
> If a variable is shifted left, it will behave as-if it were sign or
> zero extended in a way consistent with the type;
> If a signed value is shifted right, its high order bits will remain
> consistent with the original sign bit.
>
>
> So, in the above example, one could see:
>   if (a > a + 1) { }
> As a hypothetical:
>   if (a > SignExtend32(a + 1)) { }
> Where SignExtend32 returns the input value sign-extended from 32 bits
> (a+1 always incrementing the value, but may conceptually either wrap
> or go outside the allowed range for 'int', with the sign extension
> always returning it to its canonical form, seen as two's complement).
>
>
> I will not define the behavior of shifts greater than or equal to the
> modulo of the integer size, or of negative shifts, as there isn't a
> consistent behavior here across targets.
>
> However, will note for shifting in a constant expression, it does
> seem to be the case, that the shift will behave as-if the width was
> unbounded, and negative shifts as a shift in the opposite direction,
> with the result then being sign or zero extended in accordance with
> the type.
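
The SignExtend32 reading of `a > a + 1` quoted above can be sketched
in portable C. This is only an illustration under stated assumptions:
the names sign_extend32 and greater_than_successor are made up for this
sketch and come from neither the quoted post nor any compiler, and the
addition is done in 64 bits so that no step relies on undefined or
implementation-defined behavior in standard C.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low 32 bits of v and reinterpret them
   as a two's-complement int32_t. */
int32_t sign_extend32(int64_t v)
{
    /* Conversion to an unsigned type is defined as reduction mod 2^32. */
    uint32_t low = (uint32_t)v;

    /* Values with the top bit clear map to themselves. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;

    /* Top bit set: subtract 2^31 in unsigned arithmetic, then add
       INT32_MIN, which avoids an out-of-range unsigned-to-signed
       conversion. */
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* The comparison from the thread under wrapping semantics: a + 1 is
   computed exactly in 64 bits and then wrapped back to 32 bits. */
int greater_than_successor(int32_t a)
{
    return a > sign_extend32((int64_t)a + 1);
}
```

Under these assumptions the condition is true only for a == INT32_MAX,
where a + 1 wraps to INT32_MIN, so a compiler following the wrapping
rule could not fold the test to a constant false.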
>
> Say, for example, zigzag sign folding:
>   int32_t i, j, k;
>   i=somevalue;
>   j=(i<<1)^(i>>31);  //fold sign into LSB
>   k=(j>>1)^((j<<31)>>31);
>   assert(k==i);
>
>
> Memory:
> One may freely cast pointers to different types and dereference them,
> regardless of types or alignment of said pointers;
> Pointers will behave as-if the memory space were a linear array of
> bytes, with each value as one or more contiguous bytes in memory;
> Structs are normally packed with each member stored sequentially in
> memory, with each member padded to its natural alignment, and the
> overall struct, if needed, padded to a multiple of the largest member
> alignment;
> The natural alignment for primitive types is equal to the
> size of said primitive type;
> The address taken of any variable will have an in-memory layout
> consistent with the declared type;
> ...
>
> Implicitly:
> Any memory store may potentially alias with any other memory access,
> unless:
>   One or both pointers has the restrict keyword;
>   It can be reasonably proven that the pointed-to memory locations do
>   not alias;
> A compiler may assume an access is aligned if it can be verified that
> no operation has caused the address to become misaligned (though, as
> a reservation, may assume that if a variable is declared restrict, it
> may also be assumed to be properly aligned for its type).
>
>
> Granted, there are targets where pointers are assumed aligned by
> default and declared unaligned, but there is no standard way in C to
> declare an unaligned pointer, and there is code that assumes the
> ability to freely de-reference pointers regardless of alignment.
>
> Though, a less conservative option would be to assume that any normal
> pointer variable is aligned by default, but may become unaligned if
> it accepts a value created by casting from a type of smaller
> alignment (or is assigned a value from a pointer holding such a
> value).
>
>   char *cs;
>   int *pi, *pj;
>   ...
>   pi=(int *)cs;  //taints pi with unaligned status.
>   ..
>   pj=pi;  //taints pj with unaligned status via pi
>
> This would still leave it as UB to pass or return a misaligned
========== REMAINDER OF ARTICLE TRUNCATED ==========