From: BGB
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Tue, 17 Sep 2024 14:51:19 -0500
Organization: A noiseless patient Spider

On 9/17/2024 4:39 AM, David Brown wrote:
> On 16/09/2024 21:46, BGB wrote:
>> On 9/16/2024 4:27 AM, David Brown wrote:
>>> On 16/09/2024 09:18, BGB wrote:
>>>> On 9/15/2024 12:46 PM, Anton Ertl wrote:
>>>>> Michael S writes:
>>>>>> Padding is another thing that should be Implementation Defined.
>>>>>
>>>>> It is.  It's defined in the ABI, so when the compiler documents to
>>>>> follow some ABI, you automatically get that ABI's structure layout.
>>>>> And if a compiler does not follow an ABI, it is practically useless.
>>>>>
>>>>
>>>> Though, there also isn't a whole lot of freedom of choice here
>>>> regarding layout.
>>>>
>>>> If member ordering or padding differs from typical expectations,
>>>> then any code which serializes structures to files is liable to
>>>> break, and this practice isn't particularly uncommon.
>>>>
>>>
>>> Your expectations here should match up with the ABI - otherwise
>>> things are going to go wrong pretty quickly.  But I think most ABIs
>>> will have fairly sensible choices for padding and alignments.
>>>
>>
>> Yeah. It is "almost fixed", as there are a lot of programs that are
>> liable to break if these assumptions differ.
>>
>>
>>>> Say, typical pattern:
>>>> Members are organized in the same order they appear in the source code;
>>>
>>> That is required by the C standards.  (A compiler can re-arrange the
>>> order if that does not affect any observable behaviour.  gcc used to
>>> have an optimisation option that allowed it to re-arrange struct
>>> ordering when it was safe to do so, but it was removed as it was
>>> rarely used and a serious PITA to support with LTO.)
>>>
>>
>> OK.
>>
>>
>>>> If the current position is not a multiple of the member's alignment,
>>>> it is padded to an offset that is a multiple of the member's alignment;
>>>
>>> That is a requirement in the C standards.
>>>
>>> The only implementation-defined option is whether or not there is
>>> /extra/ padding - and I have never seen that in practice.  (And there
>>> are more implementation-defined options for bit-fields.)
>>>
>>
>> Extra padding seems like it wouldn't have much benefit.
>
> No, generally not - which is why it would be a really strange
> implementation if it had extra padding.  It's possible that extra
> padding at the end of a struct could lead to more efficient array access
> by aligning to cache line sizes, but I think such things are better left
> to the programmer (possibly with the aid of compiler extensions) rather
> than attempting to specify them in the ABI.
>

I haven't seen much in this area. Usually just normally aligned structs, and packed structs.

>>
>> Albeit, types like _Bool in my implementation are padded to a full
>> byte (it is treated as an "unsigned char" that is assumed to always
>> hold either 0 or 1).
>
> That's the usual way to handle them.
>

Another option would be for adjacent _Bool values to merge similar to bitfields...

Though, seems that simply turning it into a byte is the typical option.

>>
>>
>>>> For primitive types, the alignment is equal to the size, which is
>>>> also a power of 2;
>>>
>>> That is the norm, up to the maximum appropriate alignment for the
>>> architecture.  A 16-bit cpu has nothing to gain by making 32-bit
>>> types 32-bit aligned.
>>>
>>
>> This comes up as an issue in some Windows file formats, where one
>> can't just naively use a struct with 32-bit fields because some 32-bit
>> members only have 16-bit alignment.
>
> Ah, the joys of using ancient formats with new systems!
>

I was around when this stuff was still newish.

Some are essentially frozen in time with their misaligned members.

Still better than:
"Well, initial field wasn't big enough";
"Repurpose those bytes from over there, and glue them on".

> My comment above was in reference to data remaining on the system,
> rather than moving off-system.
>
> If I am making a format that is accessible externally - a file format, a
> network packet, etc., - I generally make sure all types are "naturally"
> aligned up to at least 8-byte types, even if the processor's maximum
> useful alignment is much smaller.
>

Makes sense.

I usually also try to design things with everything properly aligned and (typically) any structures that are used in arrays having a power-of-2 size.

But, it seems people coming up with the file formats in the 80s/90s were a little more lax.

>>
>>>> If needed, the total size of the struct is padded to a multiple of
>>>> the largest alignment of the struct members.
>>>
>>> That is required by the C standards.
>>>
>>>>
>>>>
>>>>
>>>> For C++ classes, it is more chaotic (and more compiler dependent), but:
>>>
>>> Not really, no.
>>> Apart from a few hidden bits such as pointers to
>>> handle virtual methods and virtual inheritance, the data fields are
>>> ordered, padded and aligned just like in C structs.  And these hidden
>>> pointers follow the same rules as any other pointer.
>>>
>>> The only other special bit is empty base class optimisation, and
>>> that's pretty simple too.
>>>
>>
>> For simple cases, they may match up, like a POD class may look just
>> like an equivalent struct, or single-inheritance classes with virtual
>> methods like a struct with a vtable, etc... But in more complex cases
>> there may be compiler differences (along with differences in things
>> like name mangling, etc).
>
> I've never seen or heard of a case where there is anything
> unexpected here.
>
> Sure, different C++ implementations or ABIs might have different details
> around these hidden pointers and the way they organise their vtables.
> But they are still hidden /pointers/, and these are aligned and padded
> like any other pointer.  Even if the hidden data contained a bunch of
> extra bits, flags, etc., to handle complicated inheritance setups, these
> would still be padded and aligned like any other structs with bits,
> flags, etc.
>

OK.

I had thought you were implying that if one took two arbitrary C++ compilers (with different ABIs), with the same class definitions, that they would always end up with the same in-memory layout.

This is not my experience though, say:
The specifics of how multiple inheritance and virtual inheritance are
========== REMAINDER OF ARTICLE TRUNCATED ==========