Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Mon, 16 Sep 2024 14:46:06 -0500
Organization: A noiseless patient Spider
Lines: 163
Message-ID: <vca209$319ci$1@dont-email.me>
References: <vaqgtl$3526$1@dont-email.me> <86r09ulqyp.fsf@linuxsc.com>
 <2024Sep8.173639@mips.complang.tuwien.ac.at>
 <p1cvdjpqjg65e6e3rtt4ua6hgm79cdfm2n@4ax.com>
 <2024Sep10.101932@mips.complang.tuwien.ac.at> <ygn8qvztf16.fsf@y.z>
 <2024Sep11.123824@mips.complang.tuwien.ac.at>
 <vbsoro$3ol1a$1@dont-email.me> <867cbhgozo.fsf@linuxsc.com>
 <20240912142948.00002757@yahoo.com> <vbuu5n$9tue$1@dont-email.me>
 <20240915001153.000029bf@yahoo.com> <vc6jbk$5v9f$1@paganini.bofh.team>
 <20240915154038.0000016e@yahoo.com>
 <2024Sep15.194612@mips.complang.tuwien.ac.at>
 <vc8m5k$2nf2l$1@dont-email.me> <vc8tlj$2od19$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 16 Sep 2024 21:47:22 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="76dde52cc591d309c4359b36d46257d9";
 logging-data="3188114"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1/ROSNjC6+hM91C4By5cQrHLQBHY1zDS/U="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:FDU7EELa94S0UHafHoc5W2YbzQY=
Content-Language: en-US
In-Reply-To: <vc8tlj$2od19$3@dont-email.me>
Bytes: 8275

On 9/16/2024 4:27 AM, David Brown wrote:
> On 16/09/2024 09:18, BGB wrote:
>> On 9/15/2024 12:46 PM, Anton Ertl wrote:
>>> Michael S <already5chosen@yahoo.com> writes:
>>>> Padding is another thing that should be Implementation Defined.
>>>
>>> It is.  It's defined in the ABI, so when the compiler documents to
>>> follow some ABI, you automatically get that ABI's structure layout.
>>> And if a compiler does not follow an ABI, it is practically useless.
>>>
>>
>> Though, there also isn't a whole lot of freedom of choice here
>> regarding layout.
>>
>> If member ordering or padding differs from typical expectations, then
>> any code which serializes structures to files is liable to break, and
>> this practice isn't particularly uncommon.
>>
>
> Your expectations here should match up with the ABI - otherwise things
> are going to go wrong pretty quickly.  But I think most ABIs will have
> fairly sensible choices for padding and alignments.
>

Yeah.

It is "almost fixed", as there are a lot of programs that are liable to
break if these assumptions differ.

>> Say, typical pattern:
>> Members are organized in the same order they appear in the source code;
>
> That is required by the C standards.  (A compiler can re-arrange the
> order if that does not affect any observable behaviour.  gcc used to
> have an optimisation option that allowed it to re-arrange struct
> ordering when it was safe to do so, but it was removed as it was rarely
> used and a serious PITA to support with LTO.)
>

OK.

>> If the current position is not a multiple of the member's alignment,
>> it is padded to an offset that is a multiple of the member's alignment;
>
> That is a requirement in the C standards.
>
> The only implementation-defined option is whether or not there is
> /extra/ padding - and I have never seen that in practice.  (And there are
> more implementation-defined options for bit-fields.)
>

Extra padding seems like it wouldn't have much benefit.

I didn't originally implement bitfields "properly", but had instead
handled them by converting them to the smallest integer type with
sufficient bits.

IIRC, I had since gone and re-implemented bitfields to be more proper,
namely packing the bits LSB first into a container of the corresponding
base type (and advancing to a new container once the prior one was
filled).

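As a rough illustration of the LSB-first packing described above, here is
a minimal sketch using explicit shifts and masks rather than compiler
bitfields. The names (bit_packer, pack_field) are hypothetical, the
container is assumed to be 32 bits wide (standing in for an "unsigned int"
base type), and it assumes a field that does not fit in the remaining bits
starts a new container rather than straddling two. It is not the actual
compiler code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packer state; 'units' must be zero-initialized by the
   caller and large enough for all fields (no bounds checking here). */
struct bit_packer {
    uint32_t *units;  /* output containers */
    size_t    index;  /* container currently being filled */
    unsigned  used;   /* bits already occupied in that container */
};

/* Append the low 'width' bits of 'value', LSB first. If the field does
   not fit in what is left of the current container, advance to a new
   container and start again at bit 0. */
static void pack_field(struct bit_packer *bp, uint32_t value, unsigned width)
{
    uint32_t mask;

    assert(width >= 1 && width <= 32);
    mask = (width == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);

    if (bp->used + width > 32) {
        bp->index++;
        bp->used = 0;
    }
    bp->units[bp->index] |= (value & mask) << bp->used;
    bp->used += width;
}
```

Packing a 3-bit, a 9-bit and a 20-bit field in that order fills one 32-bit
container exactly; the next field then lands in bit 0 of the second
container.
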
Albeit, types like _Bool in my implementation are padded to a full byte
(it is treated as an "unsigned char" that is assumed to always hold
either 0 or 1).

>> For primitive types, the alignment is equal to the size, which is also
>> a power of 2;
>
> That is the norm, up to the maximum appropriate alignment for the
> architecture.  A 16-bit cpu has nothing to gain by making 32-bit types
> 32-bit aligned.
>

This comes up as an issue in some Windows file formats, where one can't
just naively use a struct with 32-bit fields because some 32-bit members
only have 16-bit alignment.

Say, one can't just express the entire BMP header as a single struct for
alignment reasons (though luckily at least, BITMAPINFOHEADER is properly
aligned, and one of those "set in stone" structures).

Though, there are variants of BMP with other versions of the header, but
they are less common (some older pre-Win-3.x formats that pretty much
nothing uses or supports, or some newer bigger headers that are only
really seen if dealing with 32-bit RGBA truecolor images and/or saving
BMPs from GIMP, *1).

One might ask, "Does RDIB avoid this issue?" No, it does not. Relatively
little software supports RDIB either (where "essentially BMP, but inside
RIFF packaging" didn't really catch on).

Sort of similar for ".MID" vs ".RMI" (where reworking the MIDI file
format to use RIFF packaging was not much of a "value added").

*1: Though, for 24/32 bit images, I more typically use TGA as a default
format, with BMP primarily for 16-color or 256-color images. For
256-color, there is PCX, which (like TGA) supports RLE compression. And
QOI, which can give almost PNG-like compression in some cases, albeit in
its common form it has some limitations similar to PCX.

Decided to leave off going too much more into this.

Still not as chaotic as some MS-DOS era formats (like the FAT filesystem)
where it is usually needed to stitch fields together from individual
bytes.

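For the BMP case, the usual workaround is the same byte-stitching
approach: read the 14-byte file header into a byte buffer and assemble
each little-endian field by hand, so the 32-bit size field at byte offset
2 never has to be accessed through a misaligned struct member. A minimal
sketch follows; the struct and function names are hypothetical, and only
the field offsets (type at 0, size at 2, reserved words at 6 and 8, pixel
data offset at 10) follow the BITMAPFILEHEADER layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble little-endian values from individual bytes, independent of
   host endianness and alignment. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical in-memory form; its layout need not match the file. */
struct bmp_file_header {
    uint16_t type;       /* 'B','M' read as little endian: 0x4D42 */
    uint32_t size;       /* file size in bytes */
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t off_bits;   /* offset of the pixel data */
};

/* Parse the 14-byte on-disk header from a raw byte buffer. */
static struct bmp_file_header parse_bmp_file_header(const uint8_t *raw)
{
    struct bmp_file_header h;

    assert(raw != NULL);
    h.type      = read_le16(raw + 0);
    h.size      = read_le32(raw + 2);   /* 32-bit field, 16-bit aligned */
    h.reserved1 = read_le16(raw + 6);
    h.reserved2 = read_le16(raw + 8);
    h.off_bits  = read_le32(raw + 10);
    return h;
}
```

The same two helpers (or big-endian counterparts) cover the FAT and
Amiga-style cases as well, at the cost of spelling out every offset.
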
Well, and there are some formats that originated on the Amiga that have
both 16-bit alignment for 32-bit fields and are big endian.

And, formats that have big endian fields "just because" (many people
thinking that "network" or "interchange" means big endian, even if nearly
everything it would be used on is likely to be little endian).

Though, for the most part, new formats being designed as conventional
structure-based file-formats died down in the 2000s, seemingly largely
displaced by aggregate files typically consisting of a ZIP based
container holding other files and usually gluing everything together with
globs of XML. Formats which don't follow this pattern are often based
around linear serialized encodings or similar.

Though, one merit of 80s/90s era formats is that they tend to have a
level of relative simplicity and can be processed in ways that don't burn
through large amounts of RAM and/or CPU power.

Well, and some cases where people don't bother to document the format,
instead treating it as if their reader/writer API implementation *is* the
file-format.

....

>> If needed, the total size of the struct is padded to a multiple of the
>> largest alignment of the struct members.
>
> That is required by the C standards.
>
>>
>>
>>
>> For C++ classes, it is more chaotic (and more compiler dependent), but:
>
> Not really, no.  Apart from a few hidden bits such as pointers to handle
> virtual methods and virtual inheritance, the data fields are ordered,
> padded and aligned just like in C structs.  And these hidden pointers
> follow the same rules as any other pointer.
>
> The only other special bit is empty base class optimisation, and that's
> pretty simple too.
>

For simple cases, they may match up, like a POD class may look just like
an equivalent struct, or single-inheritance classes with virtual methods
like a struct with a vtable, etc...

But in more complex cases there may be compiler differences (along with
differences in things like name mangling, etc).

Though, unlike with structs, programs seem less inclined to rely on the memory layout specifics of class instances.
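
For reference, the typical struct layout pattern from earlier in this
post (members in declaration order, each padded up to a multiple of its
alignment, total size rounded up to the largest member alignment) can be
checked against a given compiler with offsetof and sizeof. A minimal
sketch, using a hypothetical example struct and helper names; it assumes
C11 for _Alignof and static_assert, and that the compiler adds no extra
padding beyond what alignment requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for illustration only. */
struct example {
    uint8_t  tag;
    uint32_t value;
    uint16_t flags;
};

/* The first member always sits at offset 0. */
static_assert(offsetof(struct example, tag) == 0, "first member at offset 0");

/* Round 'offset' up to the next multiple of 'align' (a power of 2). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Walk the members in declaration order, applying the padding rule, and
   compare the predicted offsets and size with what the compiler chose.
   Returns 1 if they agree, 0 otherwise. */
static int layout_matches_prediction(void)
{
    size_t pos = 0, max_align = 1;
    size_t off_value, off_flags;

    pos = align_up(pos, _Alignof(uint8_t));
    pos += sizeof(uint8_t);
    if (_Alignof(uint8_t) > max_align) max_align = _Alignof(uint8_t);

    pos = align_up(pos, _Alignof(uint32_t));
    off_value = pos;
    pos += sizeof(uint32_t);
    if (_Alignof(uint32_t) > max_align) max_align = _Alignof(uint32_t);

    pos = align_up(pos, _Alignof(uint16_t));
    off_flags = pos;
    pos += sizeof(uint16_t);
    if (_Alignof(uint16_t) > max_align) max_align = _Alignof(uint16_t);

    /* Tail padding: total size is a multiple of the largest alignment. */
    pos = align_up(pos, max_align);

    return off_value == offsetof(struct example, value) &&
           off_flags == offsetof(struct example, flags) &&
           pos == sizeof(struct example);
}
```

Because the prediction uses the target's own _Alignof values, it should
also hold on ABIs where 32-bit types are only 16-bit aligned; the actual
offsets differ there, but the rule is the same.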