From: BGB
Newsgroups: comp.lang.c
Subject: Re: C23 thoughts and opinions
Date: Fri, 7 Jun 2024 00:51:22 -0500

On 6/6/2024 4:38 PM, Scott Lurndal wrote:
> BGB-Alt writes:
>> On 5/31/2024 4:11 PM, Scott Lurndal wrote:
>>> jak writes:
>>>> bart wrote:
>>>>> On 31/05/2024 15:34, Michael S wrote:
>>>>>> On Fri, 31 May 2024 15:04:46 +0100
>>>>>> bart wrote:
>>>>>
>>>>> Instead of one compiler, here I used two compilers, a tool 'objcopy'
>>>>> (which bizarrely needs to generate ELF format files) and lots of extra
>>>>> ugly code. I also need to disregard whatever the hell _binary_..._size
>>>>> does.
>>>>>
>>>>> But it works.
>>>>>
>>>> You could use the pe-x86-64 format instead of the elf64-x86-64 to reduce
>>>> the size of the object.
>>>
>>> By a half dozen bytes, perhaps, and only if your binutils have been
>>> built to support pe-x86-64:
>>>
>>> $ objcopy -I binary -O pe-x86-64 main.cpp /tmp/test1.o
>>> objcopy:/tmp/test1.o: Invalid bfd target
>>>
>>> The ELF64 format has a 64 byte header, the string table and the
>>> symbol table, and the remainder is the binary
>>> data. The PE header may save a few bytes by using 32-bit fields in
>>> the PE COFF header and symbol table.
>>>
>>> Note, you might want to trim your posts when replying with a one-sentence reply.
>>>
>>
>> While I can't say much for using objcopy here (it is likely to be
>> hindered by however the program was compiled and linked, in any case),
>> in some other contexts PE/COFF can save more significant amounts of
>> space vs ELF.
>>
>>
>> In particular:
>>
>> PE/COFF typically only stores symbols for imports and exports, rather
>> than for every symbol in the binary (though, IIRC, GCC+LD does tend to
>> generate PE/COFF output with every symbol present, *1, so this advantage
>> is mostly N/A if using GCC).
>
> $ man 1 strip
>
>>
>> The PE/COFF base relocation format is more compact than the ELF64
>> relocation formats:
>> ELF64 tends to spend 24 bytes for every symbol, and 24 bytes for each
>> reloc; along with an ASCII string for every symbol.
>
> Use ELF32 then.
>

Generally, using ELF32 on 64-bit targets isn't a thing...

Granted, if it were done, it could make sense. After all, this is more
or less what PE/COFF is doing. There were changes to some of the headers
for 64-bit PE32+, but all of the address fields remain as 32 bits, etc.

>>
>> It also tends to redirect most calls and loads/stores for global
>> variables through the GOT, rather than using PC-relative / RIP-relative
>> addressing (or fixed displacements relative to a Global Pointer),
>> causing the generated code to be larger (along with the size of the GOT).
>
> That has nothing to do with ELF, per se. The ELF format supports
> dynamic linking.
> It does not require it.

Generally, ELF binaries seem to come in two major variants:
  ET_EXEC: Flat static-linked binary that can only be loaded at a
    certain address;
  ET_DYN: Can be loaded at any address, but requires symbols and
    relocations, and does everything via a GOT.

If one wants the ability to load at any address, one needs PIE, which
is an ET_DYN binary with all of the dynamic linking stuff; even if the
program itself is static-linked.

And, seemingly, within the format as it exists, there is no way to make
a relocatable binary that does not have a GOT and symbol tables.

Contrast PE/COFF:
  The import/export tables and base-relocation tables exist
    independently of each other;
  The base relocations do not depend on a symbol table;
  ...

Also, in contrast to 24 byte relocation entries, the average size of a
base-reloc in PE/COFF is closer to 2 bytes (though, with 8 bytes per 4K
page, and 2 bytes for each reloc within that page).

Experimentally, I had developed a more compact variant of the base
relocs by using almost exclusively 16-bit values:
  0000: No-Op / Pad / End
  00zz: Adjust current position forward by zz pages;
  1zzz..Bzzz: Apply a base-reloc of various types.
  Czzz..Fzzz: Escape into a larger set of reloc types.
Generally, the high 4 bits give the relocation type, and the low 12
give the offset within the page.

While it was effective, for now my compiler is still using the original
format (the new format breaks compatibility with my previous loaders).

Can note that, for example, with building Doom for RV64G:
  445K, ".text"
   42K, ".rodata"
  141K, ".data"
   18K, ".got"
  skip, various smaller sections
  skip, ".bss" (1442K), not present in binary.
Dynamic stuff:
   75K, ".dynsym"
   43K, ".dynstr"
   49K, ".rela.dyn"
   32K, ".rela.plt"
   21K, ".plt"
   21K, ".hash"
   25K, ".gnu.hash"
So:
  646K, stuff that is present either way.
  266K, dynamic linking metadata.

It was smaller in its non-PIE form, but PIE kinda ruined it.
The RV64G build above is with "-ffunction-sections -fdata-sections
-Wl,-gc-sections"; otherwise it would have been bigger.

Comparison, Doom built for BJX2-XG2 (with a PE/COFF variant):
  283K, ".text"
   22K, ".strtab"
    1K, ".rodata"
    2K, ".reloc"
  142K, ".data"
  (not present in binary) 1274K, ".bss"

This is for an image that still supports base relocations, using an ABI
capable of NOMMU operation, and an ISA variant that does not use 16-bit
ops.

If I switch to "Baseline" mode (which still has 16-bit instructions),
".text" drops to around 250K.

In all of these cases, it is the same program static-linked with the
same C library.

Generally, my ISA also seems to be winning in terms of performance in my
tests.

========== REMAINDER OF ARTICLE TRUNCATED ==========