From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Tue, 26 Mar 2024 01:31:03 +0200
Organization: A noiseless patient Spider
Message-ID: <20240326023103.00004ea0@yahoo.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
 <20240320114218.151@kylheku.com>
 <20240321211306.779b21d126e122556c34a346@gmail.moc>
 <utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
 <utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
 <utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
 <utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
 <utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
 <utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
 <20240325195118.0000333a@yahoo.com> <utsemf$18477$1@dont-email.me>
 <20240326000501.00007d6d@yahoo.com> <utsq47$1atlm$1@dont-email.me>

On Mon, 25 Mar 2024 21:25:27 +0000
bart <bc@freeuk.com> wrote:

> On 25/03/2024 21:05, Michael S wrote:
> > On Mon, 25 Mar 2024 18:10:23 +0000
> > bart <bc@freeuk.com> wrote:
> >
> >> On 25/03/2024 16:51, Michael S wrote:
> >>> On Mon, 25 Mar 2024 16:06:24 +0000
> >>> bart <bc@freeuk.com> wrote:
> >>>
> >>>> On 25/03/2024 12:26, David Brown wrote:
> >>>>> On 25/03/2024 12:16, Michael S wrote:
> >>>>>> On Sun, 24 Mar 2024 23:43:32 +0100
> >>>>>> David Brown
> >>>>>> <david.brown@hesbynett.no> wrote:
> >>>>>>>
> >>>>>>> I could be wrong here, of course.
> >>>>>>>
> >>>>>>
> >>>>>> It seems, you are.
> >>>>>>
> >>>>>
> >>>>> It happens - and it was not unexpected here, as I said. I don't
> >>>>> have all these compilers installed to test.
> >>>>>
> >>>>> But it would be helpful if you had a /little/ more information.
> >>>>> If you don't know why some compilers generate binaries that have
> >>>>> memory mapped at 0x400000, and others do not, fair enough. I am
> >>>>> curious, but it's not at all important.
> >>>>>
> >>>>
> >>>> In the PE EXE format, the default image load base is specified
> >>>> in a special header in the file:
> >>>>
> >>>>   Magic:         20B
> >>>>   Link version:  1.0
> >>>>   Code size:     512      200
> >>>>   Idata size:    1024     400
> >>>>   Zdata size:    512
> >>>>   Entry point:   4096     1000 in data:0
> >>>>   Code base:     4096
> >>>>   Image base:    4194304  400000
> >>>>   Section align: 4096
> >>>>
> >>>> By convention it is at 0x40'0000 (I've no idea why).
> >>>>
> >>>> More recently, dynamic loading, regardless of what it says in the
> >>>> PE header, has become popular with linkers. So, while there is
> >>>> still a fixed value in the Image Base file, which might be
> >>>> 0x140000000, it gets loaded at some random address, usually in
> >>>> high memory above 2GB.
> >>>>
> >>>> I don't know what's responsible for that, but presumably the OS
> >>>> must be in on the act.
> >>>>
> >>>> To make this possible, both for loading above 2GB, and for
> >>>> loading at an address not known by the linker, the code inside
> >>>> the EXE must be position-independent, and have relocation info
> >>>> for any absolute 64-bit static addresses. 32-bit static
> >>>> addresses won't work.
> >>>
> >>> I don't understand why you say that EXE must be
> >>> position-independent.
> >>> I never learned PE format in depth (and
> >>> learned only absolute minimum of elf, just enough to be able to
> >>> load images in simple embedded scenario), but my impression always
> >>> was that PE EXE contains plenty of relocation info for a loader,
> >>> so it (loader) can modify (I think professional argot uses the
> >>> word 'fix') non-PIC at load time to run at any chosen position.
> >>> Am I wrong about it?
> >>
> >>
> >> A PE EXE designed to run only at the image base given won't be
> >> position-independent, so it can't be moved anywhere else.
> >>
> >> There isn't enough info to make it possible, especially before
> >> position-independent addressing modes for x64 came along (that is,
> >> using offset to the RIP instruction pointer instead of 32-bit
> >> absolute addresses).
> >>
> >> Take this C program:
> >>
> >>   int abc;
> >>   int* ptr = &abc;
> >>
> >>   int main(void) {
> >>       int x;
> >>       x = abc;
> >>   }
> >>
> >> Some of the assembly generated is this:
> >>
> >>   abc:   resb 4
> >>
> >>   ptr:   dq abc
> >>   ...
> >>   mov eax, [abc]
> >>
> >> That last reference is an absolute 32-bit address, for example it
> >> might have address 0x00403000 when loaded at 0x400000.
> >>
> >> If the program is instead loaded at 0x78230000, there is no reloc
> >> info to tell it that that particular 32-bit value, plus the 64-bit
> >> field initialising ptr, must be adjusted.
> >>
> >> RIP-relative addressing (I think sometimes called PIC), can fix
> >> that second reference:
> >>
> >>   mov eax, [rip:abc]
> >>
> >> But it only works for code, not data; that initialisation is still
> >> absolute.
> >>
> >> When a DLL is generated instead, those will need to be moved (to
> >> avoid multiple DLLs all based at the same address). In that case,
> >> base-relocation tables are needed: a list of addresses that
> >> contain a field that needs relocating, and what type and size of
> >> reloc is needed.
> >>
> >> The same info is needed for EXE if it contains flags saying that
> >> the EXE could be loaded at an arbitrary address.
> >>
> >
> > Your explanation exactly matches what I was imagining.
> > The technology for relocation of non-PIC code is already here, in
> > file format definitions and in OS loader code. The linker or the
> > part of compiler that serves the role of linker can decide to not
> > generate required tables. Operation in such mode will have small
> > benefits in EXE size and in quicker load time, but IMHO nowadays it
> > should be used rarely, only in special situations rather than serve
> > as a default of the tool.
>
> There are two aspects to be considered:
>
> * Relocating a program to a different address below 2GB
>
> * Relocating a program to any address including above 2GB
>
> The first can be accommodated with tables derived from the reloc info
> of object files.
>
> But the second requires compiler cooperation in generating code that
> will work above 2GB.
>
> Part of that can be done with RIP-relative address modes as I touched
> on. But not all; RIP-relative won't work here:
>
>     movsx rax, dword [i]
>     mov rax, [rbx*8 + abc]
>
> where the address works with registers. This requires something like:
>
>     lea rcx, [rip:abc]       # or mov rcx, abc (64-bit abs addr)
>     mov rax, [rbx*8 + rcx]
>
> This is specific to x64, but other processors will have their issues.
> Like ARM64 which doesn't even have the 32-bit displacement used with
> rip here.
>

You mean, when compiler knows that the program is loaded at low address
and when combined data segments are relatively small then compiler can
use zero-extended or sign-extended 32-bit literals to form 64-bit
addresses of static/global objects?

========== REMAINDER OF ARTICLE TRUNCATED ==========