From: BGB
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 23:45:45 -0500

On 4/28/2024 5:56 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/28/2024 2:24 PM, MitchAlsup1 wrote:
>>>
>>>> Still going to need to explain the semantics here...
>>>
>>> IP+&GOT+disp-IP is a 64-bit pointer into GOT where the external linkage
>>> pointer resides.
>
>> OK.
>
>> Not sure I follow here what exactly is going on...
>
> While I am sure I don't understand what is going on....
>
>> As noted, if I did a similar thing to the RISC-V example, but with my
>> own ISA (with the MOV.C extension):
>>      MOV.Q (PC, Disp33), R0            // What data does this access ?
>>      MOV.Q (GBR, 0), R18
>>      MOV.C (R18, R0), GBR
>
> It appears to me that you are placing an array of GOT pointers at the
> first entry of any particular GOT ?!?
>

They are not really GOTs in my PBO ABI, but rather the start of every
".data" section.

In this case, every ".data" section starts with a pointer to an array
that holds a pointer to every other ".data" section in the process
image, and every DLL is assigned an index in this array (except the main
EXE, which always has an index value of 0).

Every program instance exists relative to this array of ".data" sections.

So, say, "Process 1" will have one version of this array, "Process 2"
will have another, etc.

All of the data sections in Process 1 will point to the array for
Process 1, and all of the data sections in Process 2 will point to the
array for Process 2. And so on...

So, even if all the ".text" sections are shared between "Process 1" and
"Process 2" (with both existing within the same address space), because
the data sections are separate, each has its own set of global
variables, so the processes effectively don't see the other versions of
the program running within the shared address space.

In some sense, FDPIC is vaguely similar, but it does use GOTs, and
effectively daisy-chains all the GOTs together with all the other GOTs
(having a GOT pointer for every function pointer in the GOT).

But, as can be noted, this does add some overhead.

In my case, I had wanted to do something similar to FDPIC in the sense
of allowing multiple instances without needing to duplicate the
read-only sections. But, I also wanted a lower performance overhead.

> Whereas My 66000 uses IP relative access to the GOT the linker (or
> LD.so) setup avoiding the indirection.
> Then My 66000 does not have or need a pointer to GOT since it can
> synthesize such a pointer at link time and then just use a IP relative
> plus DISP32 to access said GOT.
>

This approach works so long as one has a one-to-one mapping between
loaded binaries and their associated sets of global variables (or if
each mapping exists in its own address space).

Doesn't work so well for a many-to-one mapping within a shared address
space.
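
As a rough illustration of the per-process array of ".data" sections described above, here is a minimal sketch in plain C. It only models the idea; all of the names (DataSection, Process, switch_data, dll_bump) and the two-image setup are made up for the example and are not taken from the actual PBO ABI or its instruction sequences.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_IMAGES 2   /* assumption: index 0 = main EXE, index 1 = one DLL */

/* Hypothetical layout: the first word of each ".data" section points
 * at the owning process's table of data sections. */
typedef struct DataSection {
    struct DataSection **table;  /* per-process array of data sections */
    int counter;                 /* stand-in for this image's globals */
} DataSection;

/* One process instance: its own table and its own copies of the data
 * sections. The code (".text") below is shared by all instances. */
typedef struct {
    DataSection *table[NUM_IMAGES];
    DataSection  sections[NUM_IMAGES];
} Process;

static void process_init(Process *p) {
    for (int i = 0; i < NUM_IMAGES; i++) {
        p->table[i] = &p->sections[i];      /* table entry -> section */
        p->sections[i].table = p->table;    /* section -> its own table */
        p->sections[i].counter = 0;
    }
}

/* Loose analogue of reloading the global base register: start from the
 * current data section, fetch the table pointer stored at its start,
 * then index by the target image's assigned index. */
static DataSection *switch_data(DataSection *gbr, int image_index) {
    assert(image_index >= 0 && image_index < NUM_IMAGES);
    return gbr->table[image_index];
}

/* Shared code for the "DLL" (image index 1). It never uses a
 * code-relative address to find its globals, only the caller's base. */
static int dll_bump(DataSection *caller_gbr) {
    DataSection *gbr = switch_data(caller_gbr, 1);
    return ++gbr->counter;
}
```

Calling dll_bump() with the main-EXE data section of two separately initialized Process objects updates two different counters, even though the same function body runs in both cases. A code-relative lookup could not make that distinction, since the code address is the same for both instances.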

So, say, if you only have one instance of a binary, getting the GOT or
data sections relative to PC/IP can work.

But, with multiple instances, it does not work. The data sections can
only be relative to the other data sections (or to the process context).

Like, say, if you wanted to support a multitasking operating system on
hardware that doesn't have either virtual memory or segments. Or, if one
does have virtual memory, but wants to keep it as optional.

Say, for example, uClinux...

> So, say we have some external variables::
>
>     extern uint64_t fred, wilma, barney, betty;
>
> AND we postulate that the linker found all 4 externs in the same module
> so that it can access them all via 1 pointer. The linker assigns an
> index into GOT and setups a relocation to that memory segment and when
> LD.so runs, it stores a proper pointer in that index of GOT, call this
> index fred_index.
>
> And we access one of these::
>
>     if( fred at_work )
>
> The compiler will obtain the pointer to the area fred is positioned via:
>
>     LDD    Rfp,[IP,,#GOT+fred_index<<3]        // *
>

And, the above is where the problem lies...

Would be valid for ELF PIC or PIE binaries, but is not valid for PBO or
FDPIC.

> and from here one can access barney, betty and wilma using the pointer
> to fred and standard offsetting.
>
>     LDD    Rfred,[Rfp,#0]     // fred
>     LDD    Rbarn,[Rfp,#16]    // barney
>     LDD    Rbett,[Rfp,#24]    // betty
>     LDD    Rwilm,[Rfp,#8]     // wilma
>
> These offsets are known at link time and possibly not at compile time.
>
> (*) if the LDD through GOT takes a page fault, we have a procedure setup
> so LD.so can run figure out which entry is missing, look up where it is
> (possibly load and resolve it) and insert the required data into GOT.
> When control returns to LDD, the entry is now present, and we now have
> access to fred, wilma, barney and betty.
>

Yeah.

>> Differing mostly in that it doesn't require base relocs.
>
>> The normal version in my case avoids the extra memory load, but uses a
>> base reloc for the table index.
>
>> ....
>
> {{ // this looks like stuff that should be accessible to LD.so
>
>> Though, the reloc format is at least semi-dense, eg, for a block of
>> relocs:
>>    { DWORD rvaPage;   //address of page (4K)
>>      DWORD szRelocs;  //size of relocs in block
>>    }
>> With each reloc encoded as a 16-bit entry:

========== REMAINDER OF ARTICLE TRUNCATED ==========