<vkgk0u$2bh1n$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: transpiling to low level C
Date: Wed, 25 Dec 2024 03:41:41 -0600
Organization: A noiseless patient Spider
Lines: 161
Message-ID: <vkgk0u$2bh1n$1@dont-email.me>
References: <vjlh19$8j4k$1@dont-email.me>
 <vjn9g5$n0vl$1@raubtier-asyl.eternal-september.org>
 <vjnhsq$oh1f$1@dont-email.me> <vjnq5s$pubt$1@dont-email.me>
 <vjpn29$17jub$1@dont-email.me> <86ikrdg6yq.fsf@linuxsc.com>
 <vk78it$77aa$1@dont-email.me> <vk8a0e$l8sq$1@paganini.bofh.team>
 <vk9q1p$oucu$1@dont-email.me> <vkb81n$14frj$1@dont-email.me>
 <20241223134008.000058cf@yahoo.com> <86frmedrof.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 25 Dec 2024 10:41:50 +0100 (CET)
Injection-Info: dont-email.me; posting-host="d0b6fc13f6ed5769626df891f2167fc1";
	logging-data="2475063"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18O4qKxrTT2Inpq4G3Buc0sCMIFc0ittXk="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:eqnXihhOufc2y/JF11QFiNZSprs=
Content-Language: en-US
In-Reply-To: <86frmedrof.fsf@linuxsc.com>
Bytes: 8176

On 12/23/2024 3:18 PM, Tim Rentsch wrote:
> Michael S <already5chosen@yahoo.com> writes:
> 
>> On Mon, 23 Dec 2024 09:46:46 +0100
>> David Brown <david.brown@hesbynett.no> wrote:
>>
>>> And Tim did not rule out using the standard library,
>>
>> Are you sure?
> 
> I explicitly called out setjmp and longjmp as being excluded.
> Based on that, it's reasonable to infer the rest of the
> standard library is allowed.
> 
> Furthermore I don't think it matters.  Except for a very small
> set of functions -- eg, fopen, fgetc, fputc, malloc, free --
> everything else in the standard library either isn't important
> for Turing Completeness or can be synthesized from the base
> set.  The functionality of fprintf(), for example, can be
> implemented on top of fputc and non-library language features.


If I were to choose a set of primitive functions, probably:
   malloc/free and/or realloc
     could define, say:
       malloc(sz) => realloc(NULL, sz)
       free(ptr) => realloc(ptr, 0)
     Maybe _msize and _mtag/..., but this is non-standard.
       With _msize, can implement realloc on top of malloc/free.
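
A minimal sketch of both directions described above. All names here (prim_realloc, my_malloc, blk_msize, ...) are hypothetical. Note that standard realloc does not promise that realloc(ptr, 0) frees the block (size 0 was implementation-defined before C23 and is undefined in C23), so the sketch handles that case explicitly. The second half stands in for _msize by keeping the size in a header in front of each block; it is an assumption of this sketch, not how any particular runtime does it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Direction 1: a single realloc-style primitive, with malloc/free on top.
 * (NULL, sz) allocates; (ptr, 0) frees and returns NULL. */
static void *prim_realloc(void *ptr, size_t sz)
{
    if (sz == 0) {
        free(ptr);               /* free(NULL) is a no-op */
        return NULL;
    }
    return realloc(ptr, sz);     /* realloc(NULL, sz) acts like malloc(sz) */
}

static void *my_malloc(size_t sz) { return prim_realloc(NULL, sz); }
static void  my_free(void *ptr)   { (void)prim_realloc(ptr, 0); }

/* Direction 2: realloc built from malloc/free plus a size query.
 * The header union keeps the user block suitably aligned. */
typedef union { size_t size; max_align_t align; } blk_hdr;

static void *blk_malloc(size_t sz)
{
    blk_hdr *h = malloc(sizeof *h + sz);
    if (!h) return NULL;
    h->size = sz;
    return h + 1;                /* user data starts after the header */
}

static void blk_free(void *p)
{
    if (p) free((blk_hdr *)p - 1);
}

/* Stand-in for the non-standard _msize. */
static size_t blk_msize(void *p)
{
    return ((blk_hdr *)p - 1)->size;
}

static void *blk_realloc(void *p, size_t sz)
{
    if (!p) return blk_malloc(sz);
    if (sz == 0) { blk_free(p); return NULL; }

    size_t old = blk_msize(p);
    void *q = blk_malloc(sz);
    if (!q) return NULL;         /* original block is left intact */
    memcpy(q, p, old < sz ? old : sz);
    blk_free(p);
    return q;
}
```

The blk_* functions only work with each other's pointers; passing a pointer from plain malloc to blk_realloc would read a header that is not there.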

For basic IO:
   fopen, fclose, fseek, fread, fwrite

printf could be implemented on top of vsnprintf and fputs
   fputs can be implemented on top of fwrite (via strlen).
   With a temporary buffer being used for the printed string.
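
The layering above can be sketched roughly as follows. The my_* names are hypothetical, and the 256-byte stack buffer with a heap fallback is just one way to handle the temporary buffer. Since the string goes through strlen, output containing an embedded NUL (say, from "%c" with a 0 argument) would be cut short; a real implementation would pass the length from vsnprintf to fwrite directly.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* fputs in terms of fwrite: the length comes from strlen. */
static int my_fputs(const char *s, FILE *fp)
{
    size_t n = strlen(s);
    return fwrite(s, 1, n, fp) == n ? 0 : EOF;
}

/* Format into a temporary buffer with vsnprintf, then emit via my_fputs.
 * Returns the number of characters written, or -1 on error. */
static int my_vfprintf(FILE *fp, const char *fmt, va_list ap)
{
    char small[256];
    char *buf = small;
    va_list ap2;
    int n;

    va_copy(ap2, ap);            /* a second pass may be needed */
    n = vsnprintf(small, sizeof small, fmt, ap);
    if (n >= 0 && (size_t)n >= sizeof small) {
        /* Did not fit: allocate exactly enough and format again. */
        buf = malloc((size_t)n + 1);
        if (buf)
            vsnprintf(buf, (size_t)n + 1, fmt, ap2);
        else
            n = -1;
    }
    va_end(ap2);

    if (n >= 0 && my_fputs(buf, fp) == EOF)
        n = -1;
    if (buf != small)
        free(buf);               /* free(NULL) is fine */
    return n;
}

static int my_fprintf(FILE *fp, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = my_vfprintf(fp, fmt, ap);
    va_end(ap);
    return n;
}

/* printf is then just the stdout case. */
static int my_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = my_vfprintf(stdout, fmt, ap);
    va_end(ap);
    return n;
}
```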

....


Though, one may still end up with various other stuff over the interface 
as well. Though, the interface can be made open-ended if one has a 
GetInterface call or similar, which can request other interfaces given 
an ID, such as, FOURCC/EIGHTCC pair, a SIXTEENCC, or GUID (*1). IMHO, 
generally preferable over a "GetProcAddress" mechanism due to lower 
overheads; though, with an annoyance that interface vtables generally 
have a fixed layout (generally can't really add or change anything 
without creating binary compatibility issues; so a lot of 
tables/structures need to be kept semi-frozen).
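
A rough sketch of what such a GetInterface call could look like, assuming a simple static table keyed by a (major, minor) ID pair. The signature, the FOURCC macro's byte order, and the ExampleVt interface are all made up for illustration; the post does not specify them.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into the low 32 bits (first char in the low byte). */
#define FOURCC(a, b, c, d) \
    ((uint64_t)(a) | ((uint64_t)(b) << 8) | \
     ((uint64_t)(c) << 16) | ((uint64_t)(d) << 24))

/* Hypothetical interface. Once published, the member order is frozen:
 * callers index into this table by position, not by name. */
typedef struct {
    int (*Version)(void);
} ExampleVt;

static int example_version(void) { return 1; }

static ExampleVt  example_vt  = { example_version };
static ExampleVt *example_obj = &example_vt;   /* handed out as ExampleVt ** */

typedef struct {
    uint64_t major, minor;
    void    *iface;
} IfEntry;

static const IfEntry if_table[] = {
    { FOURCC('E','X','M','P'), FOURCC('V','0','0','1'), &example_obj },
};

/* Look up an interface by ID pair; returns NULL if it is not provided. */
static void *GetInterface(uint64_t qwMajor, uint64_t qwMinor)
{
    for (size_t i = 0; i < sizeof if_table / sizeof if_table[0]; i++) {
        if (if_table[i].major == qwMajor && if_table[i].minor == qwMinor)
            return if_table[i].iface;
    }
    return NULL;
}
```

The lookup cost is paid once per interface rather than once per function, which is where the lower overhead relative to a per-symbol lookup would come from.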

Though, APIs like DirectX had dealt with the issue of having version 
numbers for vtables and then one requests a specific version of the 
vtable (within the range of versions supported by the major version of 
DirectX). But, this is crufty.

*1: Say: QWORD qwMajor, QWORD qwMinor.
   qwMajor:
     Major ID (FOURCC, EIGHTCC)
     Or: First 8 bytes of SIXTEENCC or GUID
   qwMinor:
     SubID/Version (FOURCC or EIGHTCC)
     Second 8 bytes of SIXTEENCC or GUID.
   Where:
     High 32 bits are 0, assume FOURCC.
     Else, look at bits to determine EIGHTCC vs GUID.
     Assume if both are EIGHTCC, value represents a SIXTEENCC.
     Bit patterns for valid SIXTEENCCs vs GUIDs are mutually exclusive.
     Names make more sense for public interfaces.
       Leaving GUIDs mostly for private/internal interfaces.
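
A sketch of how the classification in *1 might be coded. The rule "high 32 bits are 0, assume FOURCC" is from the text; the test used here to tell an EIGHTCC from half of a GUID (all eight bytes printable ASCII) is purely an assumption, as is the little-endian packing, since the actual bit patterns are not given. Names shorter than eight characters would need padding (for example with spaces) to count as an EIGHTCC under this rule.

```c
#include <stdint.h>

typedef enum { ID_FOURCC, ID_EIGHTCC, ID_BINARY } id_kind;
typedef enum { PAIR_ID_SUBID, PAIR_SIXTEENCC, PAIR_GUID } pair_kind;

/* Pack up to 8 characters, first character in the low byte (assumed order). */
static uint64_t id_pack(const char *s)
{
    uint64_t v = 0;
    for (int i = 0; i < 8 && s[i]; i++)
        v |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return v;
}

/* Assumed rule: a word is a character code if every byte is printable ASCII. */
static int all_printable(uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        unsigned c = (unsigned)(v >> (8 * i)) & 0xFFu;
        if (c < 0x20 || c > 0x7E)
            return 0;
    }
    return 1;
}

static id_kind id_classify_word(uint64_t v)
{
    if ((v >> 32) == 0)
        return ID_FOURCC;                     /* rule from the text */
    return all_printable(v) ? ID_EIGHTCC : ID_BINARY;
}

/* Combine qwMajor/qwMinor: two EIGHTCCs form a SIXTEENCC; any
 * non-character half means the pair is treated as a GUID; anything
 * else is a major ID plus a SubID/version. */
static pair_kind id_classify_pair(uint64_t qwMajor, uint64_t qwMinor)
{
    id_kind a = id_classify_word(qwMajor);
    id_kind b = id_classify_word(qwMinor);

    if (a == ID_BINARY || b == ID_BINARY)
        return PAIR_GUID;
    if (a == ID_EIGHTCC && b == ID_EIGHTCC)
        return PAIR_SIXTEENCC;
    return PAIR_ID_SUBID;
}
```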

Well, this is unlike Windows, which uses GUIDs for pretty much 
everything here (and also, I didn't bother with an IDL compiler; 
generally doing all this directly in C).


Well, and some wonk, like the exact contents of structures like 
BITMAPINFOHEADER being interpreted based on using biSize as a magic 
number (well, sometimes with other stuff glued onto the end, as 
understood based on the use of the biCompression field), ...

But, it has held up well, this structure being almost as old as I am...



In a few cases, one might also take the option of using a "DriverProc()" 
style interface, where one provides a pair of context-dependent pointers 
and uses magic numbers to identify the desired operation, or, intermediate:
   (*ifvt)->QueryProc(ifvt, iHdl, lParm, pParm1, pParm2);
   (*ifvt)->ModifyProc(ifvt, iHdl, lParm, pParm1, pParm2);

Where, QueryProc is intended for non-destructive operations, and 
ModifyProc for destructive operations.
   iHdl: Context-dependent integer handle;
   lParm: Magic command number.
   pParm1/pParm2: Magic pointers, often:
     pParm1: Input data address;
     pParm2: Output data address.

Where, vtable is usually provided in "VT **" form, hence the need to 
deref the table before a method can be invoked.
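
A small sketch of this calling pattern, assuming an object whose first member is the vtable pointer (so a pointer to the object doubles as the "VT **"). The IfVt layout, the Counters object, and the CMD_* command numbers are hypothetical; only the call shape and the parameter roles come from the description above.

```c
#include <stddef.h>

typedef struct IfVt IfVt;
struct IfVt {
    /* Non-destructive operations. */
    int (*QueryProc)(IfVt **self, int iHdl, long lParm,
                     void *pParm1, void *pParm2);
    /* Destructive operations. */
    int (*ModifyProc)(IfVt **self, int iHdl, long lParm,
                      void *pParm1, void *pParm2);
};

/* Example object: the vtable pointer comes first, so (IfVt **)&obj
 * and &obj.vt are the same address. */
typedef struct {
    IfVt *vt;
    int   slots[4];
} Counters;

enum { CMD_GET = 1, CMD_SET = 2 };   /* made-up magic command numbers */

static int counters_query(IfVt **self, int iHdl, long lParm,
                          void *pParm1, void *pParm2)
{
    Counters *c = (Counters *)self;
    (void)pParm1;                    /* no input data for CMD_GET */
    if (iHdl < 0 || iHdl >= 4 || lParm != CMD_GET || !pParm2)
        return -1;
    *(int *)pParm2 = c->slots[iHdl]; /* pParm2: output data address */
    return 0;
}

static int counters_modify(IfVt **self, int iHdl, long lParm,
                           void *pParm1, void *pParm2)
{
    Counters *c = (Counters *)self;
    (void)pParm2;                    /* no output data for CMD_SET */
    if (iHdl < 0 || iHdl >= 4 || lParm != CMD_SET || !pParm1)
        return -1;
    c->slots[iHdl] = *(const int *)pParm1;  /* pParm1: input data address */
    return 0;
}

static IfVt     counters_vt = { counters_query, counters_modify };
static Counters g_counters  = { &counters_vt, { 0, 0, 0, 0 } };

/* Hand the object out in "VT **" form. */
static IfVt **counters_get(void)
{
    return &g_counters.vt;
}
```

New operations can be added by defining new command numbers without touching the vtable layout, at the cost of losing type checking on the parameters.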



Actually, some of this overlaps with how I had implemented the C library 
for DLLs in my project:
Only the main binary has the full C library;
DLLs generally use a C library which calls back to the main C library 
via a COM style interface (things like malloc/free and stdio calls are 
routed over this interface).
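
As a rough illustration of the arrangement, the sketch below shows a DLL-side stub C library forwarding malloc/free to the main binary's C library through a vtable. Everything named here (HostLibcVt, dll_libc_init, dll_malloc, ...) is hypothetical, and both sides are shown in one file for brevity; in practice they would live in separate binaries, with the interface pointer handed over at load time.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical interface exported by the main binary's C library. */
typedef struct HostLibcVt HostLibcVt;
struct HostLibcVt {
    void *(*Malloc)(HostLibcVt **self, size_t sz);
    void  (*Free)(HostLibcVt **self, void *ptr);
};

/* ---- main-binary side: thin adapters over the real C library ---- */
static int host_malloc_calls;        /* only here to show the routing */

static void *host_malloc(HostLibcVt **self, size_t sz)
{
    (void)self;
    host_malloc_calls++;
    return malloc(sz);
}

static void host_free(HostLibcVt **self, void *ptr)
{
    (void)self;
    free(ptr);
}

static HostLibcVt  host_vt  = { host_malloc, host_free };
static HostLibcVt *host_obj = &host_vt;

/* ---- DLL side: stub C library holding only the interface pointer ---- */
static HostLibcVt **dll_libc;

/* Called once when the DLL is loaded. */
static void dll_libc_init(HostLibcVt **host)
{
    dll_libc = host;
}

/* What the DLL's own malloc/free would boil down to. */
static void *dll_malloc(size_t sz)
{
    return (*dll_libc)->Malloc(dll_libc, sz);
}

static void dll_free(void *ptr)
{
    (*dll_libc)->Free(dll_libc, ptr);
}
```

Because every allocation ends up in the same heap, memory allocated in a DLL can be freed by the main binary (or another DLL) without the usual mismatched-heap problems.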

Note that this is partly because in my case:
1, DLLs only allow an acyclic dependency graph;
2, The mechanism does not currently allow sharing global variables;
3, There was a desire to allow dlopen/dlsym to dynamically load libraries.

1 & 3 mean that if a statically-linked C library is used for the main 
binary:
One needs to also statically link a C library to each DLL;
The C library needs to operate over a COM interface for shared interfaces.

Or, alternatively, that only a DLL may be used for the C library, and 
all DLLs would need to use the same C library DLL.


Note that neither 1 nor 2 traditionally apply with ELF Shared Objects 
(which usually both share everything and allow for cyclic dependency 
graphs). But, traditionally ELF has other drawbacks, like needing to 
access variables and call functions via a GOT (which has higher overhead 
than direct calls, or accessing global variables as a fixed offset 
relative to a known base register, ...).

Note that having the kernel inject DLLs into a running process wouldn't 
really mix well with the way glibc approaches shared objects (where, it 
manages this stuff in userland, rather than having this left up to the 
kernel's program loader).

May not matter as much though as if providing a COM-like interface, one 
doesn't necessarily actually need dlopen/dlsym to be able to see the 
symbols in the library that the interface came from.


Where, in this case, COM-like interfaces may be used in ways that 
deviate from usual dependency ordering, and are more flexible. They are 
awkward to use directly, so it may make sense to provide C API wrappers 
(thus far, usually statically linked, but they can fetch the interfaces 
they need from the main C library or the OS).

Where, in my case, the OS interface is a mix of conventional syscalls 
and object-method-calls routed over the syscall interface (the target 
being either in the kernel or in another process; or the OS might load a 
DLL into the client process and return a process-local vtable).

If non-local, generally the method pointers are generic, and serve to 
forward the call over the syscall mechanism (the syscall interface being 
used in a somewhat different way from how it would be used in something 
like Linux; where Linux generally just does not do things this way...).


....