Path: ...!3.eu.feeder.erje.net!feeder.erje.net!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: technology discussion → does the world need a "new" C ?
Date: Sat, 6 Jul 2024 23:28:36 -0500
Organization: A noiseless patient Spider
Lines: 253
Message-ID: <v6d5k0$6rk5$1@dont-email.me>
References: <v66eci$2qeee$1@dont-email.me> <v67gt1$2vq6a$2@dont-email.me> <v687h2$36i6p$1@dont-email.me> <871q48w98e.fsf@nosuchdomain.example.com> <v68dsm$37sg2$1@dont-email.me> <87plrsultu.fsf@bsb.me.uk> <v68sft$3a6lh$1@dont-email.me> <87ed87v4wi.fsf@bsb.me.uk> <v6adrm$3ljg6$1@dont-email.me> <87v81ita77.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 07 Jul 2024 06:29:54 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="f7c16b2fffd87ab0346ebf4c8e11259c"; logging-data="224901"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19qS2SlIxyi1yihx/d2uNeHn0/n2ivVHxM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:dqTfkEyXbEAvisg1mxY5Pxjh0ds=
In-Reply-To: <87v81ita77.fsf@bsb.me.uk>
Content-Language: en-US
Bytes: 10138

On 7/6/2024 5:41 PM, Ben Bacarisse wrote:
> BGB <cr88192@gmail.com> writes:
>
>> On 7/5/2024 5:40 PM, Ben Bacarisse wrote:
>>> BGB <cr88192@gmail.com> writes:
>>>
>>>> On 7/5/2024 6:20 AM, Ben Bacarisse wrote:
>>>>> BGB <cr88192@gmail.com> writes:
>
>>>>>> While eliminating structs could also simplify things; structs also tend to
>>>>>> be a lot more useful.
>>>>> Indeed.  And I'd have to use them for this!
>>>>>
>>>>
>>>> Errm, the strategy I would assume is, as noted:
>>>>    int a[4][4];
>>>>    ...
>>>>    l=a[j][k];
>>>> Becomes:
>>>>    int a[16];
>>>>    ...
>>>>    l=a[j*4+k];
>>> That's what you want to force me to write, but I can use and array of
>>> arrays despite your arbitrary ban on them by simply putting the array in
>>> a struct.
> ...
>> IN most contexts, I don't really see how a struct is preferable to a
>> multiply, but either way...
>
> And I can't see how an array of arrays is harder for your compiler than
> an array of structs.  C's indexing requires the compiler to know that
> size of the items pointed to.
>
> I suspect that there is something amiss with your design if you are
> considering this limiting in order to simplify the compiler.  A simple
> compiler should not care what kind of thing p points to in
>
>    p[i]
>
> only what size of object p points to.
>

When I designed the compiler code, the initial approach for internal type
layout was to bit-pack it into 32 bits, say (a lot of this is from memory,
so it may be wrong):
   Basic1
     (31:28): Layout of Type (0=Basic)
     (27:16): Array Size
     (15:12): Pointer Level Count
     (11: 0): Base Type
   Basic2
     (31:28): Layout of Type (1=Basic2)
     (27: 8): Array Size
     ( 7: 6): Pointer Level Count
     ( 5: 0): Base Type
   Basic3
     (31:28): Layout of Type (2=Basic3)
     (27:24): Array Size
     (23:20): Pointer Level Count
     (19: 0): Base Type
   Overflow
     (31:28): Layout of Type (3=Overflow)
     (27:24): MBZ
     (23: 0): Index into Type-Overflow Table
And, a few other cases...

Basic1 was the default, able to express arrays from 0..4095 elements, with
0..7 levels of pointer indirection, and 0..4095 for the base type.

Where, for the pointer-level field:
   0=T, 1=T*, 2=T**, ..., 7=T*******
   8=T[], 9=T[][], A=T*[], B=T*[*], C=&T, ...

Note that at present, there is no way to express more than 7 levels of
pointer indirection, but this issue hasn't come up in practice.

Basic2 is for big arrays of a primitive type, with 0..3 pointer levels. It
may only encode low-numbered primitive types.

Basic3 is the opposite, able to express a wider range of types, but only
small arrays.
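As a rough illustration of the three basic layouts above, here is a minimal C sketch that packs a (base type, pointer-level field, array size) triple into the first layout whose bit fields can hold it. The function names, the 0/1 return convention, and the order in which the layouts are tried are all assumptions made for this example; the post does not say how the actual compiler picks a layout.

```c
#include <stdint.h>

/* Layout tags stored in bits 31:28, as listed in the post. */
enum { LAYOUT_BASIC1 = 0, LAYOUT_BASIC2 = 1, LAYOUT_BASIC3 = 2 };

/*
 * Hypothetical helper: pack a type into Basic1, Basic2 or Basic3.
 * 'ptr' is the raw pointer-level field value, not a decoded level.
 * Returns 1 and writes *out on success; returns 0 when none of the
 * basic layouts fit, i.e. the type would need an overflow-table entry.
 */
static int type_pack(uint32_t base, uint32_t ptr, uint32_t arr,
                     uint32_t *out)
{
    /* Basic1: 12-bit array size, 4-bit pointer field, 12-bit base. */
    if (arr <= 0xFFFu && ptr <= 0xFu && base <= 0xFFFu) {
        *out = ((uint32_t)LAYOUT_BASIC1 << 28) | (arr << 16)
             | (ptr << 12) | base;
        return 1;
    }
    /* Basic2: 20-bit array size, 2-bit pointer field, 6-bit base. */
    if (arr <= 0xFFFFFu && ptr <= 0x3u && base <= 0x3Fu) {
        *out = ((uint32_t)LAYOUT_BASIC2 << 28) | (arr << 8)
             | (ptr << 6) | base;
        return 1;
    }
    /* Basic3: 4-bit array size, 4-bit pointer field, 20-bit base. */
    if (arr <= 0xFu && ptr <= 0xFu && base <= 0xFFFFFu) {
        *out = ((uint32_t)LAYOUT_BASIC3 << 28) | (arr << 24)
             | (ptr << 20) | base;
        return 1;
    }
    return 0;
}

/* Extract the layout tag (bits 31:28) from a packed type. */
static uint32_t type_layout(uint32_t t)
{
    return t >> 28;
}
```

Under these assumptions, an array of 16 pointers to Int (base 0, pointer field 1, size 16) fits in Basic1, an array of 100000 Ints falls through to Basic2, and a 4-element array of a type numbered 5000 lands in Basic3.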
There is another variant of Basic1 that splits the Array Size field in
half, with a smaller array limit, but able to encode
const/volatile/restrict/etc (but only in certain combinations).

Overflow would be used if the type couldn't fit into one of the above; the
type is then expressed in a table. It is avoided when possible, as
overflow table entries are comparatively expensive.

Type Numbering space:
        0..     63: Primitive Types, Higher priority
       64..    255: Primitive Types, Lower priority
      256..   4095: Complex Types, Index into Literals Table
     4096..1048575: Complex Types, Index into Literals Table

Small numbered base types were higher priority:
   00=Int, 01=Long(64bit), 02=Float, 03=Double,
   04=Ptr(void*), 05=Void, 06=Struct(Abstract), 07=NativeLong
   08=SByte, 09=UByte, 0A=Short, 0B=UShort,
   0C=UInt, 0D=ULong, 0E=UNativeLong, 0F=ImplicitInt
Followed by, say:
   10=Int128, 11=UInt128, 12=Float128/LongDouble, 13=Float16, ...

Where, Type Number 256 would map to index 0 in the Literals Table.

An index into the Literals Table will generally be used to encode a Struct
or Function Pointer or similar. This table will hold a structure
describing the fields of a struct, or the arguments and return value of a
function pointer (in my BS2 language, it may also define class members, a
superclass, implemented interfaces, ...).

It could also be used to encode another type, which was needed for things
like multidimensional arrays and some other complex types. But, this
seemed like an ugly hack... (And was at odds with how I imagined types
working, but seemed like a technical necessity).

These 32-bit type values would often be packed inside of a 64-bit
register/variable descriptor.
   Local Variables:
     (63:56): Descriptor Type
     (55:24): Variable Type
     (23:12): Sequence Number
     (11: 0): Identity Number
   Global Variables:
     (63:56): Descriptor Type
     (55:24): Variable Type
     (23: 0): Index into Global Table
   Integer Literal:
     (63:56): Descriptor Type
     (55:32): Compact Type
     (31: 0): Value
   String Literal:
     (63:56): Descriptor Type
     (55:32): Compact Type
     (31: 0): Offset into String Table

There were various other descriptor types, representing larger integer and
floating point types:
   Long and Double literals, representing the value as 56 bits
     (low 8 bits cut off for Double);
   An index into a table of raw 64-bit values (if it can't be expressed
     directly as one of the other options).

Values for 128-bit types were expressed as an index pair into the table of
64-bit values:
   (63:56): Descriptor Type
   (55:48): Primitive Type
   (47:24): Index into Value Table (High 64 bits)
   (23: 0): Index into Value Table (Low 64 bits)

One downside, as-is, is that if a given variable is assigned more than
4096 times in a given function, it can no longer be given a unique ID
(the Sequence Number is only 12 bits). Though uncommon, this is not
entirely implausible (with sufficiently large functions), and there isn't
currently any good way to deal with this (apart from raising a compiler
error).

Taking away bits from the base ID isn't good either, as functions pushing
1000+ local variables aren't entirely implausible either (though, thus
far, I haven't really seen any with much more than around 400 local
variables, but still...).

========== REMAINDER OF ARTICLE TRUNCATED ==========
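To make the local-variable descriptor layout and its 12-bit limits concrete, here is a minimal C sketch of packing and unpacking such a descriptor. All names here (`local_desc_pack`, `SEQ_LIMIT`, and so on) and the descriptor-type value passed in are hypothetical; only the field positions come from the post. The range check stands in for the "raise a compiler error" case described above.

```c
#include <stdint.h>

#define SEQ_LIMIT 4096u  /* Sequence Number is 12 bits (23:12) */
#define ID_LIMIT  4096u  /* Identity Number is 12 bits (11:0)  */

/*
 * Hypothetical helper: build a 64-bit local-variable descriptor.
 * Returns 1 and writes *out on success; returns 0 if the sequence
 * or identity number no longer fits in its 12-bit field.
 */
static int local_desc_pack(uint8_t desc_type, uint32_t var_type,
                           uint32_t seq, uint32_t id, uint64_t *out)
{
    if (seq >= SEQ_LIMIT || id >= ID_LIMIT)
        return 0;
    *out = ((uint64_t)desc_type << 56)   /* (63:56) Descriptor Type */
         | ((uint64_t)var_type << 24)    /* (55:24) Variable Type   */
         | ((uint64_t)seq << 12)         /* (23:12) Sequence Number */
         | (uint64_t)id;                 /* (11: 0) Identity Number */
    return 1;
}

/* Field accessors for a packed local-variable descriptor. */
static uint32_t local_desc_var_type(uint64_t d)
{
    return (uint32_t)(d >> 24);          /* keeps bits 55:24 only */
}

static uint32_t local_desc_seq(uint64_t d)
{
    return (uint32_t)((d >> 12) & 0xFFFu);
}

static uint32_t local_desc_id(uint64_t d)
{
    return (uint32_t)(d & 0xFFFu);
}
```

With this split, sequence numbers 0..4095 pack fine and 4096 is rejected, which is the point where a variable can no longer be given a unique ID.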