Path: ...!feeds.phibee-telecom.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Thu, 5 Sep 2024 17:51:57 -0500
Organization: A noiseless patient Spider
Lines: 183
Message-ID: <vbdcmk$gug7$1@dont-email.me>
References: <2024Aug30.161204@mips.complang.tuwien.ac.at>
 <vb00c2$150ia$1@dont-email.me>
 <505954890d8461c1f4082b1beecd453c@www.novabbs.org>
 <vb0kh2$12ukk$1@dont-email.me> <vb3smg$1ta6s$1@dont-email.me>
 <vb4q5o$12ukk$3@dont-email.me> <vb6a16$38aj5$1@dont-email.me>
 <vb7evj$12ukk$4@dont-email.me> <vb8587$3gq7e$1@dont-email.me>
 <vb91e7$3o797$1@dont-email.me> <vb9eeh$3q993$1@dont-email.me>
 <vb9l7k$3r2c6$2@dont-email.me> <vba26l$3te44$1@dont-email.me>
 <vbag2s$3vhih$1@dont-email.me> <vS3CO.10106$kow1.6330@fx35.iad>
 <vbauo3$18sj$1@dont-email.me>
 <a2d65085c90f4d93f3d70df3121adc59@www.novabbs.org>
 <vbbo4g$8j04$2@dont-email.me> <ljt9okF86o3U1@mid.individual.net>
 <vbcbkj$bd22$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 06 Sep 2024 00:52:04 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="2a4e4224af8ce80ce0ed3e4137aaa553";
	logging-data="555527"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18TEp/yMSpq/9v9TnhoGS0ufwze1x7ab3I="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:ZdXj+FYMFcbiC9EB0TyvGtnNiB0=
In-Reply-To: <vbcbkj$bd22$2@dont-email.me>
Content-Language: en-US
Bytes: 9456

On 9/5/2024 8:27 AM, David Brown wrote:
> On 05/09/2024 10:51, Niklas Holsti wrote:
>> On 2024-09-05 10:54, David Brown wrote:
>>> On 05/09/2024 02:56, MitchAlsup1 wrote:
>>>> On Thu, 5 Sep 2024 0:41:36 +0000, BGB wrote:
>>>>
>>>>> On 9/4/2024 3:59 PM, Scott Lurndal wrote:
>>>>>
>>>>> Say:
>>>>>    long z;
>>>>>    int x, y;
>>>>>    ...
>>>>>    z=x*y;
>>>>> Would auto-promote to long before the multiply.
>>>>
>>>> I may have to use this as an example of C allowing the programmer
>>>> to shoot himself in the foot; promotion or no promotion.
>>>
>>> You snipped rather unfortunately here - it makes it look like this 
>>> was code that Scott wrote, and you've removed essential context by BGB.
>>>
>>>
>>> While I agree it is an example of the kind of code that people 
>>> sometimes write when they don't understand C arithmetic, I don't 
>>> think it is C-specific.  I can't think of any language off-hand where 
>>> expressions are evaluated differently depending on types used further 
>>> out in the expression.  Can you give any examples of languages where 
>>> the equivalent code would either do the multiplication as "long", or 
>>> give an error so that the programmer would be informed of their error?
>>
>>
>> The Ada language can work in both ways. If you just have:
>>
>>     z : Long_Integer;  -- Not a standard Ada type, but often provided.
>>     x, y : Integer;
>>     ...
>>     z := x * y;
>>
>> the compiler will inform you that the types in the assignment do not 
>> match: using the standard (predefined) operator "*", the product of 
>> two Integers gives an Integer, not a Long_Integer.
> 
> That seems like a safe choice.  C's implicit promotion of int to long 
> int can be convenient, but convenience is sometimes at odds with safety.
> 

A lot of the time, implicit promotion will be a "safer" option than first
doing an operation that overflows and then promoting.

Annoyingly, one can't really just always promote first and then do the
operation, as there may be programs that actually rely on this
particular bit of overflow behavior.
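
As a minimal sketch of the difference (the function names here are made up
for illustration, and unsigned arithmetic stands in for the wrapping case,
since signed int overflow is undefined behavior in standard C):

```c
#include <stdint.h>

/* Multiply at 32 bits; only the low 32 bits of the product survive.
   Unsigned operands are used so the wraparound is well defined
   (assumes int is 32 bits, so uint32_t is not promoted to int). */
static uint32_t mul_narrow(int32_t x, int32_t y)
{
    return (uint32_t)x * (uint32_t)y;
}

/* Widen both operands first, then multiply at 64 bits. */
static int64_t mul_widen_first(int32_t x, int32_t y)
{
    return (int64_t)x * (int64_t)y;
}
```

With x = y = 100000, mul_widen_first() gives 10000000000, whereas
mul_narrow() gives 1410065408 (the product modulo 2^32). Code that relies
on the second result is what breaks if the compiler always widens first.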

In effect, in my case, the promotion behavior ends up needing to depend 
on the language-mode (it is either this or maybe internally split the 
operators into widening or non-widening variants, which are selected 
when translating the AST into the IR stage).

Well, as opposed to dealing with the widening cases by emitting IR with
implicit casts added.
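
A rough sketch of what the "split operators" option could look like when
lowering a multiply from the AST. All of the names below (LangMode, IrOp,
select_mul_op, and so on) are hypothetical and only for illustration; this
is not the actual BGBCC IR:

```c
/* Hypothetical operand/result types, language modes, and IR opcodes. */
typedef enum { TY_INT32, TY_INT64 } TypeKind;
typedef enum { MODE_C, MODE_WIDENING } LangMode;
typedef enum {
    IR_MUL_I32,           /* 32-bit multiply, result stays 32 bits     */
    IR_MUL_I32_WIDEN_I64, /* 32x32 multiply producing a 64-bit result  */
    IR_MUL_I64            /* plain 64-bit multiply                     */
} IrOp;

/* Pick the multiply opcode from the operand types, the type the
   result is going to, and the language mode. */
static IrOp select_mul_op(LangMode mode, TypeKind lhs, TypeKind rhs,
                          TypeKind dest)
{
    if (lhs == TY_INT64 || rhs == TY_INT64)
        return IR_MUL_I64;            /* already wide in either mode */
    if (mode == MODE_WIDENING && dest == TY_INT64)
        return IR_MUL_I32_WIDEN_I64;  /* widen before multiplying    */
    return IR_MUL_I32;                /* C rules: multiply as int    */
}
```

In C mode, "z=x*y;" with int operands still selects the 32-bit multiply
(with the conversion to long applied afterwards), whereas the widening
mode picks the widening variant based on the destination type.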


> 
>> If you add this definition to the code:
>>
>>     function "*" (Left, Right : Integer) return Long_Integer
>>     is (Long_Integer(Left) * Long_Integer(Right));
>>
>> the compiler sees that there is now /also/ an Integer * Integer => 
>> Long_Integer multiplication operator, and uses that. Function 
>> overloading in Ada can depend on the type expected of the result.
>>
> 
> You can make types in C++ that have this effect, but you have to make 
> them and use them consistently.  You can't overload operators on 
> standard types like that.
> 
>> Perhaps you asked for a language that worked like this "out of the 
>> box", without the programmer having to add things like the "*" 
>> function above, and then Ada would not qualify on the second 
>> alternative (automatic lengthening before multiplication, depending on 
>> the result type desired).
>>
> 
> I asked for either, and you gave me both :-)
> 
>>
>>> (I don't count personal one-person languages here.
>>
>> While Ada has low market penetration, I don't think it quite qualifies 
>> as a one-person language -- yet :-)
>>
> 

Ironically, my language isn't itself well specified or tested, but
much of the core of the implementation is tested, in my case, because
the same code is also used for compiling C.


Well, and in some cases I went and added C23 proposed lambdas (which 
reused C++ syntax), mostly because BS2 already had lambdas (so, it was 
mostly a parser thing in this case). But, then, C23 didn't add lambdas.


But, ironically, I had ended up with two implementations of BS2:
   At first, I started prototyping it in BGBCC;
   But, then switched to a new VM for my 3D engine;
     I wanted to do "load from source" like Doom 3;
     Load from source would have been an issue with BGBCC;
     But, then still ended up mostly using static-compiled bytecode;
     Like, it adds overhead to compile stuff at startup time.
   Now, I am back mostly to BGBCC.

It is kinda sad in a way though, as the new VM did have a nicer AST
system and a better designed bytecode format. I did since make the ASTs
in BGBCC "slightly less bad", but it would be more work to redesign the
bytecode (and efforts keep fizzling out as the existing bytecode "works
good enough").

And, then one can argue, "if you are going to redesign the bytecode 
anyways, why not jump over to CIL / MSIL ?...", which could almost work 
except that the structure of the metadata is awkward/inflexible and does 
not map over well to BGBCC (which had used a more JVM-like structure), 
and was more designed for C# and is an awkward fit to C.

Well, also the way they are used is different, where BGBCC mostly uses
RIL as a stand-in for object files and static libraries. And a "lazier
but semi-good" option might be to stick to the existing RIL format, but
have a separate RIL object per translation unit (rather than one for all
of the TUs in the compiler invocation), with a "symbol manifest" (more
like in ".a" libraries), and then maybe pull in objects as needed when
it walks the reachable graph.

With the closest equivalent in MSVC being MSIL inside of COFF objects 
(or, in GCC land, GIMPLE code inside of object files). But, generally, 
there is no real expectation of object-file compatibility between compilers.
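
The "pull in objects as needed" part of the per-TU idea could be sketched
roughly as below. The TuObject layout and the function names are invented
for the example (this is not the RIL format); the "manifest" is just each
object's list of defined symbols:

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8

/* Hypothetical per-TU object: what it defines and what it references.
   Both lists are NULL-terminated (unless all MAX_SYMS slots are used). */
typedef struct {
    const char *name;
    const char *defines[MAX_SYMS];
    const char *needs[MAX_SYMS];
    int loaded;
} TuObject;

/* Manifest lookup: index of the object defining sym, or -1 if none. */
static int find_definer(const TuObject *objs, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].defines[j]; j++)
            if (strcmp(objs[i].defines[j], sym) == 0)
                return (int)i;
    return -1;
}

/* Walk the reachable graph from sym, marking objects as loaded.
   Returns the number of objects newly pulled in. */
static int pull_in(TuObject *objs, size_t n, const char *sym)
{
    int idx = find_definer(objs, n, sym);
    if (idx < 0 || objs[idx].loaded)
        return 0;   /* unresolved here, or already pulled in */
    objs[idx].loaded = 1;
    int count = 1;
    for (size_t j = 0; j < MAX_SYMS && objs[idx].needs[j]; j++)
        count += pull_in(objs, n, objs[idx].needs[j]);
    return count;
}
```

Starting from "main", only the objects that are actually reachable get
marked as loaded; anything else in the library is never touched.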



Though, more tempting to use WAD2 or WAD4 because the "!<arch>" format 
kinda sucks... (Well, and this was also sort of my plan in my stalled 
"new compiler" effort, which had intended to use tweaked WAD2 variants 
as both the object files and library files).

Might make sense to add a second FOURCC for "intention":
   struct Wad2aHeader_s {
     FOURCC magic;    //00: 'WAD2'
     DWORD numLumps;  //04: number of directory entries
     DWORD ofsLumps;  //08: offset of lump directory in image
     DWORD ofsTags;   //0C: *
     DWORD resv0;     //10: Reserved (WAD4: Allocation Bitmap Offset)
     DWORD resv1;     //14: Reserved (WAD4: Allocation Bitmap Size)
     FOURCC intent1;  //18: Intention for image use.
     FOURCC intent2;  //1C: Intention sub-type
   };

*: WAD2: N/A / Zero
    WAD2A: FCC Tags and/or file extensions for lump types.
    WAD4: Offset of directory lookup hash table.
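
For illustration, reading such a header out of a byte buffer might look
something like this. It is a sketch only: it assumes little-endian fields
(as in the Quake WAD files), treats FOURCC and DWORD as 32-bit values, and
the names Wad2aHeader, rd_u32le and wad2a_parse_header are made up here:

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a little-endian FOURCC value. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

typedef struct {
    uint32_t magic;     /* 00: 'WAD2' */
    uint32_t numLumps;  /* 04: number of directory entries */
    uint32_t ofsLumps;  /* 08: offset of lump directory in image */
    uint32_t ofsTags;   /* 0C: tags table (WAD2A), zero in plain WAD2 */
    uint32_t resv0;     /* 10: reserved */
    uint32_t resv1;     /* 14: reserved */
    uint32_t intent1;   /* 18: intention for image use */
    uint32_t intent2;   /* 1C: intention sub-type */
} Wad2aHeader;

/* Read a 32-bit little-endian value, independent of host byte order. */
static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic
   does not match. */
static int wad2a_parse_header(const unsigned char *buf, size_t len,
                              Wad2aHeader *out)
{
    if (len < 32)
        return -1;
    out->magic    = rd_u32le(buf + 0x00);
    out->numLumps = rd_u32le(buf + 0x04);
    out->ofsLumps = rd_u32le(buf + 0x08);
    out->ofsTags  = rd_u32le(buf + 0x0C);
    out->resv0    = rd_u32le(buf + 0x10);
    out->resv1    = rd_u32le(buf + 0x14);
    out->intent1  = rd_u32le(buf + 0x18);
    out->intent2  = rd_u32le(buf + 0x1C);
    return (out->magic == FOURCC('W', 'A', 'D', '2')) ? 0 : -1;
}
```

Reading field by field avoids depending on struct padding or host
endianness; a loader could then dispatch on intent1/intent2 to tell an
object file from a library or a plain resource WAD.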

Normally, WAD2 lumps omit the file extension from the name, and use a
1-byte type tag. The meanings of these tag bytes were hard-coded in Quake
and Half-Life, but for my uses I added a table to map lump types to
FOURCC's or 3-char file extensions.

In contrast, WAD4 has file extensions in the lump names (32 bytes, 
unlike the 16-byte names in WAD2). Also 64 byte directory entries rather 
========== REMAINDER OF ARTICLE TRUNCATED ==========