<vjv5jd$2ds8r$3@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: transpiling to low level C
Date: Wed, 18 Dec 2024 12:51:23 -0600
Organization: A noiseless patient Spider
Lines: 227
Message-ID: <vjv5jd$2ds8r$3@dont-email.me>
References: <vjlh19$8j4k$1@dont-email.me>
 <vjn9g5$n0vl$1@raubtier-asyl.eternal-september.org>
 <vjnhsq$oh1f$1@dont-email.me> <vjnq5s$pubt$1@dont-email.me>
 <vjp2f3$13k4m$2@dont-email.me> <vjr7np$1j57r$2@dont-email.me>
 <vjsdum$1rfp2$1@dont-email.me> <vjsi62$1s5j5$2@dont-email.me>
 <vjsjll$1rlkq$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 18 Dec 2024 19:51:26 +0100 (CET)
Injection-Info: dont-email.me; posting-host="d6170756eb4c94ad5ca5e98ba35d9045";
	logging-data="2552091"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/FsF/FKxy9I+4A/qdSYrV1Fzo1hzOnYuU="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:oLqLpAfe9CR9p2pWxJFWWD0e2R0=
In-Reply-To: <vjsjll$1rlkq$3@dont-email.me>
Content-Language: en-US
Bytes: 8568

On 12/17/2024 1:33 PM, Thiago Adams wrote:
> On 12/17/2024 4:07 PM, BGB wrote:
>> On 12/17/2024 11:55 AM, Thiago Adams wrote:
>>> On 12/17/2024 4:03 AM, BGB wrote:
>>>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>>>> On 15/12/2024 20:53, BGB wrote:
>>>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>>>> C++ is more readable because it is orders of magnitude more
>>>>>>>> expressive than C.
>>>>>>>> You can easily write a C++ statement that would take hundreds of
>>>>>>>> lines in C (imagine specializing an unordered_map by hand).
>>>>>>>> Making a language less expressive makes it even less readable,
>>>>>>>> and that's also true for your reduced C.
>>>>>>>>
>>>>>>>
>>>>>>> That's not really the point of it. This reduced C is used as an 
>>>>>>> intermediate language for a compiler target. It will not usually 
>>>>>>> be read, or maintained.
>>>>>>>
>>>>>>> An intermediate language needs to be at a lower level than the 
>>>>>>> source language.
>>>>>>>
>>>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>>>
>>>>>>> Generating C++ would be quite useless.
>>>>>>>
>>>>>>
>>>>>> As an IL, even C is a little overkill, unless turned into a 
>>>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>>>
>>>>>> Say:
>>>>>>    Only function-scope variables allowed;
>>>>>>    No high-level control structures;
>>>>>>    ...
>>>>>>
>>>>>> Say:
>>>>>>    int foo(int x)
>>>>>>    {
>>>>>>      int i, v;
>>>>>>      for(i=x, v=1; i>0; i--)
>>>>>>        v=v*i;
>>>>>>      return(v);
>>>>>>    }
>>>>>>
>>>>>> Becoming, say:
>>>>>>    int foo(int x)
>>>>>>    {
>>>>>>      int i;
>>>>>>      int v;
>>>>>>      i=x;
>>>>>>      v=1;
>>>>>>      if(i<=0)goto L1;
>>>>>>      L0:
>>>>>>      v=v*i;
>>>>>>      i=i-1;
>>>>>>      if(i>0)goto L0;
>>>>>>      L1:
>>>>>>      return v;
>>>>>>    }
>>>>>>
>>>>>> ...
>>>>>>
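For anyone wanting to check that the two forms quoted above agree, here is a minimal C89 sketch. The names foo_structured and foo_lowered are made up for illustration, and the accumulator is seeded with 1 (my assumption, so that the product is non-zero).

```c
/* Structured form: uses a high-level loop construct. */
int foo_structured(int x)
{
    int i, v;
    for (i = x, v = 1; i > 0; i--)
        v = v * i;
    return v;
}

/* Lowered form: only simple assignments, conditional gotos, and labels.
 * Each statement could plausibly map to a single IR operation. */
int foo_lowered(int x)
{
    int i;
    int v;
    i = x;
    v = 1;
    if (i <= 0) goto L1;   /* skip the loop entirely if it would not run */
L0:
    v = v * i;
    i = i - 1;
    if (i > 0) goto L0;    /* back-edge replaces the for-loop condition */
L1:
    return v;
}
```

Both functions should return the same value for any x small enough that the product does not overflow int.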
>>>>>
>>>>> I have considered removing loops and keeping only goto.
>>>>> But I don't think this brings much simplification.
>>>>>
>>>>
>>>> It depends.
>>>>
>>>> If the compiler works like an actual C compiler, with a full parser 
>>>> and AST stage, yeah, it may not save much.
>>>>
>>>>
>>>> If the parser is a thin wrapper over 3AC operations (only allowing 
>>>> statements that map 1:1 with a 3AC IR operation), it may save a bit 
>>>> more...
>>>>
>>>>
>>>>
>>>> As for whether or not it makes sense to use a C like syntax here, 
>>>> this is more up for debate (for practical use within a compiler, I 
>>>> would assume a binary serialization rather than an ASCII syntax, 
>>>> though ASCII may be better in terms of inter-operation or human 
>>>> readability).
>>>>
>>>>
>>>> But, as can be noted, I would assume a binary serialization that is 
>>>> oriented around operators; and *not* about serializing the 
>>>> structures used to implement those operators. Also I would assume 
>>>> that the IR need not be in SSA form (conversion to full SSA could be 
>>>> done when reading in the IR operations).
>>>>
>>>>
>>>> My argument is that not using SSA form means fewer issues for both 
>>>> the serialization format and compiler front-end to need to deal with 
>>>> (and is comparably easy to regenerate for the backend, with the 
>>>> backend operating with its internal IR in SSA form).
>>>>
>>>> Well, contrast this with LLVM, which assumes everything is always in
>>>> SSA form.
>>>>
>>>> ...
>>>>
>>>>
>>>
>>> I also have considered split expressions.
>>>
>>> For instance
>>>
>>> if (a*b+c) {}
>>>
>>> into
>>>
>>> register int r1 = a * b;
>>> register int r2 = r1 + c;
>>> if (r2) {}
>>>
>>> This would make it easier to add overflow checks at runtime (if
>>> desired) and to implement things like _Complex.
>>>
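As a rough sketch of why splitting helps with overflow checks: once every operation has its own temporary, a check can be attached to each step separately. The helper names (checked_mul, checked_add, cond_checked) and the -1 error convention are assumptions for illustration, not anything from an existing compiler. The range tests are the usual portable pre-condition checks, so this should work with a plain C89 compiler.

```c
#include <limits.h>

/* Stores a*b in *out and returns 1 if the product fits in int, else 0. */
int checked_mul(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) { if (a > INT_MAX / b) return 0; }
        else       { if (b < INT_MIN / a) return 0; }
    } else {
        if (b > 0) { if (a < INT_MIN / b) return 0; }
        else       { if (a != 0 && b < INT_MAX / a) return 0; }
    }
    *out = a * b;
    return 1;
}

/* Stores a+b in *out and returns 1 if the sum fits in int, else 0. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}

/* Split form of the condition (a*b+c): one temporary per operation.
 * Returns 1 or 0 for the truth value, or -1 if either step overflows
 * (the -1 convention is hypothetical). */
int cond_checked(int a, int b, int c)
{
    int r1, r2;
    if (!checked_mul(a, b, &r1)) return -1;
    if (!checked_add(r1, c, &r2)) return -1;
    return r2 != 0;
}
```

With the unsplit expression there is no obvious place to hang the per-operation check; with r1 and r2 named, each check sits next to the one operation it guards.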
>>> Is this what you mean by 3AC or SSA?
>>>
>>
>> 3AC means that the IR expresses 3 (or sometimes more) operands per IR op.
>>
>> So:
>>    MUL r1, a, b
>> Rather than, say, a stack machine:
>>    LOAD a
>>    LOAD b
>>    MUL
>>    STORE r1
>>
>>
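To make the contrast concrete, here is a small sketch of the two encodings as C data structures with toy interpreters. Everything here (struct tac, struct sop, the opcode names, the slot-index operand scheme) is hypothetical and only meant to show the shape of each format; there are no bounds checks.

```c
/* Three-address form: one op carries its destination and both sources.
 * Operands are indices into a flat array of variable slots. */
enum { OP3_MUL, OP3_ADD };
struct tac { int op, dst, src1, src2; };

void run_tac(const struct tac *code, int n, int *slots)
{
    int i;
    for (i = 0; i < n; i++) {
        int a = slots[code[i].src1];
        int b = slots[code[i].src2];
        slots[code[i].dst] = (code[i].op == OP3_MUL) ? a * b : a + b;
    }
}

/* Stack form: each op has at most one operand; values flow through
 * an implicit evaluation stack. */
enum { ST_LOAD, ST_STORE, ST_MUL, ST_ADD };
struct sop { int op, slot; };

void run_stack(const struct sop *code, int n, int *slots)
{
    int stack[16], sp = 0, i;
    for (i = 0; i < n; i++) {
        switch (code[i].op) {
        case ST_LOAD:  stack[sp++] = slots[code[i].slot]; break;
        case ST_STORE: slots[code[i].slot] = stack[--sp]; break;
        case ST_MUL:   sp--; stack[sp - 1] = stack[sp - 1] * stack[sp]; break;
        case ST_ADD:   sp--; stack[sp - 1] = stack[sp - 1] + stack[sp]; break;
        }
    }
}
```

With slots laid out as {a, b, r1}, the single op {OP3_MUL, 2, 0, 1} does the same work as the four-op sequence LOAD 0, LOAD 1, MUL, STORE 2.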
>> SSA:
>>    Static Single Assignment
>>
> 
> Oh, sorry... I know what SSA is.
> 
>> Generally:
>> Every variable may only be assigned once (more like in a functional 
>> programming language);
>> Generally, variables are "merged" in the control-flow via PHI 
>> operators (which variable merges in depending on which path control 
>> came from).
>>
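A rough illustration of what that looks like, emulated in plain C: each name below has exactly one assignment in the program text, and the PHI merges are spelled out by tracking which block control came from. The function name foo_ssa, the version suffixes, and the pred variable are all made up for the sketch (pred is bookkeeping and is the one thing reassigned); the accumulator is seeded with 1 as an assumption.

```c
int foo_ssa(int x)
{
    int pred;             /* predecessor block: 0 = entry, 1 = loop body */
    int i0, v0;           /* versions defined in the entry block */
    int i1, v1;           /* versions defined by PHIs at the loop head */
    int i2 = 0, v2 = 0;   /* versions defined in the loop body
                             (initializers only silence warnings) */
    int v3;               /* version defined by the PHI at the exit */

    /* entry block */
    i0 = x;
    v0 = 1;
    pred = 0;
    if (i0 <= 0) goto done;
head:
    i1 = (pred == 0) ? i0 : i2;   /* i1 = PHI(entry: i0, body: i2) */
    v1 = (pred == 0) ? v0 : v2;   /* v1 = PHI(entry: v0, body: v2) */
    v2 = v1 * i1;
    i2 = i1 - 1;
    pred = 1;
    if (i2 > 0) goto head;
done:
    v3 = (pred == 0) ? v0 : v2;   /* v3 = PHI(entry: v0, body: v2) */
    return v3;
}
```

Note how a two-variable loop turns into seven names plus three merges, which is part of why it can be simpler to leave locals mutable in the IL and let the backend rebuild this form.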
> 
> I do a similar merge in my flow analysis, but without the concept of SSA.
> 
>> IMHO, while SSA is preferable for backend analysis, optimization, and 
>> code generation; it is undesirable pretty much everywhere else as it 
>> adds too much complexity.
>>
>> Better IMO for the frontend compiler and main IL stage to assume that 
>> local variables are freely mutable.
>>
>> Typically, global variables are excluded from SSA in most variants, and
>> remain fully mutable; but may be handled as designated LOAD/STORE
>> operations.
>>
>>
>> In BGBCC though, full SSA only applies to temporaries. Normal local 
>> variables are merely flagged by "version", and all versions of the 
>> same local variable implicitly merge back together at each branch/label.
>>
> 
> Sorry, what is BGBCC? (A C compiler?)
> 

It is my C compiler.

========== REMAINDER OF ARTICLE TRUNCATED ==========