Path: ...!weretis.net!feeder9.news.weretis.net!news.quux.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Thiago Adams <thiago.adams@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: transpiling to low level C
Date: Tue, 17 Dec 2024 16:33:09 -0300
Organization: A noiseless patient Spider
Lines: 187
Message-ID: <vjsjll$1rlkq$3@dont-email.me>
References: <vjlh19$8j4k$1@dont-email.me>
 <vjn9g5$n0vl$1@raubtier-asyl.eternal-september.org>
 <vjnhsq$oh1f$1@dont-email.me> <vjnq5s$pubt$1@dont-email.me>
 <vjp2f3$13k4m$2@dont-email.me> <vjr7np$1j57r$2@dont-email.me>
 <vjsdum$1rfp2$1@dont-email.me> <vjsi62$1s5j5$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 Dec 2024 20:33:10 +0100 (CET)
Injection-Info: dont-email.me; posting-host="1f012199d928ca914dffdfea9ee32a88";
	logging-data="1955482"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/7Yvy5hDQUC6gllMHMlbV9S4Jr4BlsEww="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:37lSvedm3jk55EcTgVyp72/Ebwk=
Content-Language: en-GB
In-Reply-To: <vjsi62$1s5j5$2@dont-email.me>
Bytes: 7192

On 12/17/2024 4:07 PM, BGB wrote:
> On 12/17/2024 11:55 AM, Thiago Adams wrote:
>> On 12/17/2024 4:03 AM, BGB wrote:
>>> On 12/16/2024 5:21 AM, Thiago Adams wrote:
>>>> On 15/12/2024 20:53, BGB wrote:
>>>>> On 12/15/2024 3:32 PM, bart wrote:
>>>>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>>>>> C++ is more readable because it is orders of magnitude more
>>>>>>> expressive than C.
>>>>>>> You can easily write a C++ statement that would take hundreds of
>>>>>>> lines in C (imagine specializing an unordered_map by hand). Making
>>>>>>> a language less expressive makes it even less readable, and that's
>>>>>>> also true for your reduced C.
>>>>>>>
>>>>>>
>>>>>> That's not really the point of it. This reduced C is used as an 
>>>>>> intermediate language for a compiler target. It will not usually 
>>>>>> be read, or maintained.
>>>>>>
>>>>>> An intermediate language needs to be at a lower level than the 
>>>>>> source language.
>>>>>>
>>>>>> And for this project, it needs to be compilable by any C89 compiler.
>>>>>>
>>>>>> Generating C++ would be quite useless.
>>>>>>
>>>>>
>>>>> As an IL, even C is a little overkill, unless turned into a 
>>>>> restricted subset (say, along similar lines to GCC's GIMPLE).
>>>>>
>>>>> Say:
>>>>>    Only function-scope variables allowed;
>>>>>    No high-level control structures;
>>>>>    ...
>>>>>
>>>>> Say:
>>>>>    int foo(int x)
>>>>>    {
>>>>>      int i, v;
>>>>>      for(i=x, v=0; i>0; i--)
>>>>>        v=v*i;
>>>>>      return(v);
>>>>>    }
>>>>>
>>>>> Becoming, say:
>>>>>    int foo(int x)
>>>>>    {
>>>>>      int i;
>>>>>      int v;
>>>>>      i=x;
>>>>>      v=0;
>>>>>      if(i<=0)goto L1;
>>>>>      L0:
>>>>>      v=v*i;
>>>>>      i=i-1;
>>>>>      if(i>0)goto L0;
>>>>>      L1:
>>>>>      return v;
>>>>>    }
>>>>>
>>>>> ...
>>>>>
>>>>
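As a quick sanity check of the lowering quoted above, here is a minimal sketch that places the structured loop next to its goto-only form. The function names are hypothetical, and v starts at 1 instead of 0 (an assumption, so that the product is not always zero); otherwise the statements follow the quoted example.

```c
/* Structured form: variant of the quoted foo(), with v starting at 1
   (assumption, so the result is a non-trivial product). */
int foo_structured(int x)
{
    int i, v;
    for (i = x, v = 1; i > 0; i--)
        v = v * i;
    return v;
}

/* Lowered form: only declarations, simple assignments,
   conditional gotos and labels. */
int foo_lowered(int x)
{
    int i;
    int v;
    i = x;
    v = 1;
    if (i <= 0) goto L1;   /* loop guard, tested once before entry */
L0:
    v = v * i;             /* loop body */
    i = i - 1;             /* loop step */
    if (i > 0) goto L0;    /* back edge */
L1:
    return v;
}
```

Both functions should return the same value for any x whose product fits in an int.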
>>>> I have considered removing loops and keeping only goto.
>>>> But I think this would not bring much simplification.
>>>>
>>>
>>> It depends.
>>>
>>> If the compiler works like an actual C compiler, with a full parser 
>>> and AST stage, yeah, it may not save much.
>>>
>>>
>>> If the parser is a thin wrapper over 3AC operations (only allowing 
>>> statements that map 1:1 with a 3AC IR operation), it may save a bit 
>>> more...
>>>
>>>
>>>
>>> As for whether or not it makes sense to use a C like syntax here, 
>>> this is more up for debate (for practical use within a compiler, I 
>>> would assume a binary serialization rather than an ASCII syntax, 
>>> though ASCII may be better in terms of inter-operation or human 
>>> readability).
>>>
>>>
>>> But, as can be noted, I would assume a binary serialization that is 
>>> oriented around operators; and *not* about serializing the structures 
>>> used to implement those operators. Also I would assume that the IR 
>>> need not be in SSA form (conversion to full SSA could be done when 
>>> reading in the IR operations).
>>>
>>>
>>> My argument is that not using SSA form means fewer issues for both 
>>> the serialization format and the compiler front-end to deal with 
>>> (and SSA form is comparatively easy to regenerate in the backend, 
>>> with the backend operating on its internal IR in SSA form).
>>>
>>> Well, contrast this with LLVM assuming everything is always in SSA form.
>>>
>>> ...
>>>
>>>
>>
>> I have also considered splitting expressions.
>>
>> For instance
>>
>> if (a*b+c) {}
>>
>> into
>>
>> register int r1 = a * b;
>> register int r2 = r1 + c;
>> if (r2) {}
>>
>> This would make it easier to add overflow checks at runtime (if 
>> desired) and to implement things like _Complex.
>>
>> Is this what you mean by 3AC or SSA?
>>
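To illustrate the overflow-check idea, here is a rough sketch of how the split form of `if (a*b+c)` could carry a check after each single operation. The helpers checked_mul and checked_add and the -1 error convention are hypothetical, not something the generator is known to emit; the range tests use only <limits.h>.

```c
#include <limits.h>

/* Hypothetical helper: stores a*b in *out and returns 1 if the product
   fits in an int, otherwise returns 0 and leaves *out untouched. */
static int checked_mul(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) { if (a > INT_MAX / b) return 0; }
        else       { if (b < INT_MIN / a) return 0; }
    } else {
        if (b > 0) { if (a < INT_MIN / b) return 0; }
        else       { if (a != 0 && b < INT_MAX / a) return 0; }
    }
    *out = a * b;
    return 1;
}

/* Hypothetical helper: same contract as checked_mul, for a+b. */
static int checked_add(int a, int b, int *out)
{
    if (b > 0 && a > INT_MAX - b) return 0;
    if (b < 0 && a < INT_MIN - b) return 0;
    *out = a + b;
    return 1;
}

/* Condition of `if (a*b+c)`, one operation per statement.
   Returns 1 or 0 for the truth value, or -1 if a step overflowed. */
int cond_checked(int a, int b, int c)
{
    int r1;
    int r2;
    if (!checked_mul(a, b, &r1)) return -1;
    if (!checked_add(r1, c, &r2)) return -1;
    return r2 != 0;
}
```

Because every temporary holds the result of exactly one operation, the check can be attached mechanically after each statement.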
> 
> 3AC (three-address code) means that the IR expresses 3 (or sometimes 
> more) operands per IR op.
> 
> So:
>    MUL r1, a, b
> Rather than, say, a stack machine:
>    LOAD a
>    LOAD b
>    MUL
>    STORE r1
> 
> 
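The two encodings quoted above can be modelled in C as follows. This is only a sketch: the register indices and function names are made up for illustration, and a real IR would of course be data, not hand-written functions.

```c
#include <assert.h>

/* Hypothetical register/variable slots for the example. */
enum { R_A, R_B, R_R1, NREGS };

/* Three-address form: a single op names its destination and both sources. */
void mul_3ac(int *regs)
{
    regs[R_R1] = regs[R_A] * regs[R_B];   /* MUL r1, a, b */
}

/* Stack form: operands are implicit and travel through an evaluation stack. */
void mul_stack(int *regs)
{
    int stack[2];
    int sp = 0;

    stack[sp++] = regs[R_A];                       /* LOAD a   */
    stack[sp++] = regs[R_B];                       /* LOAD b   */
    sp--;
    stack[sp - 1] = stack[sp - 1] * stack[sp];     /* MUL      */
    regs[R_R1] = stack[--sp];                      /* STORE r1 */

    assert(sp == 0);   /* the stack is balanced after the statement */
}
```

The stack form needs four ops and an implicit stack where the three-address form needs one op.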
> SSA:
>    Static Single Assignment
> 

Oh, sorry... I already knew what SSA is.

> Generally:
> Every variable may only be assigned once (more like in a functional 
> programming language);
> Generally, variables are "merged" in the control-flow via PHI operators 
> (which variable merges in depending on which path control came from).
> 
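A small sketch of the single-assignment idea, written in plain C. The function names are hypothetical, and the PHI is only modelled with a conditional expression, since C has no PHI operator.

```c
/* Freely mutable form: y is assigned on two different paths. */
int clamp_mutable(int x)
{
    int y = x;
    if (y < 0)
        y = 0;
    return y;
}

/* SSA-style form: every name is defined exactly once, and the value
   that reaches the return is chosen by which path control took. */
int clamp_ssa(int x)
{
    const int y0 = x;                 /* definition on entry           */
    const int taken = (y0 < 0);       /* which path control came from  */
    const int y1 = 0;                 /* definition on the "if" path   */
    const int y2 = taken ? y1 : y0;   /* y2 = PHI(y1, y0)              */
    return y2;
}
```

Both functions compute the same result; the second just makes each definition and the merge point explicit.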

I do a similar merge in my flow analysis, but without the concept of SSA.

> IMHO, while SSA is preferable for backend analysis, optimization, and 
> code generation; it is undesirable pretty much everywhere else as it 
> adds too much complexity.
> 
> Better IMO for the frontend compiler and main IL stage to assume that 
> local variables are freely mutable.
> 
> Typically, global variables are excluded in most variants, and remain 
> fully mutable; but may be handled as designated LOAD/STORE operations.
> 
> 
> In BGBCC though, full SSA only applies to temporaries. Normal local 
> variables are merely flagged by "version", and all versions of the same 
> local variable implicitly merge back together at each branch/label.
> 

Sorry, what is BGBCC? (A C compiler?)

> This allows some similar advantages (for analysis and optimization) 
> while limiting some of the complexities. Though, this differs from 
> temporaries which are assumed to essentially fully disappear once they 
> go outside of the span in which they exist (albeit with an awkward case 
> to deal with temporaries that cross basic-block boundaries, which need 
> to actually "exist" in some semi-concrete form, more like local variables).
> 
> Note that unless the address is taken of a local variable, it need not 
========== REMAINDER OF ARTICLE TRUNCATED ==========