Article <vhic66$1thk0$1@dont-email.me>

Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!news.quux.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: else ladders practice
Date: Tue, 19 Nov 2024 15:51:33 +0000
Organization: A noiseless patient Spider
Lines: 246
Message-ID: <vhic66$1thk0$1@dont-email.me>
References: <3deb64c5b0ee344acd9fbaea1002baf7302c1e8f@i2pn2.org>
 <vg2ttn$3a4lk$1@dont-email.me> <vg33gs$3b8n5$1@dont-email.me>
 <vg358c$3bk7t$1@dont-email.me> <vg37nr$3bo0c$1@dont-email.me>
 <vg3b98$3cc8q$1@dont-email.me> <vg5351$3pada$1@dont-email.me>
 <vg62vg$3uv02$1@dont-email.me> <vgd3ro$2pvl4$1@paganini.bofh.team>
 <vgdc4q$1ikja$1@dont-email.me> <vgdt36$2r682$2@paganini.bofh.team>
 <vge8un$1o57r$3@dont-email.me> <vgpi5h$6s5t$1@paganini.bofh.team>
 <vgtsli$1690f$1@dont-email.me> <vhgr1v$2ovnd$1@paganini.bofh.team>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 19 Nov 2024 16:51:35 +0100 (CET)
Injection-Info: dont-email.me; posting-host="28ec3c8b0d9b34ddd41bdf01011f3baf";
	logging-data="2016896"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/zqJa2+ER3CpIJj791StCH"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:1OYqlywOGnpIbtg5thzYfd3bxa8=
Content-Language: en-GB
In-Reply-To: <vhgr1v$2ovnd$1@paganini.bofh.team>
Bytes: 12209

On 19/11/2024 01:53, Waldek Hebisch wrote:
> Bart <bc@freeuk.com> wrote:
>> On 10/11/2024 06:00, Waldek Hebisch wrote:
>>> Bart <bc@freeuk.com> wrote:
>>
>>>> I would consider a much more elaborate one, putting the onus on
>>>> external tools and still having an unpredictable result, to be the
>>>> poorer of the two.
>>>>
>>>> You want to create a language that is easily compilable, no matter how
>>>> complex the input.
>>>
>>> Normally time spent _using_ the compiler should be bigger than time
>>> spent writing the compiler.  If the compiler gets enough use, it
>>> justifies some complexity.
>>
>> That doesn't add up: the more the compiler gets used, the slower it
>> should get?!
> 
> More complicated does not mean slower.  Binary search or hash tables
> are more complicated than linear search, but for larger data may
> be much faster.

That's not the complexity I had in mind. The 100-200MB sizes of 
LLVM-based compilers are not because they use hash-tables over linear 
search.

> More generally, I want to minimize time spent by the programmer,
> that is _sum over all iterations leading to correct program_ of
> compile time and "think time".  Compiler that compiles slower,
> but allows fewer iterations due to better diagnostics may win.
> Also, humans perceive 0.1s delay almost like no delay at all.
> So it does not matter if single compilation step is 0.1s or
> 0.1ms.  Modern computers can do a lot of work in 0.1s.

What's the context of this 0.1 seconds? Do you consider it long or short?

My tools can generally build my apps from scratch in 0.1 seconds; big 
compilers tend to take a lot longer. Only Tiny C is in that ballpark.

So I'm failing to see your point here. Maybe you picked up that 0.1 
seconds from an earlier post of mine and are suggesting I ought to be 
able to do a lot more analysis within that time?

> Yes.  This may lead to some complexity.  Simple approach is to
> avoid obviously useless recompilation ('make' is doing this).
> More complicated approach may keep some intermediate data and
> try to "validate" them first.  If previous analysis is valid,
> then it can be reused.  If something significant changes, then
> it needs to be re-done.  But many changes only have very local
> effect, so at least theoretically re-using analyses could
> save substantial time.
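
For illustration, the simple approach mentioned above (the timestamp
check that 'make' relies on) can be sketched in C roughly as follows.
This is only a minimal sketch: the names is_stale and needs_rebuild are
hypothetical, it assumes a POSIX system providing stat(), and it handles
one source and one output file rather than a full dependency graph.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Pure decision: rebuild if the output is missing or is older than
   the source. Equal timestamps count as up to date. */
bool is_stale(bool out_exists, long src_mtime, long out_mtime)
{
    return !out_exists || out_mtime < src_mtime;
}

/* Wrapper around POSIX stat(). A missing source is reported as
   "nothing to rebuild" here; a real tool would report an error. */
bool needs_rebuild(const char *src, const char *out)
{
    struct stat s, o;

    if (stat(src, &s) != 0)
        return false;

    bool out_exists = (stat(out, &o) == 0);
    long out_mtime = out_exists ? (long)o.st_mtime : 0;

    return is_stale(out_exists, (long)s.st_mtime, out_mtime);
}
```

The more complicated approach, validating and reusing stored analyses,
would need much more state than two modification times.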

I consider compilation (turning textual source code into a form that can 
be run, typically binary native code) to be a completely routine task 
that should be as simple and as quick as flicking a light switch.

Anything else that might be a deep analysis of that program I consider 
to be quite a different task. I'm not saying there is no place for it, 
but I don't agree it should be integrated into every compiler and always 
invoked.

>> Since now that last statement is the '0' value (any int value will do).
>> What should my compiler report instead? What analysis should it be
>> doing? What would that save me from typing?
> 
> Currently in typed language that I use literal translation of
> the example hits a hole in checks, that is the code is accepted.
> 
> Concerning needed analyses: one thing needed is representation of
> type, either Pascal range type or enumeration type (the example
> is _very_ unnatural because in modern programming magic numbers
> are avoided and there would be some symbolic representation
> adding meaning to the numbers).  Second, compiler must recognize
> that this is a "multiway switch" and collect conditions.

The example came from C. Even if written as a switch, C switches do not 
return values (and also are hard to even analyse as to which branch is 
which).

In my languages, switches can return values, and a switch written as the 
last statement of a function is considered to do so, even if each branch 
uses an explicit 'return'. Then, it will consider a missing ELSE a 'hole'.

It will not do any analysis of the range other than what is necessary to 
implement switch (duplicate values, span of values, range-checking when 
using jump tables).
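
As a rough C sketch of the span and range-checking part: the table below
stands in for a jump table by holding result values rather than code
addresses, and the names LO, HI, table and dispatch, along with the case
values, are made up for the example. It is not how any particular
compiler lays this out.

```c
enum { LO = 1, HI = 4 };    /* span of case values: HI - LO + 1 slots */

/* Indexed by (x - LO). The gap at case value 3 holds the default. */
static const int table[HI - LO + 1] = { 10, 20, -1, 40 };

int dispatch(int x)
{
    /* Range check: anything outside LO..HI takes the default branch. */
    if (x < LO || x > HI)
        return -1;

    return table[x - LO];
}
```

Nothing here asks whether -1 can ever be reached by the callers; the
check exists only so that the table index is valid.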

So the language may require you to supply a dummy 'else x' or 'return 
x'; so what?

The alternative appears to be one of:

* Instead of 'else' or 'return', to write 'unreachable', which puts some
   trust, not in the programmer, but in some person calling your function
   who does not have sight of the source code, to avoid calling it with
   invalid arguments

* Or relying on the variable capabilities of a compiler 'A', which might
   sometimes be able to determine that some point is not reached, but
   sometimes it can't. But when you use compiler 'B', it might have a
   different result.

I'll stick with my scheme, thanks!
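
To make the choice concrete, here is a minimal C sketch of the
dummy-return variant. The function name ladder and the values are
hypothetical; assume only n = 1, 2, 3 are meaningful to the caller.

```c
int ladder(int n)
{
    if (n == 1)
        return 10;
    else if (n == 2)
        return 20;
    else if (n == 3)
        return 30;
    else
        return 0;   /* dummy value: any int will do, it closes the hole */
}
```

Replacing the final 'return 0' with C23's unreachable() from <stddef.h>
(or __builtin_unreachable() in GCC and Clang) would correspond to the
first alternative: a call with any other n then has undefined behaviour
instead of a defined, if meaningless, result.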

>  Once
> you have such representation (which may be desirable for other
> reasons) it is easy to determine set of handled values.  More
> precisely, in this example we just have small number of discrete
> values.  More ambitious compiler may have list of ranges.
> If type also specifies list of values or list of ranges, then
> it is easy to check if all values of the type are handled.

The types are typically plain integers, with ranges from 2**8 to 2**64. 
The ranges associated with application needs will be more arbitrary.

If talking about a language with ranged integer types, then there might 
be more point to it, but that is itself a can of worms. (It's hard to do 
without getting halfway to implementing Ada.)


>> You can't do this stuff with the compilers David Brown uses; I'm
>> guessing you can't do it with your preferred ones either.
> 
> To recompile the typed system I use (about 0.4M lines) on new fast
> machine I need about 53s.  But that is kind of cheating:
> - this time is for parallel build using 20 logical cores
> - the compiler is not in the language it compiles (but in untyped
>    version of it)
> - actual compilation of the compiler is small part of total
>    compile time
> On slow machine compile time can be as large as 40 minutes.

40 minutes for 400K lines? That's about 167 lines per second; how old is this 
machine? Is the compiler written in Python?


> An untyped system that I use has about 0.5M lines and recompiles
> itself in 16s on the same machine.  This one uses single core.
> On slow machine compile time may be closer to 2 minutes.

So roughly 4K to 30K lines per second.
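
The throughput figures are just line count divided by elapsed seconds.
A small C helper (the name lines_per_second is made up here) makes the
arithmetic easy to check:

```c
/* Lines compiled per second; integer division, so it rounds down. */
long lines_per_second(long lines, long seconds)
{
    return lines / seconds;
}
```

With the numbers quoted above, lines_per_second(500000, 16) gives 31250
and lines_per_second(500000, 120) gives 4166, while the 0.4M-line system
at 40 minutes, lines_per_second(400000, 40 * 60), gives 166.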

> Again, compiler compile time is only a part of build time.
> Actually, one time-intensive part is creating index for included
> documentation.

Which is not going to be part of a routine build.

>  Another is C compilation for a library file
> (system has image-processing functions and low-level part of
> image processing is done in C).  Recompilation starts from
> minimal version of the system, rebuilding this minimal
> version takes 3.3s.

My language tools work on a whole program, where a 'program' is a single 
EXE or DLL file (or a single OBJ file in some cases).

A 'build' then turns N source files into 1 binary file. This is the task 
I am talking about.

A complete application may have several such binaries and a bunch of 
other stuff. Maybe some source code is generated by a script. This part 
is open-ended.

However each of my current projects is a single, self-contained binary 
by design.

> Anyway, I do not need cascaded recompilation that you present.
> Both systems above have incremental compilation, the second one
> at statement/function level: it offers interactive prompt
> which takes a statement from the user, compiles it and immediately
> executes.  Such statement may define a function or perform compilation.
========== REMAINDER OF ARTICLE TRUNCATED ==========