Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: else ladders practice
Date: Sat, 7 Dec 2024 12:40:57 +0000
Organization: A noiseless patient Spider
Lines: 195
Message-ID: <vj1foo$335q1$1@dont-email.me>
References: <3deb64c5b0ee344acd9fbaea1002baf7302c1e8f@i2pn2.org>
 <vhic66$1thk0$1@dont-email.me> <vhins8$1vuvp$1@dont-email.me>
 <vhj7nc$2svjh$1@paganini.bofh.team> <vhje8l$2412p$1@dont-email.me>
 <86y117qhc8.fsf@linuxsc.com> <vi2m3o$2vspa$1@dont-email.me>
 <86cyiiqit8.fsf@linuxsc.com> <vi4iji$3f7a3$1@dont-email.me>
 <86mshkos1a.fsf@linuxsc.com> <20241128143715.00003565@yahoo.com>
 <via21q$ib2v$2@dont-email.me> <vihmss$29jun$2@paganini.bofh.team>
 <vihuev$2isep$1@dont-email.me> <vj01eu$3u86n$1@paganini.bofh.team>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 07 Dec 2024 13:40:57 +0100 (CET)
Injection-Info: dont-email.me; posting-host="92f4e16e4a21192eb5e2d0ba1a6ec304";
	logging-data="3249985"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+iyeD5vY3AxZ1HZG9GFiXq"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:k2TW0BZF2pGmCxbmgJsIa00ZEbE=
In-Reply-To: <vj01eu$3u86n$1@paganini.bofh.team>
Content-Language: en-GB
Bytes: 8538

On 06/12/2024 23:30, Waldek Hebisch wrote:
> Bart <bc@freeuk.com> wrote:

>> (For example, I got tcc.c working at one point. My generated tcc.exe
>> could compile tcc.c, but that second-generation tcc.c didn't work.)
> 
> Clear, you work in stages: first you find out what is wrong with
> second-generation tcc.exe.

Ha, ha, ha!

While C /can/ be written reasonably clearly, the tcc sources are more 
typical: very dense, a mix of lower and upper case everywhere, and an 
apparent over-use of macros, e.g.:

     for_each_elem(symtab_section, 1, sym, ElfW(Sym)) {
         if (sym->st_shndx == SHN_UNDEF) {
             name = (char *) symtab_section->link->data + sym->st_name;
             sym_index = find_elf_sym(s1->dynsymtab_section, name);

If I were looking to develop this product, then it might be worth 
spending days or weeks learning how it all works. But it's not worth 
mastering this codebase inside out just to discover that I wrote 0 
instead of 1 somewhere in my compiler.

I need whatever error it is to manifest itself in a simpler way. Or to 
have two versions (e.g. one interpreted, the other native code) that 
give different results. The problem with this app is that those 
different results appear too far down the line; I don't want to trace a 
billion instructions first.

So, when I get back to it, I'll test other open source C code. (The 
annoying thing though is that either it won't compile for reasons I've 
lost interest in, or it works completely fine.)

>> In
>> my interpreter, it grows downwards!)
> 
> You probably meant upwards?

Yes.

>  And handling such things is natural
> when you have portablity in mind, either you parametrise stdarg.h
> so that it works for both stack directions, or you make sure that
> interpreter and compiler use the same direction (the later seem to
> be much easier).

This is quite a tricky one, actually. There is currently conditional 
code in my stdarg.h that detects whether the compiler has set a flag 
saying the result will be interpreted. But the compiler doesn't always 
know that (a sketch of the idea is below).

For example, the compiler might be told to do -E (preprocess) and the 
result compiled later. The stack direction is baked into the output.

Or it will do -p (generate discrete IL), where it doesn't know whether 
that will be interpreted.

But this is not a serious issue; the interpreted option is for either 
debugging or novelty uses.
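
To make the idea concrete, here is a minimal sketch of the kind of 
conditional va_arg definition involved. It is not the actual stdarg.h 
being discussed; __INTERPRETED__ and the choice of which build steps 
the argument pointer in which direction are purely illustrative, and 
alignment and argument promotion are ignored:

    typedef char *va_list;

    #ifdef __INTERPRETED__
        /* interpreter build: assume successive arguments sit at
           increasing addresses, so step the pointer upwards */
        #define va_arg(ap, T)  (*(T *)(((ap) += sizeof(T)) - sizeof(T)))
    #else
        /* native build: assume successive arguments sit at
           decreasing addresses, so step the pointer downwards */
        #define va_arg(ap, T)  (*(T *)((ap) -= sizeof(T)))
    #endif

The only thing that differs between the two builds is the direction the 
pointer moves, which is exactly what has to be settled before any 
preprocessed or IL output is produced.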


>  Actually, I think that most natural way is to
> have data structure layout in the interpreter to be as close as
> possible to compiler data layout.

I don't want my hand forced in this. The point of interpreting is to be 
independent of hardware. A downward growing stack is unnatural.

>> They'd have to use it from the start. But then they may want to use
>> libraries which only work with gcc ...
>   
> Well, you see that there are reasons to use 'gcc'.

Self-perpetuating ones, which are the wrong reasons.


> Next version was cross-compiled on Linux using gcc.  This version
> used inline assembly for rounding and was significantly faster
> than what Borland C produced.  Note: images to process were
> largish (think of say 12000 by 20000 pixels) and speed was
> important factor.  So using 'gcc' specific code was IMO justified
> (this code was used conditionally, other compilers would get
> slow portable version using 'floor').

I have a little image editor written entirely in interpreted code. (It 
was supposed to be a mixed-language project, but that's some way off.)

However, it is just about usable. E.g. inverting the colours (negative 
to positive etc.) of a 6 Mpix colour image takes 1/8th of a second. 
Splitting it into separate 8-bit R, G, B planes takes half a second. 
This is with bytecode working on one pixel at a time.

It uses no optimised code in the interpreter. Only a mildly accelerated 
dispatcher.
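
To give an idea of the per-pixel work being timed, this is roughly what 
those two operations do, written out in C purely for illustration (the 
editor itself is interpreted code, not C, and these function names are 
invented):

    #include <stddef.h>

    /* invert: each 8-bit channel value v becomes 255 - v */
    void invert(unsigned char *pixels, size_t nbytes)
    {
        for (size_t i = 0; i < nbytes; ++i)
            pixels[i] = 255 - pixels[i];
    }

    /* split interleaved RGB data into three separate 8-bit planes */
    void split_rgb(const unsigned char *rgb, size_t npixels,
                   unsigned char *r, unsigned char *g, unsigned char *b)
    {
        for (size_t i = 0; i < npixels; ++i) {
            r[i] = rgb[3*i + 0];
            g[i] = rgb[3*i + 1];
            b[i] = rgb[3*i + 2];
        }
    }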

>   
>>> You need to improve your propaganda for faster C compilers...
>>
>> I actually don't know why I care. I get the benefit of my fast tools
>> every day; they're a joy to use. So I'm not bothered that other people
>> are that tolerant of slow, cumbersome build systems.
>>
>> But then, people in this group do like to belittle small, fast products
>> (tcc for example as well as my stuff), and that's where it gets annoying.
> 
> I tried tcc compiling TeX.  Long ago it did not work due to limitations
> of tcc.  This time it worked.  Small comparison on main file (19062
> lines):
> 
> Command           time              size code    size data
> tcc -g            0.017              290521        1188
> tcc               0.015              290521        1188
> gcc -O0 -g        0.440              248467          14
> gcc -O0           0.413              248467          14

That demonstrates tcc translating C code at over 1 million lines per 
second, and generating binary code at 17MB per second. You're not 
impressed by that?
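
For anyone checking the arithmetic, those rates come from the tcc -g 
row above:

    19062 lines  / 0.017 s  ~  1.1 million lines per second
    290521 bytes / 0.017 s  ~  17 MB per second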

Here are a couple of reasonably substantial one-file programs that can 
be run, both interpreters:

https://github.com/sal55/langs/blob/master/lua.c

This is a one-file Lua interpreter, which I modified to take input from 
a file. (For the original, see the comment at the start.)

On my machine, these are typical results:

               compile time    size   runtime
   gcc -s -O3  14    secs     378KB   3.0 secs
   gcc -s -O0   3.3  secs     372KB  10.0 secs
   tcc          0.12 secs     384KB   8.5 secs
   cc           0.14 secs     315KB   8.3 secs

The runtime refers to running this Fibonacci test (fib.lua):

  function fibonacci(n)
      if n<3 then
          return 1
      else
          return fibonacci(n-1) + fibonacci(n-2)
      end
  end

  for n = 1, 36 do
      f=fibonacci(n)
      io.write(n," ",f, "\n")
  end
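
For reference, building and running that test on Linux would look 
something like this, assuming this particular lua.c needs nothing 
beyond the maths library and takes the script name as its argument:

   gcc -s -O3 lua.c -o lua -lm
   ./lua fib.lua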

This one is a version of my interpreter, minus the ASM acceleration, 
transpiled to C, and built for Linux:

https://github.com/sal55/langs/blob/master/qc.c

Compile using for example:

   gcc qc.c -oqc -fno-builtin -lm -ldl
   tcc qc.c -oqc -fdollars-in-identifiers -lm -ldl

The input there can be (fib.q):

   func fib(n)=
       if n<3 then
           1
       else
           fib(n-1)+fib(n-2)
       fi
   end

   for i to 36 do
       println i,fib(i)
========== REMAINDER OF ARTICLE TRUNCATED ==========