Article <v2vugh$3gso8$1@dont-email.me>

Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: C23 thoughts and opinions
Date: Sun, 26 May 2024 19:19:59 +0100
Organization: A noiseless patient Spider
Lines: 248
Message-ID: <v2vugh$3gso8$1@dont-email.me>
References: <v2l828$18v7f$1@dont-email.me>
 <00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
 <v2lji1$1bbcp$1@dont-email.me> <87msoh5uh6.fsf@nosuchdomain.example.com>
 <f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com>
 <v2nbp4$1o9h6$1@dont-email.me> <v2ng4n$1p3o2$1@dont-email.me>
 <87y18047jk.fsf@nosuchdomain.example.com>
 <87msoe1xxo.fsf@nosuchdomain.example.com> <v2sh19$2rle2$2@dont-email.me>
 <87ikz11osy.fsf@nosuchdomain.example.com> <v2v59g$3cr0f$1@dont-email.me>
 <v2v7ni$3d70v$1@dont-email.me> <20240526161832.000012a6@yahoo.com>
 <v2vka0$3f4a2$1@dont-email.me> <20240526193549.000031a8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 26 May 2024 20:20:01 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="f1f7027a6e2fd039a913684bf6925e86";
	logging-data="3699464"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/a6yEaZ6UhTIHkGlSjIEHuXB8CPWDvwl0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Owh/F2ayO3Be3FBFVQpIyAPtOqE=
Content-Language: en-GB
In-Reply-To: <20240526193549.000031a8@yahoo.com>
Bytes: 9796

On 26/05/2024 17:35, Michael S wrote:
> On Sun, 26 May 2024 16:25:51 +0100
> bart <bc@freeuk.com> wrote:
> 
>> On 26/05/2024 14:18, Michael S wrote:
>>> On Sun, 26 May 2024 12:51:12 +0100
>>> bart <bc@freeuk.com> wrote:
>>>    
>>>> On 26/05/2024 12:09, David Brown wrote:
>>>>> On 26/05/2024 00:58, Keith Thompson wrote:
>>>>   
>>>>>> For a very large file, that could be a significant burden.  (I
>>>>>> don't have any numbers on that.)
>>>>>
>>>>> I do :
>>>>>
>>>>> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm#design-efficiency-metrics>
>>>>>
>>>>> (That's from a proposal for #embed for C and C++.  Generating the
>>>>> numbers and parsing them is akin to using xxd.)
>>>>>
>>>>> More useful links:
>>>>>
>>>>> <https://thephd.dev/embed-the-details#results>
>>>>> <https://thephd.dev/implementing-embed-c-and-c++>
>>>>>
>>>>> (These are from someone who did a lot of the work for the
>>>>> proposals, and prototype implementations, as far as I understand
>>>>> it.)
>>>>>
>>>>>
>>>>>
>>>>> Note that I can't say how much of a difference this will make in
>>>>> real life.  I don't know how often people need to include
>>>>> multi-megabyte files in their code.  It certainly is not at a
>>>>> level where I would change any of my existing projects from
>>>>> external generator scripts to using #embed, but I might use it in
>>>>> future projects.
>>>>
>>>> I've just done my own quick test (not in C, using embed in my
>>>> language):
>>>>
>>>>        []byte clangexe = binclude("f:/llvm/bin/clang.exe")
>>>>
>>>>        proc main=
>>>>            fprintln "clang.exe is # bytes", clangexe.len
>>>>        end
>>>>
>>>>
>>>> This embeds the Clang C compiler, which is 119MB. It took 1.3
>>>> seconds to compile (note my compiler is not optimised).
>>>>
>>>> When I tried it using text (a 121M-line include file, with one
>>>> number per line), it took 144 seconds. I believe it used more RAM
>>>> than was available: each line will have occupied a 64-byte AST
>>>> node, so nearly 8GB, on a machine with only 6GB of RAM, much of
>>>> which was already occupied.
>>>
>>> On my old PC, which was not the cheapest box in the shop but is
>>> more than 10 years old, compilation speed for similarly organized
>>> (but much smaller) text files is as follows:
>>> MSVC 18.00.31101 (VS 2013) - 1950 KB/sec
>>> MSVC 19.16.27032 (VS 2017) - 1180 KB/sec
>>> MSVC 19.20.27500 (VS 2019) - 1180 KB/sec
>>> clang 17.0.6 - 547 KB/sec (somewhat better with hex text)
>>> gcc 13.2.0 - 580 KB/sec
>>>
>>> So, MSVC compilers, esp. an old one, are somewhat faster than yours.
>>> But if there was swapping involved it's not comparable. How much
>>> time does it take for your compiler to produce a 5MB byte array
>>> from text?
>>
>> Are you talking about a 5MB array initialised like this:
>>
>> unsigned char data[] = {
>>      45,
>>      67,
>>      17,
>>      ...            // 5M-3 more rows
>> };
>>
> 
> Yes.
> 
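For anyone who wants to reproduce the text-based test, here is a minimal
sketch of a generator for that one-number-per-line format. The function
name write_byte_array is hypothetical, and the array name data is simply
taken from the quoted example. It is not the tool anyone in the thread
actually used.

```c
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes from `buf` as a C array initialiser, one decimal
   value per line, in the layout quoted above.
   Returns 0 on success, -1 on a write error.
   Note: len == 0 produces an empty initialiser, which is only valid
   from C23 onwards. */
int write_byte_array(FILE *out, const unsigned char *buf, size_t len)
{
    size_t i;

    if (fprintf(out, "unsigned char data[] = {\n") < 0)
        return -1;
    for (i = 0; i < len; i++) {
        if (fprintf(out, "    %u,\n", (unsigned)buf[i]) < 0)
            return -1;
    }
    if (fprintf(out, "};\n") < 0)
        return -1;
    return 0;
}
```

Reading a 5MB file into a buffer and passing it to this function should
give roughly the 5M-line include file being timed here.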
>> The timing for 120M entries was challenging as it exceeded physical
>> memory. However that test I can also do with C compilers. Results for
>> 120 million lines of data are:
>>
>>     DMC          -    Out-of-memory
>>
>>     Tiny C       -    Silently stopped after 13 seconds (I thought it had finished, but no)
>>
>>     lccwin32     -    Insufficient memory
>>
>>     gcc 10.x.x   -    Out of memory after 80 seconds
>>
>>     mcc          -    (My product) Memory failure after 27 seconds
>>
>>     Clang        -    (Crashed after 5 minutes)
>>
>>     MM           -    144s (compiler for my language)
>>
>> So the compiler for my language did quite well, considering!
>>
> 
> That's an interesting test as well, but I don't want to run it on my HW
> right now. Maybe at night.
> 
>>
>> Back to the 5MB test:
>>
>>     Tiny C     1.7s    2.9MB/sec (Tcc doesn't use any IR)
>>
>>     mcc        3.7s    1.3MB/sec (my product; uses intermediate ASM)
> 
> Faster than new MSVC, but slower than old MSVC.
> 
>>
>>     DMC        --      --        (Out of memory; 32-bit compiler)
>>
>>     lccwin32   3.9s    1.3MB/sec
>>
>>     gcc 10.x  10.6s    0.5MB/sec
>>
>>     clang      7.4s    0.7MB/sec (to object file only)
>>
>>     MM         1.4s    3.6MB/sec (compiler for my language)
>>
>>     MM         0.7s    7.1MB/sec (MM optimised via C and gcc-O3)
>>
> 
> That's quite impressive.
> Does it generate object files or go directly to exe?
> Even if the latter, it's still impressive.
>   
>> As a reminder, when using my version of 'embed' in my language,
>> embedding a 120MB binary file took 1.3 seconds, about 90MB/second.
>>
>>
>>> But both are much faster than compiling through text. Even "slow"
>>> 40MB/3 is 6-7 times faster than the fastest of compilers in my
>>> tests.
>>
>> Do you have a C compiler that supports #embed?
>>
> 
> No, I just blindly believe the paper.
> But it probably would be available in clang this year and in gcc around
> start of the next year. At least I hope so.
> 
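For reference, this is roughly what the C23 form looks like. It is a
minimal sketch and not taken from any post in the thread: the names
blob, blob_size and blob_data are hypothetical, and the source file
embeds itself only so that the example needs no second file. The
fallback branch is an assumption added so that the code still compiles
where #embed is not available.

```c
#include <stddef.h>

/* C23 says #ifdef treats __has_embed as defined when it is supported. */
#ifdef __has_embed
#  if __has_embed(__FILE__)
#    define HAVE_EMBED 1
#  endif
#endif

#ifdef HAVE_EMBED
/* The preprocessor expands #embed to a comma-separated list of the
   file's bytes, so no textual array has to be generated. */
static const unsigned char blob[] = {
#embed __FILE__
};
#else
/* Fallback for compilers without #embed: a tiny hand-written array. */
static const unsigned char blob[] = { 45, 67, 17 };
#endif

size_t blob_size(void) { return sizeof blob; }
const unsigned char *blob_data(void) { return blob; }
```

In real use the operand would be the path of the binary file, in place
of __FILE__.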
>> It's generally understood that processing text is slow, if
>> representing byte-at-a-time data. If byte arrays could be represented
>> as sequences of i64 constants, it would improve matters. That could
>> be done in C, but awkwardly, by aliasing a byte-array with an
>> i64-array.
>>
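A minimal sketch of that aliasing idea, assuming a little-endian target
(on a big-endian machine the bytes within each word come out reversed).
All names here are hypothetical. Reading the words through an unsigned
char pointer is allowed by the language, since any object may be
accessed as bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* The 16 bytes 0x01 .. 0x10 written as two 64-bit constants, so the
   compiler parses 2 tokens of data instead of 16.
   ASSUMPTION: little-endian byte order, lowest byte first in memory. */
static const uint64_t packed_words[] = {
    UINT64_C(0x0807060504030201),
    UINT64_C(0x100F0E0D0C0B0A09),
};

/* View the same storage as a byte array. */
const unsigned char *packed_bytes(void)
{
    return (const unsigned char *)packed_words;
}

size_t packed_size(void)
{
    return sizeof packed_words;
}

/* Returns 1 when the lowest-order byte of an integer is stored first. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

The awkward parts are visible even at this size: the data length has to
be padded to a multiple of 8 with the true length kept separately, and
the generator has to know the byte order of the target.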
> 
> I don't think that conversion from text to binary is a significant
> bottleneck here. To get a feel for things, I wrote a tiny program
> that converts a comma-separated list of integers to a binary file.
> It is quite similar to 'xxd -r', but with an input format better
> suited to our requirements. It does not cover the full requirements,
> of course: my utility can't handle comments and probably a few other
> things that are allowed in C sources, but the conversion part is
> pretty much the same.
> It runs at 6.7 MB/s with decimal input and at 9.1 MB/s with hex input.
> That is with a SATA SSD of a sort that went out of fashion before 2020.
> 
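A converter of that kind can be sketched in a few lines. This is not
Michael's utility, whose source is not shown in the thread. The name
convert_list and the exact input rules are assumptions: decimal or
0x-prefixed hex values, separated by commas and/or whitespace, with no
support for comments, octal or character constants.

```c
#include <ctype.h>
#include <stdio.h>

/* Read integers from `in` and write each one as a single byte to `out`.
   Returns the number of bytes written, or -1 on malformed input, a
   value above 255, or a write error. */
long convert_list(FILE *in, FILE *out)
{
    long count = 0;
    int c = getc(in);

    for (;;) {
        unsigned value = 0;
        int base = 10, digits = 0;

        /* Skip separators between values. */
        while (c != EOF && (isspace(c) || c == ','))
            c = getc(in);
        if (c == EOF)
            return count;

        /* A leading "0x" selects hex; a lone leading 0 is just a digit. */
        if (c == '0') {
            digits = 1;
            c = getc(in);
            if (c == 'x' || c == 'X') {
                base = 16;
                digits = 0;
                c = getc(in);
            }
        }
        while (c != EOF && (base == 16 ? isxdigit(c) : isdigit(c))) {
            int d = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
            value = value * (unsigned)base + (unsigned)d;
            if (value > 255)
                return -1;          /* does not fit in a byte */
            digits++;
            c = getc(in);
        }
        if (digits == 0 || putc((int)value, out) == EOF)
            return -1;
        count++;
    }
}
```

Timing a loop like this over a 5MB input gives a rough lower bound on
what the text-to-binary step alone costs, separate from everything else
a compiler does with each element.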
> So, it seems that at least in the case of gcc, the conversion part constitutes
========== REMAINDER OF ARTICLE TRUNCATED ==========