Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.lang.c
Subject: Re: C23 thoughts and opinions
Date: Sat, 25 May 2024 13:11:37 +0200
Organization: A noiseless patient Spider
Lines: 82
Message-ID: <v2sh19$2rle2$2@dont-email.me>
References: <v2l828$18v7f$1@dont-email.me>
 <00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
 <v2lji1$1bbcp$1@dont-email.me> <87msoh5uh6.fsf@nosuchdomain.example.com>
 <f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com>
 <v2nbp4$1o9h6$1@dont-email.me> <v2ng4n$1p3o2$1@dont-email.me>
 <87y18047jk.fsf@nosuchdomain.example.com>
 <87msoe1xxo.fsf@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 25 May 2024 13:11:38 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="5acf865375c1c1cdc2a566d884dbbc5b";
	logging-data="3003842"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18wXsxEmTw26VaCFWI5YV1HNfwkpnE9i2Q="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:pvVDTQb9DhjUWP10lVDvThihhnA=
Content-Language: en-GB
In-Reply-To: <87msoe1xxo.fsf@nosuchdomain.example.com>
Bytes: 3998

On 25/05/2024 03:29, Keith Thompson wrote:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> David Brown <david.brown@hesbynett.no> writes:
>>> On 23/05/2024 14:11, bart wrote:
>> [...]
>>>> 'embed' was discussed a few months ago. I disagreed with the poor
>>>> way it was to be implemented: 'embed' notionally generates a list of
>>>> comma-separated numbers as tokens, where you have to take care of
>>>> any trailing zero yourself if needed. It would also be hopelessly
>>>> inefficient if actually implemented like that.
>>>
>>> Fortunately, it is /not/ actually implemented like that - it is only
>>> implemented "as if" it were like that.  Real prototype implementations
>>> (for gcc and clang - I don't know about other tools) are extremely
>>> efficient at handling #embed.  And the comma-separated numbers can be
>>> more flexible in less common use-cases.
>> [...]
>>
>> I'm aware of a proposed implementation for clang:
>>
>> https://github.com/llvm/llvm-project/pull/68620
>> https://github.com/ThePhD/llvm-project
>>
>> I'm currently cloning the git repo, with the aim of building it so I can
>> try it out and test some corner cases.  It will take a while.
>>
>> I'm not aware of any prototype implementation for gcc.  If you are, I'd
>> be very interested in trying it out.
>>
>> (And thanks for starting this thread!)
> 
> I've built this from source, and it mostly works.  I haven't seen it do
> any optimization; the `#embed` directive expands to a sequence of
> comma-separated integer constants.
> 
> Which means that this:
> 
> #include <stdio.h>
> int main(void) {
>      struct foo {
>          unsigned char a;
>          unsigned short b;
>          unsigned int c;
>          double d;
>      };
>      struct foo obj = {
> #embed "foo.dat"
>      };
>      printf("a=%d b=%d c=%d d=%f\n", obj.a, obj.b, obj.c, obj.d);
> }
> 
> given "foo.dat" containing bytes with values 1, 2, 3, and 4, produces
> this output:
> 
> a=1 b=2 c=3 d=4.000000
> 

That is what you would expect from the way #embed is specified.  You
would not expect to see any "optimisation" here, since optimisations
should not change the results (apart from choosing between alternative
valid results).

Where you will see the optimisation difference is between:

	const int xs[] = {
#embed "x.dat"
	};

and

	const int xs[] = {
#include "x.csv"
	};


where "x.dat" is a large binary file, and "x.csv" is the same data as 
comma-separated values.  The #embed version will compile very much 
faster, using far less memory.  /That/ is the optimisation.