<20240526193549.000031a8@yahoo.com>

Path: ...!feeds.phibee-telecom.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.lang.c
Subject: Re: C23 thoughts and opinions
Date: Sun, 26 May 2024 19:35:49 +0300
Organization: A noiseless patient Spider
Lines: 242
Message-ID: <20240526193549.000031a8@yahoo.com>
References: <v2l828$18v7f$1@dont-email.me>
	<00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
	<v2lji1$1bbcp$1@dont-email.me>
	<87msoh5uh6.fsf@nosuchdomain.example.com>
	<f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com>
	<v2nbp4$1o9h6$1@dont-email.me>
	<v2ng4n$1p3o2$1@dont-email.me>
	<87y18047jk.fsf@nosuchdomain.example.com>
	<87msoe1xxo.fsf@nosuchdomain.example.com>
	<v2sh19$2rle2$2@dont-email.me>
	<87ikz11osy.fsf@nosuchdomain.example.com>
	<v2v59g$3cr0f$1@dont-email.me>
	<v2v7ni$3d70v$1@dont-email.me>
	<20240526161832.000012a6@yahoo.com>
	<v2vka0$3f4a2$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Date: Sun, 26 May 2024 18:35:40 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="748cc89cfd4b455fba2588d67f703cb1";
	logging-data="3503865"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19iInhn9p0bpJQWwNV0xo5chefK8ECGH1U="
Cancel-Lock: sha1:FrVPQ+y7JSP6MTPfDxVCJxJQXFM=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
Bytes: 9104

On Sun, 26 May 2024 16:25:51 +0100
bart <bc@freeuk.com> wrote:

> On 26/05/2024 14:18, Michael S wrote:
> > On Sun, 26 May 2024 12:51:12 +0100
> > bart <bc@freeuk.com> wrote:
> >
> >> On 26/05/2024 12:09, David Brown wrote:
> >>> On 26/05/2024 00:58, Keith Thompson wrote:
> >>
> >>>> For a very large file, that could be a significant burden.  (I
> >>>> don't have any numbers on that.)
> >>>
> >>> I do :
> >>>
> >>> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm#design-efficiency-metrics>
> >>>
> >>> (That's from a proposal for #embed for C and C++.  Generating the
> >>> numbers and parsing them is akin to using xxd.)
> >>>
> >>> More useful links:
> >>>
> >>> <https://thephd.dev/embed-the-details#results>
> >>> <https://thephd.dev/implementing-embed-c-and-c++>
> >>>
> >>> (These are from someone who did a lot of the work for the
> >>> proposals, and prototype implementations, as far as I understand
> >>> it.)
> >>>
> >>>
> >>>
> >>> Note that I can't say how much of a difference this will make in
> >>> real life.  I don't know how often people need to include
> >>> multi-megabyte files in their code.  It certainly is not at a
> >>> level where I would change any of my existing projects from
> >>> external generator scripts to using #embed, but I might use it in
> >>> future projects.
> >>
> >> I've just done my own quick test (not in C, using embed in my
> >> language):
> >>
> >>       []byte clangexe = binclude("f:/llvm/bin/clang.exe")
> >>
> >>       proc main=
> >>           fprintln "clang.exe is # bytes", clangexe.len
> >>       end
> >>
> >>
> >> This embeds the Clang C compiler which is 119MB. It took 1.3
> >> seconds to compile (note my compiler is not optimised).
> >>
> >> If I tried it using text: a 121M-line include file, with one number
> >> per line, it took 144 seconds (I believe it used more RAM than was
> >> available: each line will have occupied a 64-byte AST node, so
> >> nearly 8GB, on a machine with only 6GB available RAM, much of
> >> which was occupied).
> >
> > On my old PC, which was not the cheapest box in the shop but is more
> > than 10 years old, compilation speed for similarly organized (but
> > much smaller) text files is as follows:
> > MSVC 18.00.31101 (VS 2013) - 1950 KB/sec
> > MSVC 19.16.27032 (VS 2017) - 1180 KB/sec
> > MSVC 19.20.27500 (VS 2019) - 1180 KB/sec
> > clang 17.0.6 - 547 KB/sec (somewhat better with hex text)
> > gcc 13.2.0 - 580 KB/sec
> >
> > So, MSVC compilers, esp. an old one, are somewhat faster than yours.
> > But if there was swapping involved it's not comparable. How much
> > time does it take for your compiler to produce a 5MB byte array
> > from text?
>
> Are you talking about a 5MB array initialised like this:
>
> unsigned char data[] = {
>     45,
>     67,
>     17,
>     ...            // 5M-3 more rows
> };
>

Yes.
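
For anyone who wants to reproduce this kind of measurement, a test
file of the shape quoted above could be produced by something like the
sketch below. The function name write_byte_array and its parameters
are made up for illustration; it is not the generator that anybody in
this thread actually used.

```c
#include <stdio.h>
#include <stddef.h>

/* Write `n` bytes as a C array definition, one decimal value per line,
   in the same layout as the quoted example.
   `name` is a caller-chosen identifier (hypothetical, for illustration).
   Returns 0 on success, -1 on a write error.
   Note: n == 0 produces an empty initialiser, which is only valid
   from C23 onwards. */
int write_byte_array(FILE *out, const char *name,
                     const unsigned char *bytes, size_t n)
{
    if (fprintf(out, "unsigned char %s[] = {\n", name) < 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        /* A trailing comma after the last element is legal C. */
        if (fprintf(out, "    %u,\n", (unsigned)bytes[i]) < 0)
            return -1;
    }
    if (fprintf(out, "};\n") < 0)
        return -1;
    return 0;
}
```

Calling it with a 5MB buffer and an output file opened for writing
gives a 5M-line include file; timing the compiler on that file and
dividing the binary size by the elapsed time gives figures comparable
to the MB/sec numbers in this thread.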

> The timing for 120M entries was challenging as it exceeded physical
> memory. However that test I can also do with C compilers. Results for
> 120 million lines of data are:
>
>    DMC          -    Out-of-memory
>
>    Tiny C       -    Silently stopped after 13 seconds (I thought it
> had finished but no)
>
>    lccwin32     -    Insufficient memory
>
>    gcc 10.x.x   -    Out of memory after 80 seconds
>
>    mcc          -    (My product) Memory failure after 27 seconds
>
>    Clang        -    (Crashed after 5 minutes)
>
>    MM         144s   (Compiler for my language)
>
> So the compiler for my language did quite well, considering!
>

That's an interesting test as well, but I don't want to run it on my HW
right now. Maybe at night.

>
> Back to the 5MB test:
>
>    Tiny C     1.7s    2.9MB/sec (Tcc doesn't use any IR)
>
>    mcc        3.7s    1.3MB/sec (my product; uses intermediate ASM)

Faster than new MSVC, but slower than old MSVC.

>
>    DMC        --      --        (Out of memory; 32-bit compiler)
>
>    lccwin32   3.9s    1.3MB/sec
>
>    gcc 10.x  10.6s    0.5MB/sec
>
>    clang      7.4s    0.7MB/sec (to object file only)
>
>    MM         1.4s    3.6MB/sec (compiler for my language)
>
>    MM         0.7s    7.1MB/sec (MM optimised via C and gcc-O3)
>

That's quite impressive.
Does it generate object files or go directly to exe?
Even if the latter, it's still impressive.

> As a reminder, when using my version of 'embed' in my language,
> embedding a 120MB binary file took 1.3 seconds, about 90MB/second.
>
>
> > But both are much faster than compiling through text. Even "slow"
> > 40MB/s is 6-7 times faster than the fastest of compilers in my
> > tests.
>
> Do you have a C compiler that supports #embed?
>

No, I just blindly believe the paper.
But it will probably be available in clang this year and in gcc around
the start of next year. At least I hope so.
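
For reference, this is roughly what the C23 directive looks like in
source form. It is a minimal sketch, assuming a compiler that
implements #embed and __has_embed. The file name "blob.bin", the macro
HAVE_BLOB and the helper data_size are made up for illustration. When
the feature or the file is missing, the sketch falls back to the three
bytes from the quoted example so that it still compiles.

```c
#include <stddef.h>

/* Probe for #embed support and for the (hypothetical) resource file.
   The nested #if keeps compilers without __has_embed from ever
   parsing the __has_embed(...) expression. */
#if defined(__has_embed) && defined(__STDC_EMBED_FOUND__)
#  if __has_embed("blob.bin") == __STDC_EMBED_FOUND__
#    define HAVE_BLOB 1
#  endif
#endif

const unsigned char data[] = {
#ifdef HAVE_BLOB
#  embed "blob.bin"      /* expands to a comma-separated list of byte values */
#else
    45, 67, 17           /* fallback: no #embed support, or no blob.bin */
#endif
};

/* Number of bytes that ended up in the array. */
size_t data_size(void)
{
    return sizeof data;
}
```

The point of the directive is that the compiler is free to copy the
file contents straight into the object without tokenising millions of
integer literals, which is where the text route spends its time.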

> It's generally understood that processing text is slow, if
> representing byte-at-a-time data. If byte arrays could be represented
> as sequences of i64 constants, it would improve matters. That could
> be done in C, but awkwardly, by aliasing a byte-array with an
> i64-array.
>
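
The aliasing idea in the quoted paragraph could look something like
the sketch below. The union type packed_blob and the object blob are
made-up names, and the constant assumes a little-endian target; this
illustrates the general technique and is not code from anyone in the
thread.

```c
#include <stdint.h>

/* One object viewed two ways: as 64-bit words for the initialiser and
   as bytes for the consumer. Reading a union member other than the one
   last written is permitted in C (it is not in C++). */
union packed_blob {
    uint64_t      words[1];
    unsigned char bytes[sizeof(uint64_t)];
};

/* Bytes 45, 67, 17 (0x2D, 0x43, 0x11) followed by five zero bytes,
   packed for a little-endian target. A big-endian target would need
   the byte order reversed within each word. */
const union packed_blob blob = {
    .words = { UINT64_C(0x000000000011432D) }
};
```

This cuts the number of literals the compiler has to parse by a factor
of eight, at the price of a generator that has to know the target's
byte order and has to pad the last word.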

I don't think that conversion from text to binary is a significant
bottleneck here. To get a feel for things, I wrote a tiny program that
converts a comma-separated list of integers to a binary file. It is
quite similar to 'xxd -r', but with an input format that better fits
our requirements. It does not cover the full requirements, of course:
my utility can't handle comments and probably a few other things that
are allowed in C sources, but the conversion part is pretty much the
same.
It runs at 6.700 MB/s with decimal input and at 9.1 MB/s with hex input.
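
To make the kind of converter described above concrete, here is a
minimal sketch of one. It is an illustration written under
assumptions, not the utility that was measured: the function name
csv_to_binary is made up, values are limited to 0..255, and the
separator handling is deliberately lenient.

```c
#include <stdio.h>
#include <ctype.h>

/* Read comma-separated integers (decimal, or hex with a 0x prefix)
   from `in` and write each one as a single byte to `out`.
   Returns the number of bytes written, or -1 on malformed input, a
   value above 255, or an I/O error.
   Simplifications: no comments, no suffixes, any run of commas and
   whitespace counts as a separator, and a leading 0 is treated as
   decimal (in real C source it would mean octal). */
long csv_to_binary(FILE *in, FILE *out)
{
    long count = 0;
    int c = getc(in);

    for (;;) {
        while (c != EOF && (isspace(c) || c == ','))
            c = getc(in);                    /* skip separators */
        if (c == EOF)
            break;
        if (!isdigit(c))
            return -1;                       /* unexpected character */

        unsigned value = 0;
        unsigned base = 10;
        if (c == '0') {
            c = getc(in);
            if (c == 'x' || c == 'X') {
                base = 16;
                c = getc(in);
                if (!isxdigit(c))
                    return -1;               /* "0x" with no digits */
            }
        }
        while (c != EOF && (base == 16 ? isxdigit(c) : isdigit(c))) {
            unsigned digit = isdigit(c) ? (unsigned)(c - '0')
                                        : (unsigned)(tolower(c) - 'a' + 10);
            value = value * base + digit;
            if (value > 255)
                return -1;                   /* does not fit in a byte */
            c = getc(in);
        }
        if (putc((int)value, out) == EOF)
            return -1;
        count++;
    }
    return ferror(in) ? -1 : count;
}
```

The per-character work is a comparison or two, one multiply and one
add, which is the part a compiler's lexer also has to do for every
literal before any AST or initialiser bookkeeping starts.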
========== REMAINDER OF ARTICLE TRUNCATED ==========