Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: C23 thoughts and opinions
Date: Sun, 26 May 2024 19:19:59 +0100
Organization: A noiseless patient Spider
Lines: 248
Message-ID: <v2vugh$3gso8$1@dont-email.me>
References: <v2l828$18v7f$1@dont-email.me> <00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
 <v2lji1$1bbcp$1@dont-email.me> <87msoh5uh6.fsf@nosuchdomain.example.com>
 <f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com> <v2nbp4$1o9h6$1@dont-email.me>
 <v2ng4n$1p3o2$1@dont-email.me> <87y18047jk.fsf@nosuchdomain.example.com>
 <87msoe1xxo.fsf@nosuchdomain.example.com> <v2sh19$2rle2$2@dont-email.me>
 <87ikz11osy.fsf@nosuchdomain.example.com> <v2v59g$3cr0f$1@dont-email.me>
 <v2v7ni$3d70v$1@dont-email.me> <20240526161832.000012a6@yahoo.com>
 <v2vka0$3f4a2$1@dont-email.me> <20240526193549.000031a8@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 26 May 2024 20:20:01 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="f1f7027a6e2fd039a913684bf6925e86"; logging-data="3699464"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/a6yEaZ6UhTIHkGlSjIEHuXB8CPWDvwl0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Owh/F2ayO3Be3FBFVQpIyAPtOqE=
Content-Language: en-GB
In-Reply-To: <20240526193549.000031a8@yahoo.com>
Bytes: 9796

On 26/05/2024 17:35, Michael S wrote:
> On Sun, 26 May 2024 16:25:51 +0100
> bart <bc@freeuk.com> wrote:
>
>> On 26/05/2024 14:18, Michael S wrote:
>>> On Sun, 26 May 2024 12:51:12 +0100
>>> bart <bc@freeuk.com> wrote:
>>>
>>>> On 26/05/2024 12:09, David Brown wrote:
>>>>> On 26/05/2024 00:58, Keith Thompson wrote:
>>>>
>>>>>> For a very large file, that could be a significant burden. (I
>>>>>> don't have any numbers on that.)
>>>>>
>>>>> I do :
>>>>>
>>>>> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm#design-efficiency-metrics>
>>>>>
>>>>> (That's from a proposal for #embed for C and C++. Generating the
>>>>> numbers and parsing them is akin to using xxd.)
>>>>>
>>>>> More useful links:
>>>>>
>>>>> <https://thephd.dev/embed-the-details#results>
>>>>> <https://thephd.dev/implementing-embed-c-and-c++>
>>>>>
>>>>> (These are from someone who did a lot of the work for the
>>>>> proposals, and prototype implementations, as far as I understand
>>>>> it.)
>>>>>
>>>>>
>>>>>
>>>>> Note that I can't say how much of a difference this will make in
>>>>> real life. I don't know how often people need to include
>>>>> multi-megabyte files in their code. It certainly is not at a
>>>>> level where I would change any of my existing projects from
>>>>> external generator scripts to using #embed, but I might use it in
>>>>> future projects.
>>>>
>>>> I've just done my own quick test (not in C, using embed in my
>>>> language):
>>>>
>>>>     []byte clangexe = binclude("f:/llvm/bin/clang.exe")
>>>>
>>>>     proc main=
>>>>         fprintln "clang.exe is # bytes", clangexe.len
>>>>     end
>>>>
>>>>
>>>> This embeds the Clang C compiler which is 119MB. It took 1.3
>>>> seconds to compile (note my compiler is not optimised).
>>>>
>>>> If I tried it using text: a 121M-line include file, with one number
>>>> per line, it took 144 seconds (I believe it used more RAM than was
>>>> available: each line will have occupied a 64-byte AST node, so
>>>> nearly 8GB, on a machine with only 6GB available RAM, much of
>>>> which was occupied).
>>>
>>> On my old PC that was not the cheapest box in the shop, but is more
>>> than 10 y.o. compilation speed for similarly organized (but much
>>> smaller) text files is as following:
>>> MSVC 18.00.31101 (VS 2013) - 1950 KB/sec
>>> MSVC 19.16.27032 (VS 2017) - 1180 KB/sec
>>> MSVC 19.20.27500 (VS 2019) - 1180 KB/sec
>>> clang 17.0.6               -  547 KB/sec (somewhat better with hex text)
>>> gcc 13.2.0                 -  580 KB/sec
>>>
>>> So, MSVC compilers, esp. an old one, are somewhat faster than yours.
>>> But if there was swapping involved it's not comparable. How much
>>> time does it take for your compiler to produce 5MB byte array from
>>> text?
>>
>> Are you talking about a 5MB array initialised like this:
>>
>>     unsigned char data[] = {
>>         45,
>>         67,
>>         17,
>>         ... // 5M-3 more rows
>>     };
>>
>
> Yes.
>
>> The timing for 120M entries was challenging as it exceeded physical
>> memory. However that test I can also do with C compilers. Results for
>> 120 million lines of data are:
>>
>>     DMC        - Out-of-memory
>>
>>     Tiny C     - Silently stopped after 13 second (I thought it
>>                  had finished but no)
>>
>>     lccwin32   - Insufficient memory
>>
>>     gcc 10.x.x - Out of memory after 80 seconds
>>
>>     mcc        - (My product) Memory failure after 27 seconds
>>
>>     Clang      - (Crashed after 5 minutes)
>>
>>     MM           144s (Compiler for my language)
>>
>> So the compiler for my language did quite well, considering!
>>
>
> That's an interesting test as well, but I don't want to run it on my HW
> right now. May be, at night.
>
>>
>> Back to the 5MB test:
>>
>>     Tiny C      1.7s   2.9MB/sec  (Tcc doesn't use any IR)
>>
>>     mcc         3.7s   1.3MB/sec  (my product; uses intermediate ASM)
>
> Faster than new MSVC, but slower than old MSVC.
>
>>
>>     DMC         --     --         (Out of memory; 32-bit compiler)
>>
>>     lccwin32    3.9s   1.3MB/sec
>>
>>     gcc 10.x   10.6s   0.5MB/sec
>>
>>     clang       7.4s   0.7MB/sec  (to object file only)
>>
>>     MM          1.4s   3.6MB/sec  (compiler for my language)
>>
>>     MM          0.7    7.1MB/sec  (MM optimised via C and gcc-O3)
>>
>
> That's quite impressive.
> Does it generate object files or goes directly to exe?
> Even if later, it's still impressive.
>
>> As a reminder, when using my version of 'embed' in my language,
>> embedding a 120MB binary file took 1.3 seconds, about 90MB/second.
>>
>>
>>> But both are much faster than compiling through text. Even "slow"
>>> 40MB/3 is 6-7 times faster than the fastest of compilers in my
>>> tests.
>>
>> Do you have a C compiler that supports #embed?
>>
>
> No, I just blindly believe the paper.
> But it probably would be available in clang this year and in gcc around
> start of the next year. At least I hope so.
>
>> It's generally understood that processing text is slow, if
>> representing byte-at-a-time data. If byte arrays could be represented
>> as sequences of i64 constants, it would improve matters. That could
>> be done in C, but awkwardly, by aliasing a byte-array with an
>> i64-array.
>>
>
> I don't think that conversion from text to binary is a significant
> bottleneck here. In order to get a feeling of the things, I wrote a
> tiny program that converts comma-separated list of integers to a binary
> file. Something quite similar to 'xxd -r' but with input format that
> is more fit to our requirements. Not identical to full requirements, of
> course. My utility can't handle comments and probably few other things
> that are allowed in C sources, but conversion part is pretty much the
> same.
> It runs at 6.700 MB/s with decimal input and at 9.1 MB/s with hex input.
> That with SATA SSD of sort that went out of fashion before 2020.
>
> So, it seems that at least in case gcc a conversion part constitutes

========== REMAINDER OF ARTICLE TRUNCATED ==========
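
For illustration only, here is a minimal sketch of the kind of converter described in the quoted text: it reads a comma-separated list of integers and writes one byte per value. It is not Michael S's utility, whose source does not appear in the thread. The names csv_to_bytes and CSV_ERROR are hypothetical, and the input rules are assumptions: decimal or 0x-prefixed hex only, a leading 0 is treated as decimal rather than octal, and comments are not handled.

```c
#include <stdio.h>

/* Hypothetical error value, not from the thread. */
#define CSV_ERROR ((size_t)-1)

/* Value of one digit character, or -1 if it is not a hex digit. */
static int digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Read comma- and whitespace-separated integers in the range 0..255
   from `in` and write each one to `out` as a single byte.
   Assumed input format: decimal or 0x-prefixed hex, no comments.
   Returns the number of bytes written, or CSV_ERROR on bad input. */
size_t csv_to_bytes(FILE *in, FILE *out)
{
    size_t count = 0;
    int c = getc(in);

    for (;;) {
        /* Skip separators: commas and whitespace. */
        while (c == ',' || c == ' ' || c == '\t' || c == '\n' || c == '\r')
            c = getc(in);
        if (c == EOF)
            return count;

        unsigned base = 10;
        unsigned value = 0;
        int ndigits = 0;
        int d;

        if (c == '0') {
            ndigits = 1;            /* a lone "0" is a complete number */
            c = getc(in);
            if (c == 'x' || c == 'X') {
                base = 16;
                ndigits = 0;        /* "0x" needs at least one hex digit */
                c = getc(in);
            }
        }

        /* Accumulate digits that are valid in the current base. */
        while ((d = digit_value(c)) >= 0 && (unsigned)d < base) {
            value = value * base + (unsigned)d;
            ndigits++;
            if (value > 255)
                return CSV_ERROR;   /* does not fit in one byte */
            c = getc(in);
        }

        if (ndigits == 0)
            return CSV_ERROR;       /* unexpected character */

        putc((int)value, out);
        count++;
    }
}
```

Called as csv_to_bytes(stdin, stdout) from a small main, with the output stream in binary mode on platforms where that matters, this can be timed on a generated list to estimate the cost of the text-to-binary step separately from the rest of compilation.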