Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.lang.c
Subject: Re: xxd -i vs DIY Was: C23 thoughts and opinions
Date: Tue, 28 May 2024 17:34:19 +0200
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <v34thr$letg$1@dont-email.me>
References: <v2l828$18v7f$1@dont-email.me> <00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com> <v2lji1$1bbcp$1@dont-email.me> <87msoh5uh6.fsf@nosuchdomain.example.com> <f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com> <v2nbp4$1o9h6$1@dont-email.me> <v2ng4n$1p3o2$1@dont-email.me> <87y18047jk.fsf@nosuchdomain.example.com> <87msoe1xxo.fsf@nosuchdomain.example.com> <v2sh19$2rle2$2@dont-email.me> <87ikz11osy.fsf@nosuchdomain.example.com> <v2v59g$3cr0f$1@dont-email.me> <20240528144118.00002012@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 28 May 2024 17:34:19 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="d21593105ce034ab606a049d42e9f782"; logging-data="703408"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18dQya/UFeXBZZWZ0fnTehzCY2AbPNvhjQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0
Cancel-Lock: sha1:GIAYNLFfuGXXaoiKizgiAZdfBsU=
Content-Language: en-GB
In-Reply-To: <20240528144118.00002012@yahoo.com>
Bytes: 4074

On 28/05/2024 13:41, Michael S wrote:
> On Sun, 26 May 2024 13:09:36 +0200
> David Brown <david.brown@hesbynett.no> wrote:
>
>>
>> No, it does /not/. That's the /whole/ point of #embed, and the main
>> motivation for its existence. People have always managed to embed
>> binary source files into their binary output files - using linker
>> tricks, or using xxd or other tools (common or specialised) to turn
>> binary files into initialisers for constant arrays (or structs).
>> I've done so myself on many projects, all integrated together in
>> makefiles.
>>
>
> Let's start another round of private parts' measurements tournament!
> 'xxd -i' vs DIY
>

I used 100 MB of random data:

    dd if=/dev/urandom bs=1M count=100 of=100MB

I compiled your code with "gcc-11 -O2 -march=native".

I ran everything in a tmpfs filesystem, completely in RAM.

xxd took 5.4 seconds - that's the baseline.

Your simple C code took 4.35 seconds.

Your second program took 0.9 seconds - a big improvement.

One line of Python code took 8 seconds:

    print(", ".join([hex(b) for b in open("100MB", "rb").read()]))

A slightly nicer Python program took 14.3 seconds:

    import sys
    bs = open(sys.argv[1], "rb").read()
    xs = "".join([" 0x%02x," % b for b in bs])
    ln = len(xs)
    print("\n".join([xs[i : i + 72] for i in range(0, ln, 72)]))

Like "xxd -i", that one split the output into lines of 12 bytes. Some
compilers might not like a single 300-600 MB line!

I didn't try compiling a test file from the 100 MB source data, but gcc
took about 16 seconds for an include file generated from 20 MB of random
data. It didn't make a significant difference if the data was in
decimal or hex, one line or multiple lines.

But since compilation took about ten times as long as the single line of
Python code, my conclusion is that the speed of generating the include
file is pretty much irrelevant. Compared to the one-line Python code,
and considering the generation and compilation combined, using xxd saves
5% of the time and your best code saves 9% - out of a possible 10% cost
saving.

Thus if you want to save build time when including large arrays of data
in the generated executable, time spent on beating xxd is wasted -
implementing optimised #embed is the only way to make an impact.

(I have had reason to include a 0.5 MB file in a statically linked
single binary - I'm not sure when you'd need very fast handling of
multi-megabyte embeds.)
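
For anyone who wants to redo the generation-versus-compilation arithmetic, here is a rough sketch. The generation timings are the ones quoted above. The compile time is an assumption: I only measured about 16 seconds for 20 MB, and the sketch simply scales that linearly to 100 MB. The names (GEN_SECONDS, COMPILE_SECONDS, saving_vs_baseline, report) are made up for illustration, and the exact percentages depend on that scaling and on rounding, so treat the output as a ballpark check rather than a measurement.

```python
# Generation times quoted in the post, in seconds, for 100 MB of input.
GEN_SECONDS = {
    "python one-liner": 8.0,
    "xxd -i": 5.4,
    "simple C": 4.35,
    "second C program": 0.9,
}

# ASSUMPTION: gcc needed about 16 s for 20 MB; scaling that linearly to
# 100 MB is a guess, not something that was measured.
COMPILE_SECONDS = 16.0 * (100 / 20)


def saving_vs_baseline(gen_seconds, baseline_seconds, compile_seconds):
    """Fraction of (generation + compilation) time saved versus the baseline."""
    baseline_total = baseline_seconds + compile_seconds
    return (baseline_seconds - gen_seconds) / baseline_total


def report():
    """Map each generator to its saving relative to the Python one-liner."""
    baseline = GEN_SECONDS["python one-liner"]
    rows = {
        name: saving_vs_baseline(seconds, baseline, COMPILE_SECONDS)
        for name, seconds in GEN_SECONDS.items()
    }
    # Upper bound: a hypothetical generator that takes no time at all.
    rows["instant generator (upper bound)"] = saving_vs_baseline(
        0.0, baseline, COMPILE_SECONDS
    )
    return rows


if __name__ == "__main__":
    for name, fraction in report().items():
        print(f"{name:32s} {fraction:6.1%}")
```

Under that scaling assumption, even a generator that costs nothing stays below a tenth of the combined time, which is why the compile step is the part worth optimising.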