Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Michael S <already5chosen@yahoo.com>
Newsgroups: comp.lang.c
Subject: Re: xxd -i vs DIY Was: C23 thoughts and opinions
Date: Wed, 29 May 2024 09:21:09 +0300
Organization: A noiseless patient Spider
Lines: 56
Message-ID: <20240529092109.0000246b@yahoo.com>
References: <v2l828$18v7f$1@dont-email.me>
	<00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
	<v2lji1$1bbcp$1@dont-email.me>
	<87msoh5uh6.fsf@nosuchdomain.example.com>
	<f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com>
	<v2nbp4$1o9h6$1@dont-email.me>
	<v2ng4n$1p3o2$1@dont-email.me>
	<87y18047jk.fsf@nosuchdomain.example.com>
	<87msoe1xxo.fsf@nosuchdomain.example.com>
	<v2sh19$2rle2$2@dont-email.me>
	<87ikz11osy.fsf@nosuchdomain.example.com>
	<v2v59g$3cr0f$1@dont-email.me>
	<20240528144118.00002012@yahoo.com>
	<v34odg$kh7a$1@dont-email.me>
	<20240528185624.00002494@yahoo.com>
	<v359f1$nknu$1@dont-email.me>
	<20240528232315.00006a58@yahoo.com>
	<20240529004530.00005793@yahoo.com>
	<v35ssd$qq9b$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 29 May 2024 08:21:13 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="5312032778bda400147ee6d9907947d6";
	logging-data="1091258"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19931sfDVbWZfnAmbJCFYYgr2FPZtyv8/I="
Cancel-Lock: sha1:f2+odmNjklZu2rT3LjaVCc9jG7U=
X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)
Bytes: 3605

On Wed, 29 May 2024 01:29:00 +0100
bart <bc@freeuk.com> wrote:

> On 28/05/2024 22:45, Michael S wrote:
> > On Tue, 28 May 2024 23:23:15 +0300
> > Michael S <already5chosen@yahoo.com> wrote:
> >   
> >>
> >> Also, I think that random numbers are close to worst case for
> >> branch predictor / loop length predictor in my inner loop.
> >> Were I thinking about random case upfront, I'd code an inner loop
> >> differently. I'd always copy 4 octets (comma would be stored in the
> >> same table). After that I would update outptr by length taken from
> >> additional table, similarly, but not identically to your method
> >> below. 
> > 
> > That's what I had in mind:
> >   
> 
> >    unsigned char bin2dec[256][MAX_CHAR_PER_NUM+1]; //
> >    bin2dec[MAX_CHAR_PER_NUM] => length for (int i = 0; i < 256;++i)
> > {  
> 
> Is this a comment that has wrapped?
> 
> After fixing a few such line breaks, this runs at 3.6 seconds
> compared with 4.1 seconds for the original.
> 
> Although I don't quite understand the comments about branch
> prediction.
> 
> I think runtime is still primarily spent in I/O.
> 

That's undoubtedly correct.
But a high branch mispredict rate can still add to the total time.
Suppose the branch at the end of the inner loop is mispredicted on 40%
of the input bytes. On the processor that ran my original test (Intel
Haswell at 4 GHz) each mispredict costs ~15 clocks = 3.75 ns.
3.75 ns * 0.4 * 100M bytes = 150 msec
I don't know how much it costs on your hardware since you didn't tell
me what it is.

But I am more intrigued by the slowness on WSL.
Did you compare native vs mounted file systems?

> If I take the 1.9 second version, and remove the fwrite, then it runs
> in 0.8 seconds. 0.7 of that is generating the text (366MB's worth, a
> line at a time).
> 
> In my language that part takes 0.9 seconds, which is a more typical 
> difference due to gcc's superior optimiser.
> 
>