Path: ...!weretis.net!feeder8.news.weretis.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: Rainer Weikusat <rweikusat@talktalk.net>
Newsgroups: comp.unix.programmer
Subject: Re: Command Languages Versus Programming Languages
Date: Thu, 21 Nov 2024 17:31:16 +0000
Lines: 53
Message-ID: <87jzcwqysb.fsf@doppelsaurus.mobileactivedefense.com>
References: <uu54la$3su5b$6@dont-email.me> <uusur7$2hm6p$1@dont-email.me>
	<vdf096$2c9hb$8@dont-email.me>
	<87a5fdj7f2.fsf@doppelsaurus.mobileactivedefense.com>
	<ve83q2$33dfe$1@dont-email.me> <vgsbrv$sko5$1@dont-email.me>
	<vgtslt$16754$1@dont-email.me> <86frnmmxp7.fsf@red.stonehenge.com>
	<vhk65t$o5i$1@dont-email.me> <vhkev7$29sc$1@dont-email.me>
	<vhkh94$2oi3$1@dont-email.me> <vhkvpi$5h8v$1@dont-email.me>
	<875xohbxre.fsf@doppelsaurus.mobileactivedefense.com>
	<slrnvjulf2.rfrc.mas@a4.home> <QVI%O.179460$Oi5e.162642@fx15.iad>
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: individual.net Qam9KR1PyM9GFTzwL3XQQQhkBPKzjv0vP6Bb1uLEVZ2BixNMs=
Cancel-Lock: sha1:lie6gqVGxjz+dNK/Ni9uKHZSS9Y= sha1:c1Z4AyIO4khKwvGMc3ZKLws5d9E= sha256:TXrLSJW2Y3pLIbGO5ej00NUwfMoIVgOE887/0xPf7sI=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Bytes: 3003

scott@slp53.sl.home (Scott Lurndal) writes:
> mas@a4.home writes:
>>On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:
>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>
>>> Assuming that p is a pointer to the current position in a string, e is a
>>> pointer to the end of it (ie, point just past the last byte) and -
>>> that's important - both are pointers to unsigned quantities, the 'bulky'
>>> C equivalent of [0-9]+ is
>>>
>>> while (p < e && *p - '0' < 10) ++p;
>>>
>>> That's not too bad. And it's really a hell lot faster than a
>>> general-purpose automaton programmed to recognize the same pattern
>>> (which might not matter most of the time, but sometimes, it does).
>>
>>int
>>main(int argc, char **argv) {
>>unsigned char *p, *e;
>>unsigned char mystr[] = "12#45XY ";
>>
>>    p = mystr;
>>    e = mystr + sizeof(mystr);
>>    while (p < e && *p - '0' < 10) ++p;
>>
>>    size_t xlen = p-mystr;
>>    printf("digits: '%.*s'\n", (int) xlen, mystr);
>>    printf("mystr %p p %p e %p\n", mystr, p, e);
>>    printf("xlen %zd\n", xlen);
>>
>>    return 0;
>>}
>>
>> ./a.out
>>digits: '12#45'
>>mystr 0x7ffc92ac55ff p 0x7ffc92ac5604 e 0x7ffc92ac5608
>>xlen 5
>>
>
> Indeed.    Rainer's while loop should be using isdigit(*p).

I'm even using

    c &= ~0x20;
    if (c - 'A' < 6) return c - 'A' + 10;

for detecting hex digits (the type of c is unsigned, so c - 'A' wraps
around to a large value for anything below 'A'). JSON demands UTF-8,
which implies ASCII for these characters (which means that the code
point of a lowercase letter is that of the corresponding uppercase
letter but with the sixth bit, 0x20, additionally set).
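
Spelled out as a complete function, that could look roughly like the
following. This is only a sketch: the name hex_digit_value, the -1
return for non-digits and the decimal test in front are assumptions
made for illustration, not part of the two lines above.

```c
#include <stdio.h>

/* Hypothetical helper: value of an ASCII hex digit, or -1 if c is not one.
   c is unsigned, so every subtraction below is done in unsigned arithmetic
   and a character below the range start wraps to a very large value. */
static int hex_digit_value(unsigned c)
{
    if (c - '0' < 10) return c - '0';      /* '0'..'9' */

    c &= ~0x20u;                           /* clear 0x20: 'a'..'f' become 'A'..'F' */
    if (c - 'A' < 6) return c - 'A' + 10;  /* 'A'..'F' */

    return -1;                             /* anything else */
}
```

With this, hex_digit_value('c') and hex_digit_value('C') both yield 12,
while 'g', '@' and '`' are rejected.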

Constructs like this are a great insurance policy against people
inventing spurious character encodings.
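
As for the quoted test program: with unsigned char pointers, *p is
promoted to int before the subtraction, so '#' - '0' is -13 and passes
the < 10 test. That is why the output includes "12#45". A sketch of the
loop with the comparison forced into unsigned arithmetic follows; the
function name count_digits is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: number of leading ASCII digits in [p, e). */
static size_t count_digits(const unsigned char *p, const unsigned char *e)
{
    const unsigned char *start = p;

    /* Convert *p to unsigned first; otherwise the subtraction is done
       in int and characters below '0' give a negative result. */
    while (p < e && (unsigned)*p - '0' < 10) ++p;

    return (size_t)(p - start);
}
```

Called on "12#45XY " this stops at the '#' and returns 2.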