Path: ...!news.mixmin.net!weretis.net!feeder8.news.weretis.net!usenet.goja.nl.eu.org!nntp.terraraq.uk!.POSTED.tunnel.sfere.anjou.terraraq.org.uk!not-for-mail
From: Richard Kettlewell <invalid@invalid.invalid>
Newsgroups: comp.lang.c
Subject: Re: Hex string literals (was Re: C23 thoughts and opinions)
Date: Mon, 17 Jun 2024 11:41:03 +0100
Organization: terraraq NNTP server
Message-ID: <wwva5jj4zsw.fsf@LkoBDZeT.terraraq.uk>
References: <v2l828$18v7f$1@dont-email.me>
 <00297443-2fee-48d4-81a0-9ff6ae6481e4@gmail.com>
 <v2lji1$1bbcp$1@dont-email.me> <87msoh5uh6.fsf@nosuchdomain.example.com>
 <f08d2c9f-5c2e-495d-b0bd-3f71bd301432@gmail.com>
 <v2nbp4$1o9h6$1@dont-email.me> <v2ng4n$1p3o2$1@dont-email.me>
 <87y18047jk.fsf@nosuchdomain.example.com>
 <87msoe1xxo.fsf@nosuchdomain.example.com> <v2sh19$2rle2$2@dont-email.me>
 <87ikz11osy.fsf@nosuchdomain.example.com> <v2v59g$3cr0f$1@dont-email.me>
 <87plt8yxgn.fsf@nosuchdomain.example.com> <v31rj5$o20$1@dont-email.me>
 <87cyp6zsen.fsf@nosuchdomain.example.com> <v34gi3$j385$1@dont-email.me>
 <874jahznzt.fsf@nosuchdomain.example.com> <v36nf9$12bei$1@dont-email.me>
 <87v82b43h6.fsf@nosuchdomain.example.com>
 <87iky830v7.fsf_-_@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: innmantic.terraraq.uk; posting-host="tunnel.sfere.anjou.terraraq.org.uk:172.17.207.6";
 logging-data="10543"; mail-complaints-to="usenet@innmantic.terraraq.uk"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:EMxhFJ1x/tV4EH5e1I4fJqg4Odg=
X-Face: h[Hh-7npe<<b4/eW[]sat,I3O`t8A`(ej.H!F4\8|;ih)`7{@:A~/j1}gTt4e7-n*F?.Rl^
 F<\{jehn7.KrO{!7=:(@J~]<.[{>v9!1<qZY,{EJxg6?Er4Y7Ng2\Ft>Z&W?r\c.!4DXH5PWpga"ha
 +r0NzP?vnz:e/knOY)PI-
X-Boydie: NO
Bytes: 4294
Lines: 49

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> Inspired by the existing syntax for integer and floating-point
> hex constants, I propose using a "0x" prefix.
> 0x"deadbeef" is an
> expression of type `const unsigned char[4]` (assuming CHAR_BIT==8),
> with values 0xde, 0xad, 0xbe, 0xef in that order.  Byte order is
> irrelevant; we're specifying byte values in order, not bytes of
> the representation of some larger type.  memcpy()ing 0x"deadbeef"
> to a uint32 might yield either 0xdeadbeef or 0xefbeadde (or other
> more exotic possibilities).

I like the syntax and I’d find it useful. There’s more to life than byte
arrays, though, so I wonder if there’s more to be said here.

I find myself dealing a lot with large integers, generally represented
as arrays of some unsigned type (commonly uint32_t, but other
possibilities arise too). In C as it stands today this requires a
translation step before constants can be embedded in source code (which
is error-prone if someone attempts to do it manually).

So being able to say ‘0x8732456872648956348596893765836543 as array of
uint64_t, LSW first’ (in some suitably C-like syntax) would be a big
improvement from my perspective, primarily as an accelerator to
development but also as a small improvement in robustness.

> Again, unlike other string literals, there is no implicit terminating
> null byte.  And I suggest making them const, since there's no
> existing code to break.
>
> If CHAR_BIT==8, each byte is represented by two hex digits.  More
> generally, each byte is represented by (CHAR_BIT+3)/4 hex digits in
> the absence of whitespace.  Added whitespace marks the end of a byte,
> 0x"deadbeef" is 1, 2, 3, or 4 bytes if CHAR_BIT is 32, 16, 12, or 8
> respectively, but 0x"de ad be ef" is 4 bytes regardless of CHAR_BIT.
> 0x"" is a syntax error, since C doesn't support zero-length arrays.
> Anything between the quotes other than hex digits and spaces is a
> syntax error.

Would 0x"1 23 45 67" be a syntax error or { 0x1, 0x23, 0x45, 0x67 }?
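
To make the question concrete, here is a minimal sketch of a parser for
the rule as quoted above. It is only an illustration: parse_hex_bytes()
is a hypothetical name, not part of the proposal, and the sketch assumes
the lenient reading in which a short group followed by a space is
accepted as a byte. Under the strict reading the short group would be
rejected instead.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of any proposal: parse the text between
   the quotes of a 0x"..." literal into out[].  Returns the number of
   bytes written, or -1 on a syntax error or if out[] is too small. */
static long parse_hex_bytes(const char *s, unsigned char *out, size_t cap)
{
    const int per_byte = (CHAR_BIT + 3) / 4; /* hex digits in a full byte */
    size_t n = 0;        /* bytes emitted so far */
    unsigned value = 0;  /* byte currently being accumulated */
    int digits = 0;      /* hex digits seen in the current byte */

    for (;; s++) {
        unsigned char c = (unsigned char)*s;
        if (isxdigit(c)) {
            /* Assumes 'a'..'f' are contiguous (true for ASCII, EBCDIC). */
            int d = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
            value = value * 16 + (unsigned)d;
            digits++;
        } else if (c != ' ' && c != '\0') {
            return -1;   /* only hex digits and spaces are allowed */
        }
        /* A byte ends when it is full, or at a space or end of input.
           Assumption: a short group such as "1" is accepted here. */
        if (digits == per_byte || (digits > 0 && !isxdigit(c))) {
            if (n == cap || value > UCHAR_MAX)
                return -1;
            out[n++] = (unsigned char)value;
            value = 0;
            digits = 0;
        }
        if (c == '\0')
            break;
    }
    return n > 0 ? (long)n : -1;  /* an empty literal is an error */
}
```

With CHAR_BIT==8, this accepts "deadbeef" and "de ad be ef" as the same
four bytes, and treats "1 23 45 67" as { 0x01, 0x23, 0x45, 0x67 }.
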

> What I'm trying to design here is a more straightforward way to
> represent raw (unsigned char[]) data in C code, largely but not
> exclusively for use by #embed.

Compilers can already implement #embed however they like; there’s no
need for a standardized way to represent the ‘inside’ of a #embed.

-- 
https://www.greenend.org.uk/rjk/