From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: "undefined behavior"?
Date: Sat, 22 Jun 2024 09:28:14 -0700
Organization: A noiseless patient Spider
Message-ID: <86cyo9gcwx.fsf@linuxsc.com>
References: <666a095a$0$952$882e4bbb@reader.netnews.com> <877cet7qkl.fsf@nosuchdomain.example.com> <86frt9ixw2.fsf@linuxsc.com> <87y171yd87.fsf@nosuchdomain.example.com>

Keith Thompson writes:

> Tim Rentsch writes:
>
>> Keith Thompson writes:
>>
>>> (I'd like to see a future standard require plain char to be unsigned,
>>> but I don't know how likely that is.)
>>
>> It seems unnecessary given that the upcoming C standard
>> is choosing to mandate two's complement for all signed
>> integer types.
>
> It's less necessary, but I'd still like to see it.
>
> These days, strings very commonly hold UTF-8 data. The fact that bytes
> whose values exceed 127 are negative is conceptually awkward, even if
> everything happens to work. It rarely if ever makes sense to treat a
> character value as negative.

The combination of mandating two's complement and using a compiler
option like -funsigned-char (supported by both gcc and clang)
should be enough to do what you want.

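To make that concrete, here is a minimal sketch of code that behaves the same with or without -funsigned-char. The function names (plain_char_is_signed, count_high_bytes) are made up for illustration; only CHAR_MIN, assert, and the cast through unsigned char are standard C.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns 1 if plain char is signed in this build, 0 if it is
   unsigned (for example when compiled with -funsigned-char). */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Counts the bytes of s that have the high bit set, that is, the
   bytes that are part of a multi-byte sequence in UTF-8 data.
   Converting each byte to unsigned char first gives the same
   result whether plain char is signed or unsigned. */
size_t count_high_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 127)
            n++;
    }
    return n;
}
```

Building the same file once normally and once with -funsigned-char should change only what plain_char_is_signed() returns; count_high_bytes() gives the same count either way.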
> (And of course signed char still exists,
> or int8_t if you prefer 8 bits vs. CHAR_BIT bits.)

It makes me laugh when people use int8_t instead of signed char.
If CHAR_BIT isn't 8 then there won't be any int8_t. And of
course we can always throw in a static assertion if it is felt
necessary to protect against implementations that don't have
8-bit chars. (A static assertion also can verify that two's
complement is being used for signed char.)

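For instance, the two static assertions might look something like the sketch below. It assumes C11 or later (static_assert from <assert.h>), and the range comparison is just one of several ways to test for two's complement, not the only one.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT, SCHAR_MIN, SCHAR_MAX */

/* Refuse to compile on implementations without 8-bit chars. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit chars");

/* Ones' complement and sign-and-magnitude both have
   SCHAR_MIN == -SCHAR_MAX, so the asymmetric range checked here
   can only occur when signed char is two's complement. */
static_assert(SCHAR_MIN == -SCHAR_MAX - 1,
              "this code assumes two's complement signed char");
```

Both checks are evaluated at compile time, so they cost nothing at run time and can sit at file scope in any translation unit that depends on the assumptions.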
> A drawback is that it could break existing (non-portable) code that
> assumes plain char is signed.

Exactly! No reason to break the whole world when you can get
what you want just by using a compiler option.