<2d2954cbc50f3f59572833b742091e478c2dda49@i2pn2.org>


Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: Richard Damon <richard@damon-family.org>
Newsgroups: comp.lang.c
Subject: Re: relearning C: why does an in-place change to a char* segfault?
Date: Fri, 2 Aug 2024 11:03:13 -0400
Organization: i2pn2 (i2pn.org)
Message-ID: <2d2954cbc50f3f59572833b742091e478c2dda49@i2pn2.org>
References: <IoGcndcJ1Zm83zb7nZ2dnZfqnPWdnZ2d@brightview.co.uk>
 <v8fhhl$232oi$1@dont-email.me> <v8fn2u$243nb$1@dont-email.me>
 <87jzh0gdru.fsf@nosuchdomain.example.com> <v8gte2$2ceis$2@dont-email.me>
 <20240801174256.890@kylheku.com> <v8i9o8$2oof8$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 2 Aug 2024 15:03:13 -0000 (UTC)
Injection-Info: i2pn2.org;
	logging-data="1215790"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="diqKR1lalukngNWEqoq9/uFtbkm5U+w3w6FQ0yesrXg";
User-Agent: Mozilla Thunderbird
In-Reply-To: <v8i9o8$2oof8$1@dont-email.me>
X-Spam-Checker-Version: SpamAssassin 4.0.0
Content-Language: en-US
Bytes: 4804
Lines: 90

On 8/2/24 5:43 AM, Bart wrote:
> On 02/08/2024 02:06, Kaz Kylheku wrote:
>> On 2024-08-01, Bart <bc@freeuk.com> wrote:
>>>>> It segfaults when the string is stored in a read-only part of the 
>>>>> binary.
>>>>
>>>> A string literal creates an array object with static storage duration.
>>>> Any attempt to modify that array object has undefined behavior.
>>>
>>> What's the difference between such an object, and an array like one of
>>> these:
>>
>> Programming languages can have objects that have the same lifetime, 
>> yet some
>> of which are mutable and some of which are immutable.
>>
>> If the compiler believes that the immutable objects are in fact
>> not mutated, it's a bad idea to modify them behind the compiler's
>> back.
>>
>> There doesn't have to be any actual difference in the implementation of
>> these objects, like in what area they are stored, other than the rules
>> regarding their correct use, namely prohibiting modification.
>>
>> The Racket language has both mutable and immutable cons cells.
>> The difference is that the immutable cons cells simply lack the
>> operations needed to mutate them. I'm not an expert on the Racket
>> internals but I don't see a reason why they couldn't be stored in the
>> same heap.
>>
>>>    static char A[100];
>>>    static char B[100]={1};
>>>
>>> Do these not also have static storage duration? Yet presumably these can
>>> be legally modified.
>>
>> That 1 which initializes B[0] cannot be modified.
>>
> 
> Why not? I haven't requested that those are 'const'. Further, gcc has no 
> problem running this program:
> 
>      static char A[100];
>      static char B[100]={1};
> 
>      printf("%d %d %d\n", A[0], B[0], 1);
>      A[0]=55;
>      B[0]=89;
>      printf("%d %d %d\n", A[0], B[0], 1);
> 
> But it does use readonly memory for string literals.
> 
> (The point of A and B was to represent .bss and .data segments 
> respectively. A's data is not part of the EXE image; B's is.
> 
> While the point of 'static' was to avoid having to specify whether A and 
> B were at module scope or within a function.)
> 
>  > That 1 which initializes B[0] cannot be modified.
> 
> Or do you literally mean the value of that '1'? Then it doesn't make 
> sense; here that is a copy of the literal stored in one cell of 'B'. The 
> value of the cell can change, then that particular copy of '1' is lost.
> 
> Here:
> 
>      static char B[100] = {1, 1, 1, 1, 1, 1};
> 
> changing B[0] will not affect the 1s in B[1..5], and in my example 
> above, that standalone '1' is not affected.
> 
> 

The key point is that the {1} isn't the value located in B[0], but the 
source of that value when B was initialized. If B is in the .data 
segment, that source is the image used to initialize the .data segment, 
which might not exist anywhere in the machine's RAM, but only in the 
file that was loaded.
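
To make that concrete, here is a minimal sketch. The helper names 
set_first and get_at are made up for illustration, and where B 
actually ends up (.data or elsewhere) is up to the implementation. B 
itself is an ordinary writable object, so storing to it is well 
defined; the {1} only supplies its starting contents.

```c
#include <assert.h>
#include <stddef.h>

/* B has static storage duration and is not const, so it may be modified.
   The initializer {1} is only the source of its starting contents. */
static char B[100] = {1};

/* Hypothetical helper: overwrite B[0] and return what is now stored there. */
char set_first(char value)
{
    B[0] = value;
    return B[0];
}

/* Hypothetical helper: read B[index]; elements after the first start as 0. */
char get_at(size_t index)
{
    assert(index < sizeof B);
    return B[index];
}
```

Calling set_first(89) changes only the live copy in B[0]; whatever 
image the loader used to fill B in the first place is not something 
the program can reach through B.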

When accessing the value of a string literal, the compiler needs to do 
something so that the value is accessible, perhaps by creating a const 
object, like any other const object, and exposing that.

The confusing part is that while it creates what is effectively a 
"const char[]" object, the type of that object when referred to in code 
is just "char[]". That difference was imposed to avoid breaking most 
code that used strings when the standard was first coming out.
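
As a rough sketch of the practical consequence: to change a string in 
place, copy the literal into storage you own and modify the copy. The 
function name capitalize_copy is an invention for this example, not 
anything from the thread.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Undefined behavior, even though it compiles, because the literal has
   type char[]:
       char *p = "hello";  p[0] = 'H';
   Well defined, because buf is a separate writable array initialized
   from the literal:
       char buf[] = "hello";  buf[0] = 'H';                           */

/* Hypothetical helper: copy src into the caller's writable buffer and
   upper-case the first character of the copy. */
char *capitalize_copy(char *dst, size_t dst_size, const char *src)
{
    assert(strlen(src) < dst_size);   /* room for the terminating '\0' */
    strcpy(dst, src);
    dst[0] = (char)toupper((unsigned char)dst[0]);
    return dst;
}
```

With a caller-side buffer such as char buf[16], 
capitalize_copy(buf, sizeof buf, "hello") leaves the literal untouched 
and changes only buf.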

Most implementations have an option to at least give a warning if a 
string literal is used in a way that loses the const, and most programs 
today should be compiled using that option.
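
One example of such an option is -Wwrite-strings in GCC and Clang, 
which in C treats string literals as "const char[]" for the purpose of 
diagnostics. The sketch below is only an illustration; the names 
greeting and first_char are assumptions of mine.

```c
#include <assert.h>
#include <stddef.h>

/* Compile with, for example:  cc -Wwrite-strings -c example.c */

/* A const-qualified pointer keeps the const, so there is no warning, and
   the compiler rejects any attempt to write through it. */
static const char *greeting = "hello";

/* With -Wwrite-strings the following line draws a warning, because the
   initialization discards the const qualifier:
       char *bad = "hello";                                            */

/* Hypothetical helper: reading through a const pointer is always fine. */
char first_char(const char *s)
{
    assert(s != NULL);
    return s[0];
}

/* Hypothetical accessor for the string above. */
const char *get_greeting(void)
{
    return greeting;
}
```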