From: James Kuyper
Newsgroups: comp.lang.c
Subject: Re: relearning C: why does an in-place change to a char* segfault?
Date: Fri, 2 Aug 2024 14:19:49 -0400

On 8/2/24 5:43 AM, Bart wrote:
> On 02/08/2024 02:06, Kaz Kylheku wrote:
>> On 2024-08-01, Bart wrote:
>>>>> It segfaults when the string is stored in a read-only part of the
>>>>> binary.
>>>>
>>>> A string literal creates an array object with static storage duration.
>>>> Any attempt to modify that array object has undefined behavior.
>>>
>>> What's the difference between such an object, and an array like one of
>>> these:
>>>    static char A[100];
>>>    static char B[100]={1};
>>>
>>> Do these not also have static storage duration? Yet presumably these can
>>> be legally modified.
>>
>> That 1 which initializes B[0] cannot be modified.
>>
>
> Why not? I haven't requested that those are 'const'. ...

You don't get a choice in the matter. The C language doesn't permit
numeric literals of any kind to be modified by your code. They can't be,
and don't need to be, declared 'const'.

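To make the distinction concrete, here is a minimal sketch using the two arrays from the quoted question. The helper name store_into_arrays() is hypothetical and exists only for illustration. Storing into B[0] changes the array element, which is an ordinary object; the constant 1 that initialized it is not an object you can assign to at all.

```c
#include <assert.h>

static char A[100];         /* zero-initialized */
static char B[100] = {1};   /* B[0] receives a copy of the value 1 */

/* Hypothetical helper, not from the thread: store into both arrays and
   return the value of the constant 1 afterwards. */
int store_into_arrays(void)
{
    A[0] = 55;      /* modifies the object A[0] */
    B[0] = 89;      /* modifies the object B[0]; the constant 1 is untouched */
    /* 1 = 89; */   /* would not compile: 1 is not a modifiable lvalue */
    assert(B[0] == 89);
    return 1;       /* still evaluates to one */
}
```

Uncommenting the assignment to 1 should get you a diagnostic from any conforming compiler, which is the sense in which you "don't get a choice".
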
I've heard that in some other languages, if you call foo(3), and foo()
changes the value of its argument to 2, then subsequent calls to bar(3)
will pass a value of 2 to bar(). That sounds like such a ridiculous
mis-feature that I hesitate to identify which languages I had heard
accused of having that feature, but it is important to note that C is
not one of them.

Just as 1 is an integer literal whose value cannot be modified, "Hello,
world!" is a string literal whose contents cannot be safely modified.
The key difference is that, in many contexts, "Hello, world!" gets
automatically converted into a pointer to its first element, a feature
that makes it a lot easier to work with string literals - but also opens
up the possibility of attempting to write through that pointer. Doing so
has undefined behavior, which can include the consequences of storing
the contents of string literals in read-only memory.

That pointer's value should logically have had the type "const char*",
which would have made most attempts to write through that pointer
constraint violations, but the language didn't have 'const' at the time
that decision was made. In C++ the value is const-qualified. In C, the
best you can do is to make sure that whenever you define a pointer and
initialize it to point into a string literal, you declare that pointer
as "const char*".

> ... Further, gcc has no
> problem running this program:
>
>     static char A[100];
>     static char B[100]={1};
>
>     printf("%d %d %d\n", A[0], B[0], 1);
>     A[0]=55;
>     B[0]=89;
>     printf("%d %d %d\n", A[0], B[0], 1);

Of course, why should it? Neither A nor B is a string literal; they are
ordinary arrays that are merely initialized (A to zeros, B from the
initializer {1}). Since their definitions are not const-qualified,
there's no problem with such code.
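
A short sketch of the "const char*" advice, under the assumption that you want both a read-only view of a literal and a modifiable copy. The names greeting, buffer and lowercase_first() are hypothetical, chosen only for this example. An array initialized from a string literal holds its own copy of the characters, so writing to it is fine; a pointer into the literal itself is declared const so that an accidental write is diagnosed at compile time rather than left as undefined behavior.

```c
#include <assert.h>
#include <string.h>

/* Pointer into a string literal: const-qualified, so writes through it
   are rejected by the compiler. */
static const char *greeting = "Hello, world!";

/* Array initialized from a string literal: the array has its own copy
   of the characters, so modifying it is well defined. */
static char buffer[] = "Hello, world!";

/* Hypothetical helper, for illustration only. */
const char *lowercase_first(void)
{
    /* greeting[0] = 'h'; */  /* constraint violation: write through const char* */
    buffer[0] = 'h';          /* fine: modifies the array, not the literal */
    assert(strcmp(greeting, "Hello, world!") == 0);
    return buffer;
}
```

Had greeting been declared as plain "char *", the commented-out assignment would compile, and its behavior would be undefined.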