<20240801172148.200@kylheku.com>

Path: ...!npeer.as286.net!npeer-ng0.as286.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Kaz Kylheku <643-408-1753@kylheku.com>
Newsgroups: comp.lang.c
Subject: Re: relearning C: why does an in-place change to a char* segfault?
Date: Fri, 2 Aug 2024 00:37:44 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 64
Message-ID: <20240801172148.200@kylheku.com>
References: <IoGcndcJ1Zm83zb7nZ2dnZfqnPWdnZ2d@brightview.co.uk>
 <20240801114615.906@kylheku.com> <v8gs06$2ceis$1@dont-email.me>
Injection-Date: Fri, 02 Aug 2024 02:37:44 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="06869e1ea80aff81fac5e88c79ec8ec1";
	logging-data="2588217"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+rXjXwmnPTu1eMNoZ9h32JVPFGo9lGkoU="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:ysEG7YUUQL8zpzH9SevPaN+6prs=
Bytes: 3263

On 2024-08-01, Bart <bc@freeuk.com> wrote:
> On 01/08/2024 20:39, Kaz Kylheku wrote:
>> On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
>>> This program segfaults at the commented line:
>>>
>>> #include <ctype.h>
>>> #include <stdio.h>
>>>
>>> void uppercase_ascii(char *s) {
>>>      while (*s) {
>>>          *s = toupper(*s); // SEGFAULT
>>>          s++;
>>>      }
>>> }
>>>
>>> int main() {
>>>      char* text = "this is a test";
>> 
>> The "this is a test" object is a literal. It is part of the program's image.
>
> So is the text here:
>
>    char text[]="this is a test";
>
> But this can be changed without making the program self-modifying.

The array which is initialized by the literal is what can be
changed.

In this situation, the literal is just initializer syntax,
not required to be an object with an address.

But there could well be such an object in the program image,
especially if the array is automatic, and thus instantiated
many times. 

If the program goes hunting for that hidden object and modifies it,
the behavior is undefined (UB).

> I guess it depends on what is classed as the program's 'image'.
>
> I'd say the image in the state it is in just after loading or just 
> before execution starts (since certain fixups are needed). But some 
> sections will be writable during execution, some not.

Programs can self-modify in ways designed into the run time.
The toaster has certain internal receptacles that can take
certain forks, according to some rules, which do not affect
the user operating the toaster according to the manual.

> The dangers are small, but there must be reasons why a dedicated 
> section is normally used. gcc on Windows creates up to 19 sections, so 
> it would be odd for literal strings to share with code.

One reason is that PC-relative addressing can be used by code to
find its literals. Since that usually has a limited range, it helps
to keep the literals with the code. Combining sections also reduces
size. The addressing is also relocatable, which is useful in shared
libs.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca