Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: relearning C: why does an in-place change to a char* segfault?
Date: Thu, 1 Aug 2024 21:42:48 +0100
Organization: A noiseless patient Spider
Lines: 57
Message-ID: <v8gs06$2ceis$1@dont-email.me>
References: <IoGcndcJ1Zm83zb7nZ2dnZfqnPWdnZ2d@brightview.co.uk>
 <20240801114615.906@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 01 Aug 2024 22:42:47 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1e1bbde09af84609d5b2b26e972c5153";
	logging-data="2505308"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18NYJT6FM3XXzXCC96WUQKI"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:PRxJGezDEi3iEvXs9SuHWa2oZUY=
Content-Language: en-GB
In-Reply-To: <20240801114615.906@kylheku.com>
Bytes: 3111

On 01/08/2024 20:39, Kaz Kylheku wrote:
> On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
>> This program segfaults at the commented line:
>>
>> #include <ctype.h>
>> #include <stdio.h>
>>
>> void uppercase_ascii(char *s) {
>>      while (*s) {
>>          *s = toupper(*s); // SEGFAULT
>>          s++;
>>      }
>> }
>>
>> int main() {
>>      char* text = "this is a test";
> 
> The "this is a test" object is a literal. It is part of the program's image.

So is the text here:

   char text[]="this is a test";

But this can be changed without making the program self-modifying.

I guess it depends on what is classed as the program's 'image'.

I'd say it's the image in the state it is in just after loading, or just 
before execution starts (since certain fixups are needed). But some 
sections will be writable during execution and some not.

> When you try to change it, you're making your program self-modifying.

>> Program received signal SIGSEGV, Segmentation fault.
>> 0x000055555555516e in uppercase_ascii (s=0x555555556004 "this is a test")
>> at inplace.c:6
>> 6	        *s = toupper(*s);
> 
> On Linux, the string literals of a C executable are located together
> with the program text. They are interspersed among the machine
> instructions which reference them. The program text is mapped
> read-only, so an attempted modification is an access violation trapped
> by the OS, turned into a SIGSEGV signal.

Does it really do that? That's the method I've used for read-only 
strings, to put them into the code-segment (since I neglected to support 
a dedicated read-only data section, and it's too much work now).

But I don't like it since the code section is also executable; you could 
inadvertently execute code within a string (which might happen to 
contain machine code for other purposes).

The dangers are small, but there must be reasons why a dedicated 
section is normally used. gcc on Windows creates up to 19 sections, so 
it would be odd for literal strings to share with code.