Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: relearning C: why does an in-place change to a char* segfault?
Date: Fri, 2 Aug 2024 11:36:36 +0100
Organization: A noiseless patient Spider
Lines: 68
Message-ID: <v8icrj$2paum$1@dont-email.me>
References: <IoGcndcJ1Zm83zb7nZ2dnZfqnPWdnZ2d@brightview.co.uk> <20240801114615.906@kylheku.com> <v8gs06$2ceis$1@dont-email.me> <20240801172148.200@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 02 Aug 2024 12:36:36 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="685cfd46d568320119a7f0cfbd8429d5"; logging-data="2927574"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX198odbOVD0HyjqbInxeBRpn"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:N2R8yqFqvGMzsvU5O9qUiQ6x7IM=
In-Reply-To: <20240801172148.200@kylheku.com>
Content-Language: en-GB
Bytes: 3700

On 02/08/2024 01:37, Kaz Kylheku wrote:
> On 2024-08-01, Bart <bc@freeuk.com> wrote:
>> On 01/08/2024 20:39, Kaz Kylheku wrote:
>>> On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
>>>> This program segfaults at the commented line:
>>>>
>>>> #include <ctype.h>
>>>> #include <stdio.h>
>>>>
>>>> void uppercase_ascii(char *s) {
>>>>     while (*s) {
>>>>         *s = toupper(*s); // SEGFAULT
>>>>         s++;
>>>>     }
>>>> }
>>>>
>>>> int main() {
>>>>     char* text = "this is a test";
>>>
>>> The "this is a test" object is a literal. It is part of the program's image.
>>
>> So is the text here:
>>
>>     char text[]="this is a test";
>>
>> But this can be changed without making the program self-modifying.
>
> The array which is initialized by the literal is what can be
> changed.
>
> In this situation, the literal is just initializer syntax,
> not required to be an object with an address.

I don't spot the 'int main() {' part of your example; my version of it
was meant to be static. (My A, B examples explicitly used 'static'.)

>> I guess it depends on what is classed as the program's 'image'.
>>
>> I'd say the image in the state it is in just after loading or just
>> before execution starts (since certain fixups are needed). But some
>> sections will be writable during execution, some not.
>
> Programs can self-modify in ways designed into the run time.
> The toaster has certain internal receptacles that can take
> certain forks, according to some rules, which do not affect
> the user operating the toaster according to the manual.
>
>> The dangers are small, but there must be reasons why a dedicated
>> section is normally used. gcc on Windows creates up to 19 sections, so
>> it would be odd for literal strings to share with code.
>
> One reason is that PC-relative addressing can be used by code to
> find its literals. Since that usually has a limited range, it helps
> to keep the literals with the code. Combining sections also reduces
> size. The addressing is also relocatable, which is useful in shared
> libs.

You must be talking about ARM then, with its limited address
displacement (I think 12 bits, or +/- 4KB).

On x64, PC-relative uses a 32-bit offset so the range is +/- 2GB; enough
to have string literals located in their own read-only section of memory.

I'm sure you can do that on ARM too; I can think of several ways (and
there are loads more registers to play with and keep as bases to tables
of such data). But I don't know what real code does.