Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Sun, 24 Mar 2024 18:58:24 +0000
Organization: A noiseless patient Spider
Lines: 98
Message-ID: <utpt4f$ha61$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
 <20240320114218.151@kylheku.com>
 <20240321211306.779b21d126e122556c34a346@gmail.moc>
 <utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
 <utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
 <utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
 <20240324185353.00002395@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 24 Mar 2024 18:58:23 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="08d1ddc3ecb1dc9877ac945c1bf2bccd";
	logging-data="567489"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18Q+OABYSkqwIot7laSdlvV"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tZsHFTDJJVqO7qI7P+kcTIqw9gs=
In-Reply-To: <20240324185353.00002395@yahoo.com>
Content-Language: en-GB
Bytes: 4769

On 24/03/2024 15:53, Michael S wrote:
> On Sat, 23 Mar 2024 21:21:58 +0000
> bart <bc@freeuk.com> wrote:
> 
>> On 23/03/2024 16:51, David Brown wrote:
>>> On 23/03/2024 12:26, bart wrote:
>>>> On 23/03/2024 07:26, James Kuyper wrote:
>>>>> bart <bc@freeuk.com> writes:
>>>>>> On 22/03/2024 17:14, James Kuyper wrote:
>>>>> [...]
>>>>>>> If you want to tell a system not only what a program must do,
>>>>>>> but also how it must do it, you need to use a lower-level
>>>>>>> language than C.
>>>>>>
>>>>>> Which one?
>>>>>
>>>>> That's up to you. The point is, C is NOT that language.
>>>>
>>>> I'm asking which /mainstream/ HLL is lower level than C. So
>>>> specifically ruling out assembly.
>>>>
>>>> If there is no such choice, then this is the problem: it has to be
>>>> C or nothing.
>>>
>>> How much of a problem is it, really?
>>>
>>> My field is probably the place where low level programming is most
>>> ubiquitous.  There are plenty of people who use assembly - for good
>>> reasons or for bad (or for reasons that some people think are good,
>>> other people think are bad).  C is the most common choice.
>>>
>>> Other languages used for small systems embedded programming include
>>> C++, Ada, Forth, BASIC, Pascal, Lua, and Micropython.  Forth is the
>>> only one that could be argued as lower-level or more "directly
>>> translated" than C.
>>
>> Well, Forth is certainly cruder than C (it's barely a language IMO).
>> But I don't remember seeing anything in it resembling a type system
>> that corresponds to the 'i8-i64 u8-u64 f32-f64' types typical in
>> current hardware. (Imagine trying to create a precisely laid out
>> struct.)
>>
>> It is just too weird. I think I'd rather take my chances with C.
>>
>>   > BASIC, ..., Lua, and Micropython.
>>
>> Hmm, I think my own scripting language is better at low level than
>> any of these. It supports those low-level types for a start. And I
>> can do stuff like this:
>>
>>      println peek(0x40'0000, u16):"m"
>>
>>      fun peek(addr, t=byte) = makeref(addr, t)^
>>
>> This displays 'MZ', the signature of the (low-)loaded EXE image on
>> Windows.
>>
>> Possibly it is even better than C; is this little program valid (no
>> UB) C, even when it is known that the program is low-loaded:
>>
>>      #include <stdio.h>
>>      typedef unsigned char byte;
>>
>>      int main(void) {
>>          printf("%c%c\n", *(byte*)0x400000, *(byte*)0x400001);
>>      }
>>
>> This works on DMC, tcc, mcc, lccwin, but not gcc because that loads
>> programs at high addresses. The problem being that the address
>> involved, while belonging to the program, is outside of any C data
>> objects.
>>
>>
> 
> #include <stdio.h>
> #include <stddef.h>
> 
> int main(void)
> {
>    char* p0 = (char*)((size_t)main & -(size_t)0x10000);
>    printf("%c%c\n", p0[0], p0[1]);
>    return 0;
> }
> 
> 
> That would work for small programs. Not necessarily for bigger
> programs.
> 

I'm not sure how that works. Are EXE images always loaded at a multiple
of 64KB? I suppose on larger programs it could search backwards 64KB at
a time (although it could also hit a rogue 'MZ' in program data).
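
A minimal sketch of the mask-and-search idea follows. It runs over an ordinary buffer instead of the live process image, so it does not touch memory outside any C object. The names align_down and find_mz_backwards are hypothetical helpers, not anything from the thread. In unsigned arithmetic `addr & -(size_t)0x10000` and `addr & ~(0x10000 - 1)` are the same mask, which is what align_down computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr down to a multiple of align (align must be a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Hypothetical helper: starting from offset `start` in `buf`, step
   backwards `step` bytes at a time over aligned offsets and return the
   first offset whose two bytes are 'M','Z'. Returns (size_t)-1 if no
   such offset exists. Only the two signature bytes are checked, so a
   stray "MZ" in the data would be accepted as a match. */
size_t find_mz_backwards(const unsigned char *buf, size_t len,
                         size_t start, size_t step)
{
    if (len < 2 || start >= len)
        return (size_t)-1;

    size_t off = (size_t)align_down(start, step);
    for (;;) {
        if (off + 1 < len && buf[off] == 'M' && buf[off + 1] == 'Z')
            return off;
        if (off < step)          /* cannot step back any further */
            return (size_t)-1;
        off -= step;
    }
}
```

A real image search would presumably want to validate more of the header than the two signature bytes, to reduce the chance of a false match.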

My point, however, was whether C considers that p0[0] access UB because
it doesn't point into any C data object.

If so, it would make access to memory-mapped devices or frame-buffers, 
or implementing things like garbage collectors, problematical.
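
For illustration, this is roughly how such accesses are usually spelled in C. The helper names peek8 and poke8 are hypothetical. The standard leaves the result of converting an integer to a pointer implementation-defined, so whether the access means anything depends on the implementation and on what the platform maps at that address. The sketch shows the idiom and does not settle the UB question.

```c
#include <stdint.h>

/* Read one byte through an address held as an integer. The
   integer-to-pointer conversion is implementation-defined; volatile
   asks the compiler not to elide or merge the access. */
unsigned char peek8(uintptr_t addr)
{
    return *(volatile unsigned char *)addr;
}

/* Write one byte through an address held as an integer, under the
   same assumptions as peek8. */
void poke8(uintptr_t addr, unsigned char value)
{
    *(volatile unsigned char *)addr = value;
}
```

Called with an address that came from a real object, for example (uintptr_t)(void *)&some_byte, both helpers behave like ordinary accesses to that object. Called with a literal such as 0x400000, the outcome is whatever the implementation and the OS make of it.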