Path: ...!2.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Terje Mathisen <terje.mathisen@tmsw.no>
Newsgroups: comp.arch
Subject: Re: 80286 protected mode
Date: Sun, 13 Oct 2024 21:21:11 +0200
Organization: A noiseless patient Spider
Lines: 88
Message-ID: <veh6j8$q71j$1@dont-email.me>
References: <2024Oct6.150415@mips.complang.tuwien.ac.at>
 <memo.20241006163428.19028W@jgd.cix.co.uk>
 <2024Oct7.093314@mips.complang.tuwien.ac.at>
 <7c8e5c75ce0f1e7c95ec3ae4bdbc9249@www.novabbs.org>
 <2024Oct8.092821@mips.complang.tuwien.ac.at> <ve5ek3$2jamt$1@dont-email.me>
 <ve6gv4$2o2cj$1@dont-email.me> <ve6olo$2pag3$2@dont-email.me>
 <73e776d6becb377b484c5dcc72b526dc@www.novabbs.org>
 <ve7sco$31tgt$1@dont-email.me>
 <2b31e1343b1f3fadd55ad6b87d879b78@www.novabbs.org>
 <ve99fg$38kta$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Injection-Date: Sun, 13 Oct 2024 21:21:12 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="223c91382642b3c5ff23befc6f6b8bea";
	logging-data="859187"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/j94mCAstKiGoL393f5S3EsH96pL3xkQMXVwDWpdtXkQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
 Firefox/91.0 SeaMonkey/2.53.19
Cancel-Lock: sha1:K11lpqMbOSiJCFwNgyq6At5x6k4=
In-Reply-To: <ve99fg$38kta$1@dont-email.me>
Bytes: 4943

David Brown wrote:
> On 10/10/2024 20:38, MitchAlsup1 wrote:
>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>
>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>
>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>
>>>>>>> When would you ever /need/ to compare pointers to different objects?
>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>
>>>>>> Sometimes, it is handy to encode certain conditions in pointers,
>>>>>> rather than having only a valid pointer or NULL.  A compiler,
>>>>>> for example, might want to store the fact that an error occurred
>>>>>> while parsing a subexpression as a special pointer constant.
>>>>>>
>>>>>> Compilers often have the unfair advantage, though, that they can
>>>>>> rely on what application programmers cannot, their implementation
>>>>>> details.  (Some do not, such as f2c).
>>>>>
>>>>> Standard library authors have the same superpowers, so that they can
>>>>> implement an efficient memmove() even though a pure standard C
>>>>> programmer cannot (other than by simply calling the standard library
>>>>> memmove() function!).
>>>>
>>>> This is more a symptom of bad ISA design/evolution than of libc
>>>> writers needing superpowers.
>>>
>>> No, it is not.  It has absolutely /nothing/ to do with the ISA.
>>
>> For example, if ISA contains an MM instruction which is the
>> embodiment of memmove() then absolutely no heroics are needed
>> or desired in the libc call.
>>
>
> The existence of a dedicated assembly instruction does not let you write
> an efficient memmove() in standard C.  That's why I said there was no
> connection between the two concepts.
>
> For some targets, it can be helpful to write memmove() in assembly or
> using inline assembly, rather than in non-portable C (which is the
> common case).
>
>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>> memmove() every time wider SIMD registers are available.
>
> It is not that simple.
>
> There can often be trade-offs between the speed of memmove() and
> memcpy() on large transfers, and the overhead in setting things up that
> is proportionally more costly for small transfers.  Often that can be
> eliminated when the compiler optimises the functions inline - when the
> compiler knows the size of the move/copy, it can optimise directly.

What you are missing here, David, is that Mitch's MM is a single
instruction which does the entire memmove() operation, and has the
inside knowledge about cache (residency at level x? width in
bytes), memory ranges, access rights, etc. needed to do so in a very
close to optimal manner, for both short and long transfers.

I.e. totally removing the need for compiler tricks or wide register
operations.
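
For comparison, here is a minimal sketch of the direction decision a
software memmove() has to make before it can even start copying. The
name sketch_memmove is hypothetical, and this is not how any particular
libc does it. It assumes a flat address space and an implementation that
provides uintptr_t; comparing the two pointers directly with < would be
undefined in standard C when they point into different objects, which is
the portability problem discussed in the quoted text. A real library
version would add size-based dispatch and wide loads/stores on top.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only, not a libc implementation. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Assumption: converting to uintptr_t yields comparable flat
       addresses.  That is implementation-defined, not guaranteed by
       the C standard.  The unsigned subtraction wraps to a huge value
       when d is below s, so one test covers both "dst below src" and
       "dst at least n bytes above src". */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        /* A forward copy cannot overwrite unread source bytes. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst starts inside [src, src + n): copy from the end. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

With a 10-byte buffer holding "abcdefghij", sketch_memmove(buf + 2, buf, 5)
takes the backward path and leaves "ababcdehij" in the buffer.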

Also apropos the compiler library issue:

You start by teaching the compiler about the MM instruction, and to
recognize common patterns (just as most compilers already do today), and
then the memmove() calls will usually be inlined.
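
As a rough illustration of the kind of source patterns meant here, the
sketch below shows two shapes that optimising compilers commonly
recognise today: a plain byte-copy loop, and a memmove() call whose size
is a compile-time constant. The function names are hypothetical. What a
given compiler emits depends on its version and flags, and lowering
either one to a single MM instruction is an assumption about a compiler
that has been taught that instruction, not something shown here.

```c
#include <stddef.h>
#include <string.h>

/* A simple copy loop.  The restrict qualifiers promise that the two
   ranges do not overlap, so a compiler may treat the loop as a
   memcpy()-style block copy instead of copying byte by byte. */
void copy_bytes(unsigned char *restrict dst,
                const unsigned char *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Overlapping move with a small size known at compile time.  Compilers
   typically expand such a call inline rather than calling into libc.
   The caller must supply at least 20 bytes. */
void shift_header(unsigned char *buf)
{
    memmove(buf + 4, buf, 16);
}
```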

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"