Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Misc: Ongoing status...
Date: Sun, 2 Feb 2025 01:22:47 +0000
Organization: Rocksolid Light
Message-ID: <97ead3323da5a89d174910edebcbb815@www.novabbs.org>
References: <vnglop$33lk0$1@dont-email.me> <cda6055929f89df81fb056509038afed@www.novabbs.org> <vnhrrj$3d7i0$1@dont-email.me> <c7cd7a9e4a8c14dab63f0b9394af4677@www.novabbs.org> <vnjv02$3p9g3$1@dont-email.me> <605af5c97a4635c48fe002dbc78a5686@www.novabbs.org> <vnm813$agrn$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2261456"; mail-complaints-to="usenet@i2pn2.org";
posting-account="o5SwNDfMfYu6Mv4wwLiW6e/jbA93UAdzFodw5PEa6eU";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$oYNbzSQTLaFfdwoD5HDMp.kKBhmXTLHZz9h2fmSB1nJEFzQ43rPQu
X-Rslight-Posting-User: cb29269328a20fe5719ed6a1c397e21f651bda71
Bytes: 2527
Lines: 33
On Sat, 1 Feb 2025 22:42:39 +0000, BGB wrote:
> On 1/31/2025 10:05 PM, MitchAlsup1 wrote:
--------------------------------
> Whereas, if performance is dominated by a piece of code that looks like,
> say:
> v0=dytf_int2fixnum(123);
> v1=dytf_int2fixnum(456);
> v2=dytf_mul(v0, v1);
> v3=dytf_int2fixnum(789);
> v4=dytf_add(v2, v3);
> v5=dytf_wrapsymbol("x");
> dytf_storeindex(obj, v5, v4);
> ...
> With, say, N levels of call-graph in each called function, but with this
> sort of code still managing to dominate the total CPU ("Self%" time).
>
> This seems to be a situation where callee-save registers are a big win
> for performance IME.
With callee-save registers, the prologues and epilogues of subroutines
see all the save/restore memory traffic, sometimes saving a register
that is not "in use" and restoring it later.
With caller-save registers, the caller saves exactly the registers
it needs preserved, while the callee saves/restores none. Moreover,
it only saves registers currently "in use" and may defer restoring
one when it does not need that value in that register for a while.
So instruction path length comes out better with caller save than
with callee save: nothing that is not live is ever saved or
restored.
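
To put rough numbers on the path-length argument, here is a minimal
sketch in C. Everything in it is hypothetical: the bitmask model of a
register file, the names regmask_t, reg_bit, callee_save_mem_ops and
caller_save_mem_ops, and the assumption that each save or restore
costs exactly one memory operation. It models no real ABI and ignores
the deferred-restore case.

```c
#include <assert.h>

/* Hypothetical model: one bit per register, 32 registers. Not a real ABI. */
typedef unsigned int regmask_t;

/* Mask with only register r set. */
regmask_t reg_bit(int r)
{
    assert(r >= 0 && r < 32);
    return 1u << r;
}

/* Number of registers in a set (count of set bits). */
static int reg_count(regmask_t m)
{
    int n = 0;
    while (m) {
        m &= m - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Callee save: the prologue stores and the epilogue reloads every
 * preserved register the callee writes, whether or not the caller
 * currently holds a live value in it. */
int callee_save_mem_ops(regmask_t preserved, regmask_t callee_writes)
{
    return 2 * reg_count(preserved & callee_writes);
}

/* Caller save: the caller stores only the values that are live
 * across the call and reloads them afterwards. */
int caller_save_mem_ops(regmask_t live_across_call)
{
    return 2 * reg_count(live_across_call);
}
```

With four preserved registers written by the callee and one value live
across the call, callee_save_mem_ops(0xF0000u, 0xF0000u) gives 8 memory
operations against 2 for caller_save_mem_ops(reg_bit(16)). The counts
are for a single call and depend entirely on the masks chosen.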
The arguments for callee save have to do with I-cache footprint.
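
One common reading of the I-cache argument is that callee-save code is
emitted once per function, in its prologue and epilogue, while
caller-save code is repeated around every call site. A rough static
instruction count under that assumption is sketched below. The
function names are hypothetical, one instruction per save or restore
is assumed, and every call site is assumed to have the same number of
live registers.

```c
#include <assert.h>

/* Callee save: one store and one load per preserved register used,
 * emitted once in the function's prologue and epilogue. */
int callee_save_static_insns(int preserved_regs_used)
{
    assert(preserved_regs_used >= 0);
    return 2 * preserved_regs_used;
}

/* Caller save: one store and one load per live register, emitted
 * again at every call site (uniform liveness assumed). */
int caller_save_static_insns(int call_sites, int live_per_site)
{
    assert(call_sites >= 0 && live_per_site >= 0);
    return 2 * call_sites * live_per_site;
}
```

For a function with seven call sites, like the quoted snippet, and one
live register at each, caller_save_static_insns(7, 1) is 14
instructions, while callee_save_static_insns(4) is 8 no matter how
many calls the function makes.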