From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Thu, 12 Sep 2024 21:52:32 +0000
Organization: Rocksolid Light
Message-ID: <8822f5fba34a5dba136e86594fa961a0@www.novabbs.org>
References: <vaqgtl$3526$1@dont-email.me> <memo.20240830090549.19028u@jgd.cix.co.uk> <2024Aug30.161204@mips.complang.tuwien.ac.at> <86r09ulqyp.fsf@linuxsc.com> <2024Sep8.173639@mips.complang.tuwien.ac.at> <p1cvdjpqjg65e6e3rtt4ua6hgm79cdfm2n@4ax.com> <2024Sep10.101932@mips.complang.tuwien.ac.at> <ygn8qvztf16.fsf@y.z> <2024Sep11.123824@mips.complang.tuwien.ac.at> <vbsoro$3ol1a$1@dont-email.me> <vbut86$9toi$1@dont-email.me> <vbvljl$ea0m$1@dont-email.me>

On Thu, 12 Sep 2024 21:14:18 +0000, BGB wrote:

>
> This is because in some cases, the performance overhead of copying the
> last (sz&31) bytes is significant, say:
>    rsz=cte-ct;
>    if(rsz)
>    {
>      if(rsz&16)
>      {
>        v0=((u64 *)cs)[0]; v1=((u64 *)cs)[1];
>        ((u64 *)ct)[0]=v0; ((u64 *)ct)[1]=v1;
>        cs+=16; ct+=16;
>      }
>      if(rsz&8)
>      {
>        v0=((u64 *)cs)[0];
>        ((u64 *)ct)[0]=v0;
>        cs+=8; ct+=8;
>      }
>      if(rsz&4)
>      {
>        v0=((u32 *)cs)[0];
>        ((u32 *)ct)[0]=v0;
>        cs+=4; ct+=4;
>      }
>      if(rsz&2)
>      {
>        v0=((u16 *)cs)[0];
>        ((u16 *)ct)[0]=v0;
>        cs+=2; ct+=2;
>      }
>      if(rsz&1)
>      {
>        v0=((byte *)cs)[0];
>        ((byte *)ct)[0]=v0;
>        cs++; ct++;
>      }
>    }
>
> For small copies with awkward sizes, this tailing part can cost more
> than the whole rest of the copy.

A fine rendition of why this should be in HW as an instruction.
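
For anyone who wants to try the quoted tail copy in isolation, here is a minimal self-contained sketch of the same bit-test ladder. The function name copy_tail, its signature, and the use of fixed-size memcpy calls in place of the pointer casts are my assumptions, not BGB's code; the memcpy form is chosen only so the sketch does not depend on unaligned or type-punned accesses being allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the quoted post): copy the final n bytes,
   n < 32, from cs to ct by testing each bit of n in turn. */
void copy_tail(unsigned char *ct, const unsigned char *cs, size_t n)
{
    uint64_t v0, v1;
    uint32_t w;
    uint16_t h;

    assert(n < 32);             /* the main loop has already moved whole 32-byte blocks */

    if (n & 16) {               /* 16-byte chunk as two 64-bit moves */
        memcpy(&v0, cs, 8);  memcpy(&v1, cs + 8, 8);
        memcpy(ct, &v0, 8);  memcpy(ct + 8, &v1, 8);
        cs += 16; ct += 16;
    }
    if (n & 8) {                /* 8-byte chunk */
        memcpy(&v0, cs, 8);  memcpy(ct, &v0, 8);
        cs += 8; ct += 8;
    }
    if (n & 4) {                /* 4-byte chunk */
        memcpy(&w, cs, 4);   memcpy(ct, &w, 4);
        cs += 4; ct += 4;
    }
    if (n & 2) {                /* 2-byte chunk */
        memcpy(&h, cs, 2);   memcpy(ct, &h, 2);
        cs += 2; ct += 2;
    }
    if (n & 1) {                /* last odd byte */
        *ct = *cs;
    }
}
```

Calling copy_tail(dst, src, sz & 31) after the 32-byte block loop would take up to five data-dependent branches for a 31-byte tail, which is the overhead being complained about.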