Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Stephen Fuld" <SFuld@alumni.cmu.edu.invalid>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Wed, 5 Jun 2024 13:34:25 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 85
Message-ID: <v3pph1$1010d$1@dont-email.me>
References: <v0s17o$2okf4$2@dont-email.me> <v327n3$1use$1@gal.iecc.com> <BM25O.40665$HBac.4762@fx15.iad> <v32lpv$1u25$1@gal.iecc.com> <v33bqg$9cst$11@dont-email.me> <v34v62$ln01$1@dont-email.me> <v36bva$10k3v$2@dont-email.me> <2024May29.090435@mips.complang.tuwien.ac.at> <v38opv$1gsj2$3@dont-email.me> <v38rkd$1ha8a$1@dont-email.me> <jwvttifrysb.fsf-monnier+comp.arch@gnu.org> <f90b6e03c727b0f209d64484ec097298@www.novabbs.org> <v3jtd8$3qduu$2@dont-email.me> <20240603132227.00004e0f@yahoo.com> <k6k7O.8602$7jpd.5620@fx47.iad> <v3klhp$3ugeh$1@dont-email.me> <v3mljt$c63k$1@dont-email.me> <v3nf7j$gesm$1@dont-email.me> <v3padn$the9$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 05 Jun 2024 15:34:25 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="61d35cd0bc070ff15be1578b8eb57b77";
	logging-data="1049613"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/MlKP0t5QFTGBEXDMz3wPTsMyJIc3GkVI="
User-Agent: XanaNews/1.21-f3fb89f (x86; Portable ISpell)
Cancel-Lock: sha1:bkF2Byhnx3TF1ApK+iqQaQhOtEk=
Bytes: 5154

Terje Mathisen wrote:

> Stephen Fuld wrote:
> > Terje Mathisen wrote:
> > 
> > > Stephen Fuld wrote:
> > > > Scott Lurndal wrote:
> > > > 
> > > > > Michael S <already5chosen@yahoo.com> writes:
> > > > > > On Mon, 3 Jun 2024 08:03:53 -0000 (UTC)
> > > > > > Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
> > > > > > 
> > > > > > > On Thu, 30 May 2024 18:31:46 +0000, MitchAlsup1 wrote:
> > > > > > > 
> > > > > > > > 30 years ago you could say the same thing about
> > > > > > > > encryption.
> > > > > > > 
> > > > > > > I don’t think newer CPUs have been optimized for
> > > > > > > encryption. Instead, we see newer encryption algorithms
> > > > > > > (or ways of using them) that work better on current CPUs.
> > > > > > 
> > > > > > I think moderate efficiency on CPU, not too low, but not
> > > > > > high either, is a requirement for (symmetric-key) cipher.
> > > > > > Esp. when the key is 128-bit or shorter.
> > > > > 
> > > > > Most modern CPUs have instruction set support for symmetric
> > > > > ciphers such as AES, SM2/SM3 as well as message digest/hash
> > > > > (SHA1, SHA256 et al).
> > > > > 
> > > > > High throughput encryption has been done by hardware
> > > > > accelerators for decades now (e.g. bbn or ncypher HSM boxes
> > > > > sitting on a SCSI bus; now such HSM are an integral part of
> > > > > many SoC).
> > > > 
> > > > 
> > > > Question.  For a modern general purpose CPU, if you are
> > > > including all the logic to implement encryption instructions,
> > > > is it much more to include the control/sequencing logic to do
> > > > it and not tie up the rest of the CPU logic to do the
> > > > encryption?  Furthermore, an "inbuilt" accelerator could
> > > > interface directly with the I/O hardware of the CPU (e.g. PCI),
> > > > saving the "intermediate" step of writing the encrypted data to
> > > > memory.
> > > 
> > > That logic already exists, in the form of a single thread/core
> > > dedicated to the job.
> > > 
> > > With 30-100 cores on a single die, it becomes very cheap to
> > > dedicate one of them to babysit such a process, compared to the
> > > cost of making a custom chunk of VLSI to do the same. This is
> > > particularly true because the logic needed in the babysitting
> > > process is mostly straight line, with a very limited number of
> > > hard-to-predict branches.
> > > 
> > > I.e. h.264 CABAC decoding has three branches per bit decoded, at
> > > least one of them impossible to predict or work around with clever
> > > coding. Here it makes perfect sense to have a chunk of hw to
> > > handle the heavy lifting. Monitoring block encryption/decryption
> > > not so much.
> > 
> > 
> > I may be missing something, but while your proposal addresses the
> > first part of my proposal, I think it doesn't address the second.
> > That is, for data coming from/going to some external source, you
> > are still doing "unnecessary" memory traffic, which takes memory
> > bandwidth and increases latency.
> 
> Usually, when a CPU needs to work on something, it will need to get
> the data into $L1 anyway? It is only when the work is simply to be a
> pipeline that having a way to bypass the CPU completely really makes
> a difference, right?

Right.  But my point is that the CPU never really needs to "work" on the
encrypted data.  It is frequently only sent to, or received from, the
network or a storage device, hence the pipelined approach has
advantages.
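
To make the memory-traffic argument concrete, here is a rough sketch in
Python.  It is a deliberately simple model, not a measurement: the
function names are hypothetical, and it assumes the outbound path
(encrypt, then send), no cache absorbing any of the traffic, and a
device that fetches its data by DMA.  It only counts how many times each
payload byte crosses the memory bus in the two designs.

```python
# Hypothetical model: count memory-bus crossings per payload byte on the
# outbound (encrypt-then-send) path. Assumes no cache hits and DMA-based I/O.

def memory_passes(inline_accelerator: bool) -> int:
    """Return how many times each byte crosses the memory bus."""
    passes = 1  # the plaintext is read from memory in both designs
    if not inline_accelerator:
        passes += 1  # a core writes the ciphertext back to a memory buffer
        passes += 1  # the I/O device then reads that buffer via DMA
    return passes

def bus_traffic_bytes(payload_bytes: int, inline_accelerator: bool) -> int:
    """Total bytes moved over the memory bus for one payload."""
    return payload_bytes * memory_passes(inline_accelerator)

if __name__ == "__main__":
    payload = 1 << 20  # 1 MiB, an arbitrary example size
    print("core-based encryption:", bus_traffic_bytes(payload, False))
    print("inline accelerator:   ", bus_traffic_bytes(payload, True))
```

Under these assumptions the core-based design moves each byte three
times and the inline design once.  Real systems will differ (for
example, when a last-level cache can feed the device directly), so treat
the ratio as an illustration of the argument rather than a prediction.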



-- 
 - Stephen Fuld 
(e-mail address disguised to prevent spam)