Path: ...!feeds.phibee-telecom.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Stephen Fuld" <SFuld@alumni.cmu.edu.invalid>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Mon, 3 Jun 2024 18:57:24 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 118
Message-ID: <v3l3mk$sns$1@dont-email.me>
References: <v0s17o$2okf4$2@dont-email.me> <v34v62$ln01$1@dont-email.me> <v36bva$10k3v$2@dont-email.me> <2024May29.090435@mips.complang.tuwien.ac.at> <v38opv$1gsj2$3@dont-email.me> <v38rkd$1ha8a$1@dont-email.me> <jwvttifrysb.fsf-monnier+comp.arch@gnu.org> <f90b6e03c727b0f209d64484ec097298@www.novabbs.org> <v3jtd8$3qduu$2@dont-email.me> <20240603132227.00004e0f@yahoo.com> <k6k7O.8602$7jpd.5620@fx47.iad> <v3klhp$3ugeh$1@dont-email.me> <wnl7O.10195$Inzb.2858@fx13.iad> <v3ktnl$3vv86$1@dont-email.me> <K2n7O.29169$61Y8.18080@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 03 Jun 2024 20:57:25 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="9d4b0d0d2bbc2f77954ebbfb6ef5a061";
	logging-data="29436"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/a99K3i0mCzdFI8zS5BsvyOWHWQCu/V8k="
User-Agent: XanaNews/1.21-f3fb89f (x86; Portable ISpell)
Cancel-Lock: sha1:jPsUfLqQxKnIOqO3jqtQL1etig4=
Bytes: 5954

Scott Lurndal wrote:

> "Stephen Fuld" <SFuld@alumni.cmu.edu.invalid> writes:
> > Scott Lurndal wrote:
> > 
> 
> >> > 
> >> > Question.  For a modern general purpose CPU, if you are including
> >> > all the logic to implement encryption instructions, is it much
> >> > more to include the control/sequencing logic to do it and not
> >> > tie up the rest of the CPU logic to do the encryption?
> >> > Furthermore, an "inbuilt" accelerator could interface directly
> >> > with the I/O hardware of the CPU (e.g. PCI), saving the
> >> > "intermediate" step of writing the encrypted data to memory.
> >> 
> >> There are always tradeoffs.  The issues surrounding the
> >> control/sequencing logic outside of the instruction flow
> >> require some level of asynchronicity, so to avoid bottlenecks
> >> one might need to replicate the "inbuilt accelerator" if
> >> more than one core will be using encryption (e.g. for RSS
> >> with IPSEC flows).
> > 
> > 
> > Yes, but putting the instructions into the core means you are
> > replicating the logic for every core. 
> 
> In the scale of a modern CPU, it's a small fraction of the logic.
> 
> The ARM neoverse cores, for example, require very little area.

Agreed.  I was assuming that the cost of the logic was about the same
whether it was done as CPU instructions or a chunk of accelerator logic
in the I/O stream.  If that is true, then the cost of having multiples
of them in the I/O stream is small.



> 
> >> From the operating software standpoint, it becomes most
> >> convenient, then, to model the offload as a device which
> >> requires OS support (and intervention for e.g. interrupt
> >> handling).
> > 
> > 
> > I look at it differently (and perhaps incorrectly).  I view
> > encryption as one of several "transformations" that data goes
> > through in its path to/from some external device.
> 
> That's certainly a valid view, if perhaps not complete.   There are
> use cases for in-place encryption.

Good.  Can you give some examples, and perhaps an estimate of what
percentage of the total encryption operations are in place?  Note that
it may be possible to add a feature to the "in-stream" hardware to
allow in-place encryption - i.e. both sides go to/come from memory.



> Adding encryption (which of the dozen standard symmetric and
> asymmetric cipher algorithms?) to a hardware device does increase
> complexity, and thus cost, at the expense of extensibility (new
> algorithms come along periodically).

Agreed.  But this is also true for new CPU instructions.


>  The cost of verifying crypto is
> a bit higher as it is very important to get correct when baking into
> gates.


Sure.  And I expect it is also higher because of the extra security
precautions against side-channel attacks, etc.



> >  For example, if the external device is a
> > disk, the data from memory may be gathered from multiple locations,
> > is serialized, perhaps encoded (e.g. 8b10b), has (perhaps several
> > levels of) ECC added, etc.  Viewing it like that makes encryption
> > one of many steps along the I/O pipeline.  Under that view,
> > encryption is an option, probably controlled by some bits in the
> > I/O mechanism, not a separate device requiring interrupt support
> > etc.
> 
> In the Cavium crypto-enabled DPUs, the crypto block is inserted
> into the data-path where necessary, when necessary; and to the extent
> that a streaming protocol/alg is used, will encrypt/decrypt as the
> data is passing from the ingress point to the egress point (which
> could be another external port, or an on-board CPU).  It can also be
> used as a stand-alone crypto accelerator by the on-board CPUs.


Good to know.  Proof of concept for my suggestion.  :-)  Can you talk
about advantages/disadvantages of that mechanism versus other
implementations?
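
To make the "encryption as one optional stage in the I/O pipeline" view
concrete, here is a minimal software sketch.  Everything in it is an
assumption made for illustration: the Xform flag names, the outbound()
function, the XOR stand-in for a real streaming cipher (it is NOT
secure), and CRC-32 standing in for ECC (it only detects errors, it
does not correct them).  It models the idea of control bits selecting
stages, not any real hardware such as the Cavium parts mentioned above.

```python
import zlib
from enum import IntFlag


class Xform(IntFlag):
    """Hypothetical per-request control bits selecting pipeline stages."""
    NONE = 0
    ENCRYPT = 1
    CHECKSUM = 2


def toy_keystream_xor(data: bytes, key: bytes) -> bytes:
    """Placeholder for a real streaming cipher; NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def outbound(fragments, flags, key=b"\x5a"):
    """Run one outbound request through the stages selected by flags."""
    # Gather: collect the scattered memory buffers into one serial stream.
    data = b"".join(fragments)
    # Optional encryption stage, enabled by a control bit.
    if flags & Xform.ENCRYPT:
        data = toy_keystream_xor(data, key)
    # Optional integrity stage (CRC-32 as a stand-in for ECC).
    if flags & Xform.CHECKSUM:
        data += zlib.crc32(data).to_bytes(4, "big")
    return data
```

A caller would select the stages per request, for example
outbound([b"ab", b"cd"], Xform.ENCRYPT | Xform.CHECKSUM).  No separate
device, completion queue, or interrupt appears in the model; the
transformation is just a property of the transfer.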



> 
> Note that crypto is used for more than just data
> encryption/decryption; there's also digesting and digital signatures
> which rely on asymmetric algorithms such as RSA or EC and don't
> necessarily fit into the "path to the I/O device" model you've
> espoused.

Yes, of course.  But I think digital signature creation/verification
could be fit into the streaming model.  Is that wrong?  With regard to
RSA/EC, etc., I absolutely agree.
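
As a small illustration of why the digest part of a signature seems to
fit the streaming model: a hash can be updated chunk by chunk as data
passes by, and only the fixed-size result would then go to the
asymmetric (RSA/EC) step, which is not shown here.  The function name
stream_digest and the choice of SHA-256 are assumptions for the sketch;
hashlib is the Python standard library module.

```python
import hashlib


def stream_digest(chunks):
    """Hash data incrementally, one chunk at a time, as it streams past."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)  # state is carried across chunks; no buffering of the whole message
    return h.digest()    # fixed 32-byte result, independent of message length
```

Feeding the data in pieces gives the same digest as hashing it all at
once, so the stage never needs the whole message in memory.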


I do want to thank you for indulging my fantasies.  :-)



-- 
 - Stephen Fuld 
(e-mail address disguised to prevent spam)