From: "Stephen Fuld"
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Wed, 5 Jun 2024 13:34:25 -0000 (UTC)
References: <2024May29.090435@mips.complang.tuwien.ac.at> <20240603132227.00004e0f@yahoo.com>

Terje Mathisen wrote:

> Stephen Fuld wrote:
> > Terje Mathisen wrote:
> > > Stephen Fuld wrote:
> > > > Scott Lurndal wrote:
> > > > > Michael S writes:
> > > > > > On Mon, 3 Jun 2024 08:03:53 -0000 (UTC)
> > > > > > Lawrence D'Oliveiro wrote:
> > > > > > > On Thu, 30 May 2024 18:31:46 +0000, MitchAlsup1 wrote:
> > > > > > > > 30 years ago you could say the same thing about
> > > > > > > > encryption.
> > > > > > >
> > > > > > > I don’t think newer CPUs have been optimized for
> > > > > > > encryption. Instead, we see newer encryption algorithms
> > > > > > > (or ways of using them) that work better on current CPUs.
> > > > > >
> > > > > > I think moderate efficiency on CPU, not too low, but not
> > > > > > high either, is a requirement for (symmetric-key) cipher.
> > > > > > Esp. when the key is 128-bit or shorter.
> > > > >
> > > > > Most modern CPUs have instruction set support for symmetric
> > > > > ciphers such as AES, SM2/SM3 as well as message digest/hash
> > > > > (SHA1, SHA256 et al).
> > > > >
> > > > > High throughput encryption has been done by hardware
> > > > > accelerators for decades now (e.g. bbn or ncypher HSM boxes
> > > > > sitting on a SCSI bus; now such HSM are an integral part of
> > > > > many SoC).
> > > >
> > > > Question. For a modern general purpose CPU, if you are
> > > > including all the logic to implement encryption instructions,
> > > > is it much more to include the control/sequencing logic to do
> > > > it and not tie up the rest of the CPU logic to do the
> > > > encryption? Furthermore, an "inbuilt" accelerator could
> > > > interface directly with the I/O hardware of the CPU (e.g. PCI),
> > > > saving the "intermediate" step of writing the encrypted data to
> > > > memory.
> > >
> > > That logic already exists, in the form of a single thread/core
> > > dedicated to the job.
> > >
> > > With 30-100 cores on a single die, it becomes very cheap to
> > > dedicate one of them to babysit such a process, compared to the
> > > cost of making a custom chunk of VLSI to do the same. This is
> > > particularly true because the logic needed in the babysitting
> > > process is mostly straight line, with a very limited number of
> > > hard-to-predict branches.
> > >
> > > I.e. h.264 CABAC decoding has three branches per bit decoded, at
> > > least one of them impossible to predict or work around with clever
> > > coding. Here it makes perfect sense to have a chunk of hw to
> > > handle the heavy lifting. Monitoring block encryption/decryption
> > > not so much.
> >
> > I may be missing something, but while your proposal addresses the
> > first part of my proposal, I think it doesn't address the second.
> > That is, for data coming from/going to some external source, you
> > are still doing "unnecessary" memory traffic, which takes memory
> > bandwidth and increases latency.
>
> Usually, when a CPU needs to work on something, it will need to get
> the data into $L1 anyway?
> It is only when the work is simply to be a
> pipeline that having a way to bypass the CPU completely really makes
> a difference, right?

Right. But my point is that the CPU never really needs to "work" on the
encrypted data. It is frequently only sent to, or received from, the
network or a storage device, hence the pipelined approach has advantages.

--
 - Stephen Fuld
(e-mail address disguised to prevent spam)