From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Short Vectors Versus Long Vectors
Date: Tue, 23 Apr 2024 02:14:32 +0000
Organization: Rocksolid Light
Message-ID: <5451dcac941e1f569397a5cc7818f68f@www.novabbs.org>
References: <v06vdb$17r2v$1@dont-email.me>

Lawrence D'Oliveiro wrote:

> Adding the typical kind of vector-processing instructions to an
> instruction set inevitably leads to a combinatorial explosion in the
> number of opcodes. This kind of thing makes a mockery of the “R” in
> “RISC”.

It does indeed make a mockery of the R in RISC.

> Interesting to see that the RISC-V folks are staying off this path;
> instead, they are reviving an old idea from Seymour Cray’s original
> machines that bear his name: a vector pipeline. Instead of being limited
> to processing 4 or 8 operands at a time, the Cray machines could operate
> (sequentially, but rapidly) on variable-length vectors of up to 64
> elements with a single setup sequence. RISC-V seems to make the limit on
> vector length an implementation choice, with a value of 32 being mentioned
> in the spec.

CRAY machines stayed "in style" as long as memory latency remained
smaller than the length of a vector (64 cycles) and fell out of favor
when the cores got fast enough that memory could no longer keep up.

I wish them well, but I expect it will not work out as they desire...
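
For readers who have not met the variable-length vector model described in the quoted text, here is a minimal sketch in plain C of strip-mining: a long array is processed in passes of at most a fixed maximum vector length. The limit `VLMAX` (set to 64 to match the Cray figure above) and the names `set_vl` and `vec_add_stripmined` are assumptions made up for illustration; they are not Cray or RISC-V APIs, and the inner loop merely stands in for what one vector instruction would do.

```c
#include <stddef.h>

/* Assumed maximum vector length; 64 matches the Cray figure in the post. */
#define VLMAX 64

/* Hypothetical helper: how many elements this pass will handle. */
static size_t set_vl(size_t remaining)
{
    return remaining < VLMAX ? remaining : VLMAX;
}

/* Adds a[] and b[] into dst[] in strips of at most VLMAX elements.
   Returns the number of strips (passes) that were needed. */
size_t vec_add_stripmined(double *dst, const double *a, const double *b,
                          size_t n)
{
    size_t passes = 0;
    size_t i = 0;
    while (i < n) {
        size_t vl = set_vl(n - i);       /* one "setup" per strip */
        for (size_t k = 0; k < vl; k++)  /* stands in for one vector op */
            dst[i + k] = a[i + k] + b[i + k];
        i += vl;
        passes++;
    }
    return passes;
}
```

With 150 elements this takes three passes (64, 64, and 22 elements); the last, shorter strip needs no special-case code because the length is recomputed on every pass.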

> The way it avoids having separate instructions for each combination of
> operand types is to have operand-type registers as part of the vector
> unit. This way, only a small number of instructions is required to set up
> all the combinations of operand/result types. You then give it a kick in
> the guts and off it goes.
>
> Detailed spec here:
> <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc>.

On the other hand, My 66000 has support for both SIMD and CRAY-like
vectors, and the ISA contains only 6 bits of state supporting
vectorization and exactly 2 instructions--one that gives HW a register
it can use in the "loop" and the LOOP instruction that performs the
ADD-CMP-BC functionality. {{Not 2 for every kind of vectorized
instruction, 2 total instructions}}

There is no 4KB of register file (context switch overhead), there is no
need for Gather/Scatter or strided memory references, there is no
masking register, the OS can use vectorization for small fast loops
without overhead, the compiler does not have to solve memory address
aliasing, cache activities are modified to suit vector workloads, exotic
HW can execute across multiple lanes (as desired), simple HW can "do it
all" in a 1-wide pipeline, the debugger presents scalar code to the
coder, exceptions remain precise (for those that care), and the
exception handler(s) see only scalar code.
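
To make the "ADD-CMP-BC" remark concrete, here is a small sketch in ordinary C of the kind of scalar loop the post has in mind. It is not My 66000 assembly, and the mapping in the comments is an assumption: the loop body is plain scalar code, and the tail (increment, compare, conditional branch) is the part the post says a single LOOP instruction performs. The function name `daxpy_scalar` is chosen for illustration only.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i], written as a do/while so the loop tail is
   visibly the add / compare / branch-conditional sequence. */
void daxpy_scalar(size_t n, double a, const double *x, double *y)
{
    if (n == 0)
        return;                 /* nothing to do for an empty vector */

    size_t i = 0;               /* the register handed to the "loop" */
    do {
        y[i] = a * x[i] + y[i]; /* ordinary scalar loads, multiply-add, store */
        i = i + 1;              /* ADD */
    } while (i < n);            /* CMP + BC back to the top */
}
```

Nothing in the source mentions vector registers, masks, or strides; under the scheme described in the post, whether iterations run one at a time or across several lanes would be a hardware decision, and a debugger or exception handler would see exactly this scalar loop.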