Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Tonights Tradeoff
Date: Wed, 11 Sep 2024 21:27:21 +0000
Organization: Rocksolid Light
Message-ID: <f2d99c60ba76af28c8b63b9628fb56fa@www.novabbs.org>
References: <vbgdms$152jq$1@dont-email.me> <17537125c53e616e22f772e5bcd61943@www.novabbs.org> <vbj5af$1puhu$1@dont-email.me> <a37e9bd652d7674493750ccc04674759@www.novabbs.org> <vbog6d$2p2rc$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; logging-data="1708536"; mail-complaints-to="usenet@i2pn2.org"; posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Rslight-Site: $2y$10$bdHY.2.JTkQdFBl9IJQOLu/MIIpZ/oDUe8C04ej3u.TXQfCr2wafa
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 5758
Lines: 94

On Tue, 10 Sep 2024 3:59:05 +0000, Robert Finch wrote:

> On 2024-09-08 2:06 p.m., MitchAlsup1 wrote:
>> On Sun, 8 Sep 2024 3:22:55 +0000, Robert Finch wrote:
>>
>>> On 2024-09-07 10:41 a.m., MitchAlsup1 wrote:
>>>> On Sat, 7 Sep 2024 2:27:40 +0000, Robert Finch wrote:
>>>>
>>>>> Making the scalar register file a subset of the vector register file.
>>>>> And renaming only vector elements.
>>>>>
>>>>> There are eight elements in a vector register and each element is
>>>>> 128-bits wide. (Corresponding to the size of a GPR). Vector register
>>>>> file elements are subject to register renaming to allow the full power
>>>>> of the OoO machine to be used to process vectors. The issue is that
>>>>> with both the vector and scalar registers present for renaming there
>>>>> are a lot of registers to rename. It is desirable to keep the number of
>>>>> renamed registers (including vector elements) <= 256 total. So, the 64
>>>>> scalar registers are aliased with the first eight vector registers.
>>>>> Leaving only 24 truly available vector registers. Hm. There are 1024
>>>>> physical registers, so maybe going up to about 300 renamable register
>>>>> would not hurt.
>>>>
>>>> Why do you think a vector register file is the way to go ??
>>>
>>> I think vector use is somewhat dubious, but they have some uses. In many
>>> cases data can be processed just fine without vector registers. In the
>>> current project vector instructions use the scalar functional units to
>>> compute, making them no faster than scalar calcs. But vectors have a lot
>>> of code density where parallel computation on multiple data items using
>>> a single instruction is desirable. I do not know why people use vector
>>> registers in general, but they are present in some modern architectures.
>>
>> There is no doubt that much code can utilize vector arrangements, and
>> that a processor should be very efficient in performing these work
>> loads.
>>
>> The problem I see is that CRAY-like vectors vectorize instructions
>> instead of vectorizing loops. Any kind of flow control within the
>> loop becomes tedious at best.
>>
>> On the other hand, the Virtual Vector Method vectorizes loops and
>> can be implemented such that it performs as well as CRAY-like
>> vector machines without the overhead of a vector register file.
>> In actuality there are only 6-bits of HW flip-flops governing
>> VVM--compared to 4 KBytes for CRAY-1.
>>
>>> Qupls vector registers are 512 bits wide (8 64-bit elements). Bigfoot’s
>>> vector registers are 1024 bits wide (8 128-bit elements).
>>
>> When properly abstracted, one can dedicate as many or few HW
>> flip-flops as staging buffers for vector work loads to suit
>> the implementation at hand. A GBOoO may utilize that 4KB
>> file of CRAY-1 while the little low power core 3-cache lines.
>> Both run the same ASM code and both are efficient in their own
>> sense of "efficient".
>>
>> So, instead of having ~500 vector instructions and ~1000 SIMD
>> instructions one has 2 instructions and a medium scale state
>> machine.
>>
>
>
> Still trying to grasp the virtual vector method. Been wondering if it
> can be implemented using renamed registers.

Think of VVM as a set (8) of staging flip-flops taking data (line)
from L1 and feeding it into 4-wide ALUs then back into another set (4)
flip-flops which deliver data to L1; with wide muxes to get the LD
data aligned with the SLU and the ALU result aligned back to L1. Then
support this infrastructure with a reservation station-like queue
which can advance (1,2,4) iterations per clock. The registers named in
the asm are named into the staging flip-flops {like renaming} and the
whole thing optimized for multi-lane execution with 6-bits of total
overhead.

> Qupls has RISC-V style vector / SIMD registers. For Q+ every instruction
> can be a vector instruction, as there are bits indicating which
> registers are vector registers in the instruction. All the scalar
> instructions become vector. This cuts down on some of the bloat in the
> ISA. There is only a handful of vector specific instructions (about
> eight I think). The drawback is that the ISA is 48-bits wide. However,
> the code bloat is less than 50% as some instructions have
> dual-operations. Branches can increment or decrement and loop. Bigfoot
> uses a postfix word to indicate to use the vector form of the
> instruction. Bigfoot’s code density is a lot better being variable
> length, but I suspect it will not run as fast. Bigfoot and Q+ share a
> lot of the same code. Trying to make the guts of the cores generic.

Too bad...
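
To make the "advance (1,2,4) iterations per clock" point concrete, here is a toy software model, not a description of any real VVM hardware. It assumes a trivial loop body (c[i] = a[i] + b[i]) and a hypothetical function name, run_loop, invented for illustration. The only thing it is meant to show is that the same loop gives the same result at every width, while the number of clocks shrinks as more iterations advance per clock.

```python
def run_loop(a, b, lanes):
    """Toy model only: compute c[i] = a[i] + b[i], advancing up to
    `lanes` loop iterations per simulated clock.

    `lanes` stands in for how many iterations an implementation chooses
    to advance per clock (1, 2, or 4); the loop itself never changes.
    """
    if lanes not in (1, 2, 4):
        raise ValueError("this sketch only models 1, 2, or 4 iterations per clock")
    n = len(a)
    c = [0] * n
    clocks = 0
    i = 0
    while i < n:
        # The last group may be partial when n is not a multiple of lanes.
        step = min(lanes, n - i)
        for k in range(i, i + step):
            c[k] = a[k] + b[k]
        i += step
        clocks += 1
    return c, clocks


if __name__ == "__main__":
    a = list(range(10))
    b = [10] * 10
    for lanes in (1, 2, 4):
        result, clocks = run_loop(a, b, lanes)
        print(lanes, clocks, result)
```

With 10 iterations the model takes 10, 5, and 3 clocks at widths 1, 2, and 4 respectively, and the result list is identical in all three cases. Memory latency, alignment muxing, and the staging buffers are deliberately left out.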