<vedg1s$43mp$1@dont-email.me>
Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Tonights Tradeoff - Carry and Overflow
Date: Sat, 12 Oct 2024 05:38:01 -0400
Organization: A noiseless patient Spider
Lines: 107
Message-ID: <vedg1s$43mp$1@dont-email.me>
References: <vbgdms$152jq$1@dont-email.me> <vbog6d$2p2rc$1@dont-email.me>
 <f2d99c60ba76af28c8b63b9628fb56fa@www.novabbs.org>
 <vc61e6$21skv$1@dont-email.me> <vc8gl4$2m5tp$1@dont-email.me>
 <vcv5uj$3arh6$1@dont-email.me>
 <37067f65c5982e4d03825b997b23c128@www.novabbs.org>
 <vd352q$3s1e$1@dont-email.me>
 <5f8ee3d3b2321ffa7e6c570882686b57@www.novabbs.org>
 <vd6a5e$o0aj$2@dont-email.me> <vdnpg4$3c9e$2@dont-email.me>
 <2024Oct4.081931@mips.complang.tuwien.ac.at> <vdp343$9d38$1@dont-email.me>
 <2024Oct5.114309@mips.complang.tuwien.ac.at> <ve5mpq$2jt5k$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 12 Oct 2024 11:38:04 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="08e3a93de117abaa6a6a445b83662458"; logging-data="134873"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Aic68P6B8Bk9DJgGgLPRXhPm+EYVf6sk="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:eDCOuRJ/Z+wzQiWMR0vZ+WxHL3M=
In-Reply-To: <ve5mpq$2jt5k$1@dont-email.me>
Content-Language: en-US
Bytes: 6856

On 2024-10-09 6:44 a.m., Robert Finch wrote:
> On 2024-10-05 5:43 a.m., Anton Ertl wrote:
>> Robert Finch <robfi680@gmail.com> writes:
>>> On 2024-10-04 2:19 a.m., Anton Ertl wrote:
>>>> 4) Keep the flags results along with GPRs: have carry and overflow as
>>>> bit 64 and 65, N is bit 63, and Z tells something about bits 0-63.
>>>> The advantage is that you do not have to track the flags separately
>>>> (and, in case of AMD64, track each of C, O, and NZP separately), but
>>>> instead can use the RAT that is already there for the GPRs. You can
>>>> find a preliminary paper on that on
>>>> <https://www.complang.tuwien.ac.at/anton/tmp/carry.pdf>.
>> ...
>>> One solution, not mentioned in your article, is to support arithmetic
>>> with two bits less than the number of bits a register can support, so
>>> that the carry and overflow can be stored. On a 64-bit machine have all
>>> operations use only 62 bits. It would solve the issue of how to load or
>>> store the carry and overflow bits associated with a register.
>>
>> Yes, that's a solution, but the question is how well existing software
>> would react to having no int64_t (and equivalent types, such as long
>> long), but instead an int62_t (or maybe int63_t, if the 64th bit is
>> used for both signed and unsigned overflow, by having separate signed
>> and unsigned addition etc.). I expect that such an architecture would
>> have low acceptance. By contrast, in my paper I suggest an addition
>> to existing 64-bit architectures that has fewer of the same
>> disadvantages as the widely-used condition-code-register approach has,
>> but still has a few of them.
>>
>>> Sometimes
>>> arithmetic is performed with fewer bits, as for pointer representation.
>>> I wonder if pointer masking could somehow be involved. It may be useful
>>> to have a bit indicating the presence of a pointer. Also thinking of how
>>> to track a binary point position for fixed point arithmetic. Perhaps
>>> using the whole upper byte of a register for status/control bits
>>> would work.
>>
>> There are some extensions for AMD64 in that direction.
>>
>>> It may be possible with Q+ to support a second destination register
>>> which is in a subset of the GPRs. For example, one of eight registers
>>> could be specified to hold the carry/overflow status. That effectively
>>> ties up a second ALU though as an extra write port is needed for the
>>> instruction.
>>
>> Needing only one write port is an advantage of my approach.
>>
>> - anton
>
> Been thinking some about the carry and overflow and what to do about
> register spills and reloads during expression processing. My thought was
> that on the machine with 256 registers, simply allocate a ridiculous
> number of registers for expression processing, for example 25 or even
> 50. Then if the expression is too complex, have the compiler spit out an
> error message to the programmer to simplify the expression. Remnants of
> the ‘expression too complex’ error in BASIC. So, there are no spills or
> reloads during expression processing. I think the storextra / loadextra
> registers used during context switching would work okay. But in Q+ there
> are 256 regs which require eight storextra / loadextra registers. I
> think the store extra / load extra registers could be hidden in the
> context save and restore hardware. Not even requiring access via CSRs or
> whatever. I suppose context loads and stores could be done in blocks of
> 32 registers. An issue is that the load extra needs to be done before
> registers are loaded. So, the extra word full of carry/overflow bits
> would need to be fetched in a non-sequential fashion. Assuming for
> instance, that saving register values is followed by a save of the CO
> word. Then it is positioned wrong for a sequential load. It may be
> better to have the wrong position for a store, so loads can proceed
> sequentially.
> It strikes me that there is no real good solution, only perhaps an
> engineered one. Toyed with the idea of having 16 separate flags
> registers, but not liking that as a solution as much as the store/load
> extra.
>
> Another thought is to store additional info such as a CRC check of the
> register file on context save and restore.
>
> *****
>
> Finally wrote the SM to walk the ROB backwards and restore register
> mappings for a checkpoint restore. Cannot get Q+ to do more than light
> up one LED in SIM. Register values are not propagating properly.
>

Mulled over carry and overflow in arithmetic operations. Looked at widening the datapath to 66 bits to hold carry and overflow bits. Thinking it may increase the size of the design by over 3% just to support carry and overflow. For now, an instruction, ADDGC, was added to generate the carry bit as a result. A 256-bit add looks like:

; 256 bit add
; A = r1,r2,r3,r4
; B = r5,r6,r7,r8
; S = r9,r10,r11,r12

  add r9,r1,r5,r0
  addgc r13,r1,r5,r0
  add r10,r2,r6,r13
  addgc r13,r2,r6,r13
  add r11,r7,r3,r13
  addgc r13,r7,r3,r13
  add r12,r8,r4,r13

Not a very elegant solution, but it is simple. I think it requires minimal hardware. Three input ADD is already present and ADDGC just routes the carry bit to the output.
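
For readers who want to check the sequence, here is a minimal Python sketch of the ADD / ADDGC pairing. It rests on assumptions the post does not spell out: that r0 reads as zero, that the fourth operand of ADD is a third addend, and that ADDGC returns the carry out of bit 63 of the same three-input sum. The names `add3`, `addgc`, `add256`, `to_words` and `from_words` are hypothetical helpers for illustration, not part of Q+.

```python
MASK64 = (1 << 64) - 1  # models a 64-bit register


def add3(a, b, c):
    """Model of the three-input ADD: low 64 bits of a + b + c."""
    return (a + b + c) & MASK64


def addgc(a, b, c):
    """Assumed model of ADDGC: carry out of bit 63 of a + b + c.

    With c restricted to 0 or 1 (a carry-in), the result is 0 or 1.
    """
    return (a + b + c) >> 64


def to_words(x, n=4):
    """Split an integer into n 64-bit words, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]


def from_words(words):
    """Reassemble an integer from 64-bit words, least significant first."""
    return sum(w << (64 * i) for i, w in enumerate(words))


def add256(a_words, b_words):
    """Multi-word add following the add/addgc pairing in the post.

    Returns (sum_words, carry_out). The final carry_out is an extra:
    the sequence in the post stops after the last ADD.
    """
    carry = 0  # plays the role of r0 on the first step, r13 afterwards
    result = []
    for a, b in zip(a_words, b_words):
        result.append(add3(a, b, carry))  # add   rS, rA, rB, r13
        carry = addgc(a, b, carry)        # addgc r13, rA, rB, r13
    return result, carry
```

Under these assumptions, `from_words(add256(to_words(a), to_words(b))[0])` agrees with `(a + b) % 2**256` for any 256-bit `a` and `b`. The sketch also shows the cost of the approach: each word except the last needs two instructions that compute the same sum.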