Path: ...!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Wed, 5 Jun 2024 16:53:53 +0000
Organization: Rocksolid Light
Message-ID: <24c5f18351cc1ba815fdbb5740a6c1d0@www.novabbs.org>
References: <v0s17o$2okf4$2@dont-email.me> <v327n3$1use$1@gal.iecc.com>
 <BM25O.40665$HBac.4762@fx15.iad> <v32lpv$1u25$1@gal.iecc.com>
 <v33bqg$9cst$11@dont-email.me> <v34v62$ln01$1@dont-email.me>
 <v36bva$10k3v$2@dont-email.me> <2024May29.090435@mips.complang.tuwien.ac.at>
 <v38opv$1gsj2$3@dont-email.me> <v38rkd$1ha8a$1@dont-email.me>
 <jwvttifrysb.fsf-monnier+comp.arch@gnu.org>
 <f90b6e03c727b0f209d64484ec097298@www.novabbs.org>
 <v3jtd8$3qduu$2@dont-email.me> <20240603132227.00004e0f@yahoo.com>
 <k6k7O.8602$7jpd.5620@fx47.iad> <v3klhp$3ugeh$1@dont-email.me>
 <v3mljt$c63k$1@dont-email.me>
 <cf98e7ba84809010306e3fdea7aa103c@www.novabbs.org>
 <v3pe6e$u5kd$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; logging-data="3286118"; mail-complaints-to="usenet@i2pn2.org"; posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$5hvGrjQZbqRu2X2TRehT8.YKftnHhNg.QbaQy54ijF.9JgohDB/ja
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Bytes: 3934
Lines: 51

Terje Mathisen wrote:

> MitchAlsup1 wrote:
>>
>>> I.e. h.264 CABAC decoding has three branches per bit decoded, at least
>>> one of them impossible to predict or work around with clever coding.
>>
>> How many instructions in the then-clause and in the else-clause ??
>> If these are smaller than 8, My 66000 can process them without
>> "branching" using predication.
> No, the real problem is the context branching: After doing the 50%
> branch you pick up one of two alternative contexts and follow totally
> different paths, i.e. you cannot simply use the branch bit as an index.

If the combined then and else clauses contain fewer than a certain
number of instructions, it is just as efficient to handle the branch
as a later nullification rather than as a redirection of the fetch end
of the pipeline. Here, NO prediction is required and there is no chance
of misprediction, regardless of how predictable the control flow point
is.

The whole point is that if the fetch end of the pipeline will reach the
convergence point before the branch is fully resolved, then don't
branch: nullify. It saves cycles and keeps unpredictable branches out of
the branch predictor (even if the apparent takenness of the branch is
completely random), improving the prediction accuracy of "real
branches".

So, for example, let us postulate a 1-wide machine fetching 4 words per
clock, a then clause of 3 instructions, and an else clause of 4
instructions. By the time the pseudo branch instruction enters
execution, both the then and the else clauses have already been fetched
and parsed, and are flowing through decode. The execution of the branch
merely decides which instructions survive the pipeline, and there are no
misprediction stalls. {{On a wider machine, the fetch is even wider and
the parse/decode bandwidth is still higher, so the otherwise
mispredicted control flow point does not suffer misprediction repair
costs.}}

Oddly enough, this is how predication works on My 66000.

> I found ways to bypass the issues with the other two branches but this
> one is fundamental.

It is fundamental only on ISAs that perform predication improperly, do
not have predication, or use the predictor when predicating. My 66000 is
not one of them.

I return to the question posed earlier: How many instructions in the
then-clause and in the else-clause?

> Terje
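
The 3-instruction then clause and 4-instruction else clause example above can be sketched as a toy model. This is only an illustration under stated assumptions, not My 66000's actual encoding or pipeline: the names `fetch_clocks`, `run_predicated`, and `bump` are hypothetical, instructions are modeled as plain Python callables, and the only number taken from the post is the fetch width of 4 words per clock. The sketch shows that both clauses are fetched and issued regardless of the condition, and that resolving the condition only selects which instructions survive.

```python
from math import ceil


def fetch_clocks(then_len, else_len, fetch_width=4):
    """Clocks needed to fetch both clauses back to back (toy model)."""
    return ceil((then_len + else_len) / fetch_width)


def bump(key):
    """Build a stand-in 'instruction' that increments one counter in state."""
    def op(state):
        state[key] += 1
    return op


def run_predicated(condition, then_clause, else_clause, state):
    """Issue every instruction from both clauses; nullify the losing side.

    Returns (slots_issued, instructions_that_survived). No prediction is
    made, so there is nothing to mispredict in this model.
    """
    issued = [(True, op) for op in then_clause]
    issued += [(False, op) for op in else_clause]
    survivors = 0
    for wanted, op in issued:
        if wanted == condition:
            op(state)          # predicate matches: the instruction takes effect
            survivors += 1
        # otherwise the instruction used a slot but is nullified
    return len(issued), survivors


# Assumed example sizes from the post: then = 3 instructions, else = 4.
then_clause = [bump("then_hits")] * 3
else_clause = [bump("else_hits")] * 4

state = {"then_hits": 0, "else_hits": 0}
print(fetch_clocks(3, 4))                                   # clocks to fetch both clauses
print(run_predicated(True, then_clause, else_clause, state))
print(state)
```

In this model the cost is always the 7 issue slots, whichever way the condition goes, which is the trade being described: a fixed, small amount of nullified work in place of an occasional misprediction repair.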