From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Is Intel exceptionally unsuccessful as an architecture designer?
Date: Fri, 20 Sep 2024 00:58:44 +0000
Organization: Rocksolid Light
Message-ID: <ad85483a2a41b704c1cbb6e796aaf9f4@www.novabbs.org>
References: <memo.20240913205156.19028s@jgd.cix.co.uk>
 <vcd3ds$3o6ae$2@dont-email.me>
 <2935676af968e40e7cad204d40cafdcf@www.novabbs.org>
 <vcd7pr$3op6a$3@dont-email.me> <7wCGO.45461$xO0f.1783@fx48.iad>
 <20240918190414.00005806@yahoo.com>
 <8e1aed9ce25c70cc555731140ae14eb1@www.novabbs.org>
 <vcfln9$836k$1@dont-email.me> <vcgi7p$fmaa$2@dont-email.me>
 <f6093802cde5821a88ff715b8139fc04@www.novabbs.org>
 <vcicir$ov66$1@dont-email.me>

On Thu, 19 Sep 2024 23:37:00 +0000, Lawrence D'Oliveiro wrote:

> On Thu, 19 Sep 2024 16:09:15 +0000, MitchAlsup1 wrote:
>
>> 400 cycles IS negligible.
>> 400 cycles for each LD is non-negligible.
>>
>> Remember LDs are 20%-22% of the instruction stream and with 400 cycles
>> per LD you see an average of 80-cycles per instruction even if all other
>> instructions take 1 cycle. This is 160× SLOWER than current CPUs. But
>> GPUs with thousands of cores can use memory that slow and still deliver
>> big gains in performance (6×-50×).
>
> How can they do that? What proportion of their instruction stream is
> LDs?

20%-22% (as stated above), plus another 10% STs.

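The quoted arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `average_cpi` is made up for illustration, and the 0.5 cycles-per-instruction baseline (about 2 instructions per cycle) is an inference from the "160×" figure, not a number stated in the thread.

```python
def average_cpi(load_fraction, load_latency, other_latency=1.0):
    """Average cycles per instruction when every load pays load_latency
    and every other instruction pays other_latency."""
    return load_fraction * load_latency + (1.0 - load_fraction) * other_latency


# Loads at 20% and 22% of the instruction stream, 400 cycles per load.
cpi_low = average_cpi(0.20, 400)    # 0.20*400 + 0.80*1
cpi_high = average_cpi(0.22, 400)   # 0.22*400 + 0.78*1

# ASSUMPTION: a current CPU sustaining roughly 2 instructions per cycle,
# i.e. 0.5 cycles per instruction. This is what makes ~80 CPI "160x slower".
baseline_cpi = 0.5
slowdown = 80 / baseline_cpi

print(f"CPI at 20% loads: {cpi_low:.2f}")
print(f"CPI at 22% loads: {cpi_high:.2f}")
print(f"Slowdown vs {baseline_cpi} CPI baseline: {slowdown:.0f}x")
```

The load term dominates completely: the other 78%-80% of instructions contribute less than one cycle to an average of 80 or more.
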
> It seems to me they are accessing memory in 100% of their instructions,
> since they would have less sophisticated memory controllers than CPUs
> commonly have.

Maybe less sophisticated, but with 20×-40× as many 'miss buffers' as
conventional CPUs.

Hint: They can context switch every instruction. So if an instruction
does not complete in its cycle, they switch to a different set of
threads, and they have lots of threads per core to work with.

Also note: a single instruction causes 32-128 threads to make 1 step
of forward progress. It is called SIMT for a reason.
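
The latency-hiding argument can be illustrated with a toy simulation. This is a rough sketch under loudly stated assumptions, not a model of any real GPU: the function `issue_utilization` and its parameters are hypothetical, every 5th instruction of a thread group is a load (20%), every load takes 400 cycles, a group blocks until its load returns, there are no caches and no limit on outstanding misses, and the scheduler simply issues from the lowest-numbered ready group each cycle.

```python
def issue_utilization(n_groups, cycles=20_000, load_every=5, load_latency=400):
    """Fraction of cycles in which one core issues an instruction.

    Each group (a set of threads stepping together, SIMT style) issues at
    most one instruction per cycle. A load stalls only the group that
    issued it; the core switches to another ready group at no cost.
    """
    ready_at = [0] * n_groups       # cycle at which each group may issue again
    issued_count = [0] * n_groups   # instructions issued so far, per group
    issued = 0
    for cycle in range(cycles):
        for g in range(n_groups):   # greedy: lowest-numbered ready group wins
            if ready_at[g] <= cycle:
                issued_count[g] += 1
                issued += 1
                if issued_count[g] % load_every == 0:
                    ready_at[g] = cycle + load_latency   # load: group stalls
                else:
                    ready_at[g] = cycle + 1              # ordinary instruction
                break
    return issued / cycles


for n in (1, 16, 64, 128):
    print(f"{n:4d} groups per core: issue slot busy "
          f"{issue_utilization(n):.1%} of cycles")
```

With one group the core is idle almost all the time, which is the 80-cycles-per-instruction case. With enough groups there is nearly always one ready to issue, so the 400-cycle latency is hidden even though no individual load got faster. The simulation counts group instructions only; per the post, each one advances 32-128 threads by one step.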