From: "Chris M. Thomasson"
Newsgroups: comp.arch
Subject: Re: Is Intel exceptionally unsuccessful as an architecture designer?
Date: Sat, 21 Sep 2024 21:12:16 -0700

On 9/21/2024 7:48 PM, Brett wrote:
> Chris M. Thomasson wrote:
>> On 9/21/2024 9:39 AM, Brett wrote:
>>> Chris M. Thomasson wrote:
>>>> On 9/20/2024 6:48 PM, MitchAlsup1 wrote:
>>>>> On Sat, 21 Sep 2024 1:12:38 +0000, Brett wrote:
>>>>>
>>>>>> MitchAlsup1 wrote:
>>>>>>> On Fri, 20 Sep 2024 21:54:36 +0000, Chris M. Thomasson wrote:
>>>>>>>
>>>>>>>> On 9/20/2024 2:32 PM, Lawrence D'Oliveiro wrote:
>>>>>>>>> On Fri, 20 Sep 2024 11:21:52 -0400, Stefan Monnier wrote:
>>>>>>>>>
>>>>>>>>>>> The basic issue is:
>>>>>>>>>>> * CPU+motherboard RAM -- usually upgradeable
>>>>>>>>>>> * Addon coprocessor RAM -- usually not upgradeable
>>>>>>>>>>
>>>>>>>>>> Maybe the RAM of the "addon coprocessor" is not upgradeable, but the
>>>>>>>>>> addon board itself can be replaced with another one (one with more
>>>>>>>>>> RAM).
>>>>>>>>>
>>>>>>>>> Yes, but that’s a lot more expensive.
>>>>>>>>
>>>>>>>> I had this crazy idea of putting cpus right on the ram. So, if you add
>>>>>>>> more memory to your system you automatically get more cpu's... Think
>>>>>>>> NUMA for a moment... ;^)
>>>>>>>
>>>>>>> Can software use the extra CPUs ?
>>>>>>>
>>>>>>> Also note: DRAMs are made on P-Channel process (leakage) with only a few
>>>>>>> layer of metal while CPUs are based on a N-Channel process (speed) with
>>>>>>> many layers of metal.
>>>>>>
>>>>>> Didn’t you work on the MC68000 which had one layer of metal?
>>>>>
>>>>> Yes, but it was the 68020 and had polysilicide which we used as
>>>>> a second layer of metal.
>>>>>
>>>>> Mc88100 had 2 layers of metal and silicide.
>>>>>
>>>>> The number of metal layers went about::
>>>>> 1978: 1
>>>>> 1980: 1+silicide
>>>>> 1982: 2+silicide
>>>>> 1988: 3+silicide
>>>>> 1990: 4+silicide
>>>>> 1995: 6
>>>>> ..
>>>>>
>>>>>> This could be fine if you are going for the AI market of slow AI cpu
>>>>>> with huge memory and bandwidth.
>>>>>>
>>>>>> The AI market is bigger than the general server market as seen in
>>>>>> NVidea’s sales.
>>>>>>
>>>>>>> Bus interconnects are not setup to take a CPU cache miss from one
>>>>>>> DRAM to a different DRAM on behalf of its contained CPU(s).
>>>>>>> {Chicken and egg problem}
>>>>>
>>>>> Thus a problem with the CPU on DRAM approach.
>>>>
>>>> It would be HIGHLY local wrt its processing units and its memory for
>>>> they would all be one.
>>>>
>>>> The programming for it would not be all that easy... It would be like a
>>>> NUMA where a program can divide itself up and run parts of itself on
>>>> each slot (aka memory-cpu hybrid unit card if you will). If a program
>>>> can be embarrassingly parallel, well that would be great! The Cell
>>>> processors comes to mind. But it failed. Shit.
>>>
>>> Cell was in the PlayStation which Sony sold a huge number of and made
>>> billions of dollars, so successful, not failed.
>>
>> Touche! :^)
>>
>> However, iirc, not all the games for it even used the SPE's. Instead
>> they used the PPC. I guess that might have been due to the "complexity"
>> of the programming? Not sure.
>
> ALL games used the SPE’s, the PPC was not fast enough for a AAA game.
> SPE is more powerful and flexible than a vertex shader on the graphics
> chip.

Still not sure 100% of the games used the SPE's, AAA games aside for a
moment...

>>> I programmed for Cell, it was actually a nice architecture for what it did.
>>
>> Iirc, you had to use DMA to communicate with the SPE's?
>
> You have to built DMA lists for the graphics chip anyway, the SPE’s are
> just more of the same. Today the vertex shaders are on the graphics chip,
> instead of SPE, same difference.

True. I got to play around with a Cell a long time ago. I wrote about it
way back on this group a little bit.

>>> If you think programming for AI is easy, I have news for you…
>>>
>>> Those NVidia AI chips are at the brain damaged level for programming.
>>
>> No shit? I was thinking along the lines of compute shaders in the GPU?
>>
>>
>>> 10’s of billions of dollars are invested in this market.
>>>
>>>> A system with a mother board that has slots for several GPUS (think
>>>> crossfire) and slots for memory+CPU units. The kicker is that adding
>>>> more memory gives you more cpus...
>>>>
>>>> How crazy is this? Well, on a scale from:
>>>>
>>>> Retarded to Moronic?
>>>>
>>>> Pretty bad? Shit...
>>>>
>>>> Shit man, remember all of the slots in the old Apple IIgs's?
>>>>
>>>> ;^o
>>>>
>>>>
>>>>>
>>>>>> Such a dram would be on the PCIE busses, and the main CPU’s would barely
>>>>>> touch that ram, and the AI only searches locally.
>>>>>
>>>>> Better make it PCIe+CXL so the downstream CPU is cache coherent.
>
>
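
Fwiw, here is a rough sketch of the "program divides itself up and runs parts of itself on each slot" idea quoted above. It is only a software analogy, not a model of any real memory+CPU hardware: each "unit" is a worker thread that touches only its own shard of the data, and only the small partial results travel back to the host. The names split_into_shards, local_work and run_on_units are made up for this illustration, and the sum of squares is just a stand-in for an embarrassingly parallel job.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(data, unit_count):
    """Split data into unit_count contiguous shards.

    Each shard stands in for the memory that lives on one hypothetical
    memory+CPU unit. Sizes differ by at most one element.
    """
    base, extra = divmod(len(data), unit_count)
    shards = []
    start = 0
    for unit in range(unit_count):
        size = base + (1 if unit < extra else 0)
        shards.append(data[start:start + size])
        start += size
    return shards


def local_work(shard):
    # Runs "on the unit": reads only its own shard, never another unit's.
    return sum(x * x for x in shard)


def run_on_units(data, unit_count):
    """Run local_work on every shard in parallel, then combine on the host."""
    shards = split_into_shards(data, unit_count)
    with ThreadPoolExecutor(max_workers=unit_count) as pool:
        partials = list(pool.map(local_work, shards))
    # Only these small partial results cross between units and the host.
    return sum(partials)


if __name__ == "__main__":
    # Four hypothetical units sharing a ten-element workload.
    print(run_on_units(list(range(10)), 4))
```

The point of the shape is that adding a unit adds both a shard and a worker, which is the "more memory gives you more cpus" kicker. A job that needs one unit to read another unit's shard is exactly where the cache miss and interconnect problem raised in the thread would show up.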