From: Brett
Newsgroups: comp.arch
Subject: Re: Is Intel exceptionally unsuccessful as an architecture designer?
Date: Sat, 21 Sep 2024 16:39:39 -0000 (UTC)
Organization: A noiseless patient Spider

Chris M. Thomasson wrote:
> On 9/20/2024 6:48 PM, MitchAlsup1 wrote:
>> On Sat, 21 Sep 2024 1:12:38 +0000, Brett wrote:
>>
>>> MitchAlsup1 wrote:
>>>> On Fri, 20 Sep 2024 21:54:36 +0000, Chris M. Thomasson wrote:
>>>>
>>>>> On 9/20/2024 2:32 PM, Lawrence D'Oliveiro wrote:
>>>>>> On Fri, 20 Sep 2024 11:21:52 -0400, Stefan Monnier wrote:
>>>>>>
>>>>>>>> The basic issue is:
>>>>>>>> * CPU+motherboard RAM -- usually upgradeable
>>>>>>>> * Addon coprocessor RAM -- usually not upgradeable
>>>>>>>
>>>>>>> Maybe the RAM of the "addon coprocessor" is not upgradeable, but the
>>>>>>> addon board itself can be replaced with another one (one with more
>>>>>>> RAM).
>>>>>>
>>>>>> Yes, but that’s a lot more expensive.
>>>>>
>>>>> I had this crazy idea of putting CPUs right on the RAM. So, if you add
>>>>> more memory to your system you automatically get more CPUs... Think
>>>>> NUMA for a moment... ;^)
>>>>
>>>> Can software use the extra CPUs?
>>>>
>>>> Also note: DRAMs are made on a P-Channel process (leakage) with only a few
>>>> layers of metal, while CPUs are based on an N-Channel process (speed) with
>>>> many layers of metal.
>>>
>>> Didn’t you work on the MC68000, which had one layer of metal?
>>
>> Yes, but it was the 68020 and had polysilicide, which we used as
>> a second layer of metal.
>>
>> MC88100 had 2 layers of metal and silicide.
>>
>> The number of metal layers went about:
>> 1978: 1
>> 1980: 1+silicide
>> 1982: 2+silicide
>> 1988: 3+silicide
>> 1990: 4+silicide
>> 1995: 6
>> ..
>>
>>> This could be fine if you are going for the AI market of slow AI CPUs
>>> with huge memory and bandwidth.
>>>
>>> The AI market is bigger than the general server market, as seen in
>>> NVIDIA’s sales.
>>>
>>>> Bus interconnects are not set up to take a CPU cache miss from one
>>>> DRAM to a different DRAM on behalf of its contained CPU(s).
>>>> {Chicken and egg problem}
>>
>> Thus a problem with the CPU on DRAM approach.
>
> It would be HIGHLY local wrt its processing units and its memory, for
> they would all be one.
>
> The programming for it would not be all that easy... It would be like a
> NUMA where a program can divide itself up and run parts of itself on
> each slot (aka memory-CPU hybrid unit card, if you will). If a program
> can be embarrassingly parallel, well, that would be great! The Cell
> processor comes to mind. But it failed. Shit.

Cell was in the PlayStation 3, which Sony sold in huge numbers and made
billions of dollars on, so it was a success, not a failure.

I programmed for Cell; it was actually a nice architecture for what it did.

If you think programming for AI is easy, I have news for you…

Those NVIDIA AI chips are at the brain-damaged level for programming.
Tens of billions of dollars are invested in this market.

> A system with a motherboard that has slots for several GPUs (think
> CrossFire) and slots for memory+CPU units. The kicker is that adding
> more memory gives you more CPUs...
>
> How crazy is this? Well, on a scale from:
>
> Retarded to Moronic?
>
> Pretty bad? Shit...
>
> Shit man, remember all of the slots in the old Apple IIgs's?
>
> ;^o
>
>>
>>> Such a DRAM would be on the PCIe buses, and the main CPUs would barely
>>> touch that RAM, and the AI only searches locally.
>>
>> Better make it PCIe+CXL so the downstream CPU is cache coherent.