Article <vco0ik$20e64$1@dont-email.me>

Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Brett <ggtgp@yahoo.com>
Newsgroups: comp.arch
Subject: Re: Is Intel exceptionally unsuccessful as an architecture designer?
Date: Sun, 22 Sep 2024 02:48:52 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 121
Message-ID: <vco0ik$20e64$1@dont-email.me>
References: <memo.20240913205156.19028s@jgd.cix.co.uk>
 <vcd3ds$3o6ae$2@dont-email.me>
 <2935676af968e40e7cad204d40cafdcf@www.novabbs.org>
 <vcd7pr$3op6a$3@dont-email.me>
 <a20365f1bdcad769edd9e1f840edb2fe@www.novabbs.org>
 <vcda96$3p3a7$2@dont-email.me>
 <21028ed32d20f0eea9a754fafdb64e45@www.novabbs.org>
 <RECGO.45463$xO0f.22925@fx48.iad>
 <20240918190027.00003e4e@yahoo.com>
 <vcfp2q$8glq$5@dont-email.me>
 <jwv34lumjz7.fsf-monnier+comp.arch@gnu.org>
 <vckpkg$18k7r$2@dont-email.me>
 <vckqus$18j12$2@dont-email.me>
 <920c561c4e39e91d3730b6aab103459b@www.novabbs.org>
 <vcl6i6$1ad9e$1@dont-email.me>
 <d3b9fc944f708546e4fbe5909c748ba3@www.novabbs.org>
 <vclb16$1etc7$1@dont-email.me>
 <vcmssa$1lpa4$1@dont-email.me>
 <vcna2k$1nlod$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 22 Sep 2024 04:48:52 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="ae1c10eca820d9baf7d2f47664a3d416";
	logging-data="2111684"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18poiQffzUvb0K8OhWeECZD"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:cHt4SxzN/aLWrfaruxg0zNsO4zs=
	sha1:vfE+KFYn2zfNGsh8/65+REEfxA4=

Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
> On 9/21/2024 9:39 AM, Brett wrote:
>> Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
>>> On 9/20/2024 6:48 PM, MitchAlsup1 wrote:
>>>> On Sat, 21 Sep 2024 1:12:38 +0000, Brett wrote:
>>>> 
>>>>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>>>>> On Fri, 20 Sep 2024 21:54:36 +0000, Chris M. Thomasson wrote:
>>>>>> 
>>>>>>> On 9/20/2024 2:32 PM, Lawrence D'Oliveiro wrote:
>>>>>>>> On Fri, 20 Sep 2024 11:21:52 -0400, Stefan Monnier wrote:
>>>>>>>> 
>>>>>>>>>> The basic issue is:
>>>>>>>>>> * CPU+motherboard RAM -- usually upgradeable
>>>>>>>>>> * Addon coprocessor RAM -- usually not upgradeable
>>>>>>>>> 
>>>>>>>>> Maybe the RAM of the "addon coprocessor" is not upgradeable, but the
>>>>>>>>> addon board itself can be replaced with another one (one with more
>>>>>>>>> RAM).
>>>>>>>> 
>>>>>>>> Yes, but that’s a lot more expensive.
>>>>>>> 
>>>>>>> I had this crazy idea of putting cpus right on the ram. So, if you add
>>>>>>> more memory to your system you automatically get more cpu's... Think
>>>>>>> NUMA for a moment... ;^)
>>>>>> 
>>>>>> Can software use the extra CPUs ?
>>>>>> 
>>>>>> Also note: DRAMs are made on a P-Channel process (leakage) with only a few
>>>>>> layers of metal while CPUs are based on an N-Channel process (speed) with
>>>>>> many layers of metal.
>>>>> 
>>>>> Didn’t you work on the MC68000 which had one layer of metal?
>>>> 
>>>> Yes, but it was the 68020 and had polysilicide which we used as
>>>> a second layer of metal.
>>>> 
>>>> Mc88100 had 2 layers of metal and silicide.
>>>> 
>>>> The number of metal layers went about::
>>>> 1978: 1
>>>> 1980: 1+silicide
>>>> 1982: 2+silicide
>>>> 1988: 3+silicide
>>>> 1990: 4+silicide
>>>> 1995: 6
>>>> ..
>>>> 
>>>>> This could be fine if you are going for the AI market of slow AI cpu
>>>>> with huge memory and bandwidth.
>>>>> 
>>>>> The AI market is bigger than the general server market as seen in
>>>>> NVidia’s sales.
>>>>> 
>>>>>> Bus interconnects are not setup to take a CPU cache miss from one
>>>>>> DRAM to a different DRAM on behalf of its contained CPU(s).
>>>>>> {Chicken and egg problem}
>>>> 
>>>> Thus a problem with the CPU on DRAM approach.
>>> 
>>> It would be HIGHLY local wrt its processing units and its memory for
>>> they would all be one.
>>> 
>>> The programming for it would not be all that easy... It would be like a
>>> NUMA where a program can divide itself up and run parts of itself on
>>> each slot (aka memory-cpu hybrid unit card if you will). If a program
>>> can be embarrassingly parallel, well that would be great! The Cell
>>> processor comes to mind. But it failed. Shit.
>> 
>> Cell was in the PlayStation which Sony sold a huge number of and made
>> billions of dollars, so successful, not failed.
> 
> Touche! :^)
> 
> However, iirc, not all the games for it even used the SPE's. Instead 
> they used the PPC. I guess that might have been due to the "complexity" 
> of the programming? Not sure.

ALL games used the SPEs; the PPC was not fast enough for a AAA game.
An SPE is more powerful and flexible than a vertex shader on the graphics
chip.

>> I programmed for Cell, it was actually a nice architecture for what it did.
> 
> Iirc, you had to use DMA to communicate with the SPE's?

You have to build DMA lists for the graphics chip anyway; the SPEs are
just more of the same. Today the vertex shaders are on the graphics chip
instead of on the SPEs; same difference.
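
For anyone who has not built one, a DMA list is just a table of (address,
size) entries that the DMA engine walks, pulling each chunk into local
memory back to back. A minimal sketch of the idea, simulated in Python over
plain byte buffers. This is not the Cell SDK API: DmaElement, gather and
the buffer layout are hypothetical names made up for illustration. The
256 KiB figure is the size of an SPE local store.

```python
from dataclasses import dataclass

LOCAL_STORE_SIZE = 256 * 1024  # an SPE local store is 256 KiB


@dataclass
class DmaElement:
    """One list entry (hypothetical layout, not the real hardware format)."""
    src_offset: int  # offset into the simulated main memory
    size: int        # number of bytes to transfer


def gather(main_memory, dma_list, local_store, dst_offset=0):
    """Copy every listed chunk into local_store, packed contiguously.

    Returns the number of bytes written. Raises ValueError if an entry
    reads past main memory or the list overflows the local store.
    """
    cursor = dst_offset
    for element in dma_list:
        src_end = element.src_offset + element.size
        dst_end = cursor + element.size
        if src_end > len(main_memory):
            raise ValueError("DMA element reads past end of main memory")
        if dst_end > len(local_store):
            raise ValueError("DMA list overflows local store")
        local_store[cursor:dst_end] = main_memory[element.src_offset:src_end]
        cursor = dst_end
    return cursor - dst_offset


# Usage: gather two scattered chunks into a simulated local store.
memory = bytes(range(256))
store = bytearray(LOCAL_STORE_SIZE)
written = gather(memory, [DmaElement(16, 4), DmaElement(200, 2)], store)
```

The point of the list form is that the program describes all the
transfers up front and the engine does the walking, which is the same
pattern as feeding command lists to a graphics chip.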

>> If you think programming for AI is easy, I have news for you…
>> 
>> Those NVidia AI chips are at the brain damaged level for programming.
> 
> No shit? I was thinking along the lines of compute shaders in the GPU?
> 
> 
>> 10’s of billions of dollars are invested in this market.
>> 
>>> A system with a mother board that has slots for several GPUS (think
>>> crossfire) and slots for memory+CPU units. The kicker is that adding
>>> more memory gives you more cpus...
>>> 
>>> How crazy is this? Well, on a scale from:
>>> 
>>> Retarded to Moronic?
>>> 
>>> Pretty bad? Shit...
>>> 
>>> Shit man, remember all of the slots in the old Apple IIgs's?
>>> 
>>> ;^o
>>> 
>>> 
>>>> 
>>>>> Such a dram would be on the PCIE busses, and the main CPU’s would barely
>>>>> touch that ram, and the AI only searches locally.
>>>> 
>>>> Better make it PCIe+CXL so the downstream CPU is cache coherent.