From: John Savard <quadibloc@servername.invalid>
Newsgroups: comp.arch
Subject: Re: Short Vectors Versus Long Vectors
Date: Tue, 23 Apr 2024 19:25:22 -0600
Organization: A noiseless patient Spider
Message-ID: <hqmg2j1vbkf6suddfnsh3h3uhtkqqio4uk@4ax.com>
References: <v06vdb$17r2v$1@dont-email.me> <5451dcac941e1f569397a5cc7818f68f@www.novabbs.org>

On Tue, 23 Apr 2024 02:14:32 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

>CRAY machines stayed "in style" as long as memory latency remained smaller
>than the length of a vector (64 cycles) and fell out of favor when the cores
>got fast enough that memory could no longer keep up.
>
>I whish them well, but I expect it will not work out as they desire.....

I know that you've said this about Cray-style vectors. I had thought
the cause was much simpler.

As soon as chips like the 486 DX and then the Pentium II became
available, a Cray-style machine would have had to be implemented from
smaller-scale integrated circuits, so it would have been wildly
uneconomic for the performance it provided; it made much more sense to
use off-the-shelf microprocessors. Despite their theoretical
shortcomings in architectural terms compared to a Cray-style machine,
they offered vastly more FLOPS per dollar.
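The latency-versus-vector-length point quoted above can be put in
back-of-envelope form: a vector load of VL elements streams data for VL
cycles, so successive loads can cover the memory latency as long as that
latency does not exceed the vector length. A minimal sketch, with
illustrative numbers only (not the specs of any real machine):

```python
def vector_pipeline_utilization(vector_length, memory_latency):
    """Fraction of cycles the vector unit stays busy, assuming each
    vector load streams one element per cycle and any latency beyond
    the vector length is exposed as stall cycles."""
    busy = vector_length                            # cycles streaming data
    stall = max(0, memory_latency - vector_length)  # exposed latency
    return busy / (busy + stall)

# Cray-style regime: 64-element vectors, latency under 64 cycles.
print(vector_pipeline_utilization(64, 50))   # 1.0 -- latency fully hidden
# Fast modern core, same slow DRAM: latency now dwarfs the vector.
print(vector_pipeline_utilization(64, 400))  # 0.16 -- mostly stalled
```

This is of course a cartoon of one outstanding request at a time, but it
captures why the 64-element vectors stopped hiding latency once cores
outran memory.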
After all, the reason the Cray I succeeded where the STAR-100 failed
was that it had those big vector registers - so it did calculations on
a register-to-register basis, rather than on a memory-to-memory basis.
That doesn't make it immune to considerations of memory bandwidth, but
it does mean that it was designed correctly for the circumstance where
memory bandwidth is an issue.

So if you have the kind of calculation to perform that is suited to a
vector machine, wouldn't it still be better to use a vector machine
than a whole bunch of scalar cores with no provision for vectors? And
if memory bandwidth issues make Cray-style vector machines impractical,
then wouldn't it be even worse for GPUs?

There are ways to increase memory bandwidth. Use HBM. Use static RAM.
Use graphics DRAM. The vector CPU of the last gasp of the Cray-style
architecture, the NEC SX-Aurora TSUBASA, is even packaged like a GPU.

Also, the original Cray I did useful work with a memory no larger than
many L3 caches these days.

So a vector machine today wouldn't be as fast as it would be if it
could have, say, a 1024-bit wide data bus to a terabyte of DRAM. That
doesn't necessarily mean that such a CPU, even when throttled by memory
bandwidth, isn't an improvement over an ordinary CPU.

Of course, the question is: is it enough of an improvement? If most
problems anyone would want to use a vector CPU for today involve a
large amount of memory, used in a random fashion so as to fit poorly in
cache, then it might well be that memory bandwidth would mean that even
with a vector architecture well suited to doing a lot of work, the net
result would be only a slight improvement over what an ordinary CPU
could do with the same memory bandwidth.
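The "only a slight improvement" worry above is essentially a roofline
argument: when a kernel moves many bytes per floating-point operation,
attainable FLOPS is capped by bandwidth no matter how much vector peak
the machine has. A rough sketch, with all figures below being
illustrative assumptions rather than measurements of any real machine:

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Simple roofline bound: a kernel can't exceed either the machine's
    peak compute or bandwidth times its arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# A DAXPY-like streaming kernel: ~2 FLOPs per 24 bytes moved.
intensity = 2 / 24

# Wide vector engine vs. modest scalar core, same 100 GB/s memory:
print(attainable_gflops(1000, 100, intensity))  # ~8.3 -- bandwidth-bound
print(attainable_gflops(100, 100, intensity))   # ~8.3 -- same ceiling

# With HBM-class bandwidth, the vector unit's extra peak starts to pay.
print(attainable_gflops(1000, 1000, intensity))  # ~83
```

On low-intensity problems both machines hit the same bandwidth ceiling,
which is the pessimistic case; raising bandwidth (HBM, wide buses) is
what lets the vector unit's peak show through.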
I would think that a chip is still useful if it only provides an
improvement for some problems, and there are ways to increase memory
bandwidth beyond what ordinary CPUs offer. That makes it seem likely
that Cray-style vectors are still worth doing as a way to improve what
a CPU can do.

John Savard