Path: ...!weretis.net!feeder6.news.weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Short Vectors Versus Long Vectors
Date: Wed, 24 Apr 2024 09:18:56 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 34
Message-ID: <2024Apr24.111856@mips.complang.tuwien.ac.at>
References: <v06vdb$17r2v$1@dont-email.me> <5451dcac941e1f569397a5cc7818f68f@www.novabbs.org> <hqmg2j1vbkf6suddfnsh3h3uhtkqqio4uk@4ax.com> <5ad43f26367ef2d5e8b3c298511ddf45@www.novabbs.org> <j9ah2jl3oosp9ggvdkskqai9m4nme4qkb4@4ax.com>
Injection-Date: Wed, 24 Apr 2024 11:27:45 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="68f13f15e74c6cc1e6ed32f2711e82b5";
	logging-data="2368321"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18RIXt8vp21LoR5rNzaXU8O"
Cancel-Lock: sha1:LWPTqAIGYqQT5lq/98zTy0ALvZo=
X-newsreader: xrn 10.11
Bytes: 2572

John Savard <quadibloc@servername.invalid> writes:
>On Wed, 24 Apr 2024 02:00:10 +0000, mitchalsup@aol.com (MitchAlsup1)
>wrote:
>
>>Everyone has to have hope on something.
>
>But false hopes are a waste of time.
>
>The reason for my interest in long vectors is primarily because I
>imagine that, if the Cray I was an improvement on the IBM System/360
>Model 195, then, apparently, today a chip  like the Cray I would be
>the next logical step after the Pentium II (OoO plus cache, just like
>a Model 195).

But the Cray-1 is not an improvement on the Model 195.  It has no
cache.  Neither the Cray-1 nor the Model 195 has OoO as the term is
commonly understood today: OoO execution, in-order completion,
allowing register renaming, speculative execution, and precise
exceptions.  One may consider the Model 91/195 a predecessor of
today's OoO, because it supports register renaming, and you "just"
need to add a reorder buffer to get in-order completion and
speculative execution.

>Well, apparently they do things like multiply 2048 by 2048 matrices.
>Which is why they need stride.

You can multiply dense matrices of any size efficiently with stride 1.
And caches help a lot for matrix multiply; in HPC circles, (dense)
matrix multiply is known as a cache-friendly problem.
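
To illustrate the stride-1 point, here is a minimal sketch in C,
assuming square row-major matrices of doubles.  The function names and
the tile size BS are made up for this example and are not tuned for any
particular machine.  The first routine uses loop order i-k-j, so the
innermost loop walks a row of B and a row of C with stride 1.  The
second adds tiling, which is one common way to make the working set of
the inner loops fit in cache.

```c
#include <stddef.h>

/* Hypothetical tile size, chosen only for illustration. */
#define BS 64

/* C += A * B for n x n row-major matrices, loop order i-k-j.
   The innermost loop reads b[k*n + j] and updates c[i*n + j] for
   consecutive j, so all inner-loop memory accesses are stride 1. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];   /* loop-invariant scalar */
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}

/* Same computation, tiled into BS x BS blocks so that the parts of
   A, B and C touched by the three inner loops are reused while they
   are still likely to be in cache.  Inner accesses remain stride 1. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < n; ii += BS) {
        size_t imax = ii + BS < n ? ii + BS : n;
        for (size_t kk = 0; kk < n; kk += BS) {
            size_t kmax = kk + BS < n ? kk + BS : n;
            for (size_t jj = 0; jj < n; jj += BS) {
                size_t jmax = jj + BS < n ? jj + BS : n;
                for (size_t i = ii; i < imax; i++) {
                    for (size_t k = kk; k < kmax; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < jmax; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both routines accumulate into C, so the caller has to zero C first if a
plain product is wanted.  Neither needs a strided vector load: the only
non-unit step is the scalar a[i*n + k], which is loaded once per inner
loop.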

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>