From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Cost of handling misaligned access
Date: Tue, 4 Feb 2025 18:58:43 +0000
Organization: Rocksolid Light
Message-ID: <12d9d216c9a094ef963217baa35793e9@www.novabbs.org>
References: <5lNnP.1313925$2xE6.991023@fx18.iad> <2025Feb2.184458@mips.complang.tuwien.ac.at> <vnocer$q8bq$1@dont-email.me> <vnr7f2$1egqs$1@dont-email.me> <vnrb15$105p$1@gal.iecc.com> <112ffb344782247afc7b5e9e36c085d5@www.novabbs.org> <s1hoP.141118$qu83.118261@fx35.iad>

On Tue, 4 Feb 2025 4:49:57 +0000, EricP wrote:

> MitchAlsup1 wrote:
>>
>> Basically, VAX taught us why we did not want to do "all that" in
>> a single instruction; while Intel 432 taught us why we did not want
>> bit-aligned decoders (and a lot of other things).
>
> In case people are interested...
>
> [paywalled]
> The Instruction Decoding Unit for the VLSI 432 General Data Processor,
> 1981
> https://ieeexplore.ieee.org/abstract/document/1051633/
>
> The benchmarks in table 1(a) below tell it all:
> a 4 MHz 432 runs at 1/15 to 1/20 the speed of a 5 MHz VAX/780, and
> 1/4 to 1/7 the speed of an 8 MHz 68000 or 5 MHz 8086
>
> A Performance Evaluation of The Intel iAPX 432, 1982
> https://dl.acm.org/doi/pdf/10.1145/641542.641545
>
> And the reasons are covered here:
>
> Performance Effects of Architectural Complexity in the Intel 432, 1988
> https://www.princeton.edu/~rblee/ELE572Papers/Fall04Readings/I432.pdf

From the link:
The 432’s procedure calls are quite costly. A typical procedure call
requires 16 read accesses to memory and 24 write accesses, and it
consumes 982 machine cycles. In terms of machine cycles, this makes
it about ten times as slow as a call on the MC68010 or VAX 11/780.

Almost 1000 cycles just to call a subroutine!
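
To put those figures in perspective, here is a rough back-of-the-envelope
sketch. The read, write, and cycle counts come from the quoted passage;
the 4 MHz clock is an assumption borrowed from the benchmark comparison
quoted earlier, and the paper's own measurements may have used a
different clock. The variable names are made up for illustration.

```python
# Figures quoted from the paper for a typical 432 procedure call.
READS = 16
WRITES = 24
CALL_CYCLES = 982

# Assumption: the 4 MHz clock mentioned in the benchmark comparison.
CLOCK_HZ = 4_000_000

# Total memory accesses per call.
accesses = READS + WRITES

# Cycles averaged over all accesses in the call. This is a crude ratio,
# not a memory latency: much of the time goes to work other than memory.
cycles_per_access = CALL_CYCLES / accesses

# Wall-clock cost of one call, in microseconds, at the assumed clock.
call_time_us = CALL_CYCLES / CLOCK_HZ * 1e6

# Upper bound on calls per second if the CPU did nothing but call.
calls_per_second = CLOCK_HZ / CALL_CYCLES

# "About ten times as slow" implies roughly this many cycles per call
# on an MC68010 or VAX 11/780.
implied_other_cycles = CALL_CYCLES / 10

print(f"{accesses} memory accesses per call")
print(f"{cycles_per_access:.1f} cycles per access on average")
print(f"{call_time_us:.1f} us per call at {CLOCK_HZ / 1e6:.0f} MHz")
print(f"at most {calls_per_second:.0f} calls per second")
print(f"roughly {implied_other_cycles:.0f} cycles per call on the other machines")
```

Under that clock assumption a single call costs about a quarter of a
millisecond, so a few thousand calls per second would saturate the
processor.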

Lots of things the architects got wrong in there...

>
> Bob Colwell, one of the authors of the third paper, later joined
> Intel as a senior architect and was involved in the development of the
> P6 core used in the Pentium Pro, Pentium II, and Pentium III
> microprocessors; designs derived from it are used in the Pentium M,
> Core Duo and Core Solo, and Core 2.