Article <uuks6s$7p08$1@dont-email.me>

Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: "Mini" tags to reduce the number of op codes
Date: Wed, 3 Apr 2024 19:27:59 -0500
Organization: A noiseless patient Spider
Lines: 217
Message-ID: <uuks6s$7p08$1@dont-email.me>
References: <uuk100$inj$1@dont-email.me> <uukduu$4o4p$1@dont-email.me>
 <e915303b53f3b4099ff254a4dcdfbe17@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 04 Apr 2024 00:28:13 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7a85b7280e08e1d7944c412aa4f1d5d9";
	logging-data="254984"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/uAVUDSi6Y3+T5xBWHhAz+9+BnqJQEpZ8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:rRpZoXPUj62iOM64UWvqwo+7guo=
In-Reply-To: <e915303b53f3b4099ff254a4dcdfbe17@www.novabbs.org>
Content-Language: en-US
Bytes: 9209

On 4/3/2024 4:53 PM, MitchAlsup1 wrote:
> BGB-Alt wrote:
> 
>>
>> FWIW:
>> This doesn't seem too far off from what would be involved with dynamic 
>> typing at the ISA level, but with many of same sorts of drawbacks...
> 
> 
> 
>> Say, for example, top 2 bits of a register:
>>    00: Object Reference
>>      Next 2 bits:
>>        00: Pointer (with type-tag)
>>        01: ?
>>        1z: Bounded Array
>>    01: Fixnum (route to ALU)
>>    10: Flonum (route to FPU)
>>    11: Other types
>>      00: Smaller value types
>>        Say: int/uint, short/ushort, ...
>>      ...
> 
> So, you either have 66-bit registers, or you have 62-bit FP numbers ?!?
> This solves nobody's problems; not even LISP.
> 

Yeah, there is likely no way to make this worthwhile...
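
As a rough sketch of the objection, here is the quoted 2-bit layout modeled on a 64-bit register image. The names (TAG_*, make_tagged, split_tag, double_bits) are made up for illustration and do not come from any real ISA. With the tag taking the top 2 bits, only 62 bits of payload remain, so an arbitrary 64-bit double no longer fits:

```python
import struct

# Illustrative model only: 2-bit tag in bits 63..62 of a 64-bit word.
TAG_SHIFT = 62
PAYLOAD_MASK = (1 << TAG_SHIFT) - 1

TAG_OBJECT = 0b00  # object reference
TAG_FIXNUM = 0b01  # route to ALU
TAG_FLONUM = 0b10  # route to FPU
TAG_OTHER = 0b11   # other types

def make_tagged(tag, payload):
    """Pack a 2-bit tag and a 62-bit payload into one 64-bit word."""
    if payload & ~PAYLOAD_MASK:
        raise ValueError("payload does not fit in 62 bits")
    return (tag << TAG_SHIFT) | payload

def split_tag(word):
    """Return (tag, payload) for a 64-bit word."""
    return (word >> TAG_SHIFT) & 0b11, word & PAYLOAD_MASK

def double_bits(x):
    """Raw IEEE 754 binary64 bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]
```

Calling make_tagged(TAG_FLONUM, double_bits(2.0)) raises ValueError, since the bit pattern of 2.0 already uses bit 62. One would have to either widen the register or drop bits from the floating-point format.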




>> One issue:
>> Decoding based on register tags would mean needing to know the 
>> register tag bits at the same time the instruction is being decoded. 
>> In this case, one is likely to need two clock-cycles to fully decode 
>> the opcode.
> 
> Not good. But what if you don't know the tag until the register is 
> delivered from a latent FU, do you stall DECODE, or do you launch and 
> make the instruction queue element have to deal with all outcomes.
> 

It is likely that the pipeline would need to stall until results are 
available.

It is also likely that such a CPU would have a minimum effective latency 
of 2 or 3 clock cycles for *every* instruction (and probably 4 or 5 
cycles for memory load), in addition to requiring pipeline stalls.


>> ID1: Unpack instruction to figure out register fields, etc.
>> ID2: Fetch registers, specialize variable instructions based on tag bits.
> 
>> For timing though, one ideally doesn't want to do anything with the 
>> register values until the EX stages (since ID2 might already be tied 
>> up with the comparably expensive register-forwarding logic), but 
>> asking for 3 cycles for decode is a bit much.
> 
>> Otherwise, if one does not know which FU should handle the operation 
>> until EX1, this has its own issues. 
> 
> Real-friggen-ely
> 

These issues could be a deal-breaker for such a CPU.


>>                                     Or, possible, the FU's decide 
>> whether to accept the operation:
>>    ALU: Accepts operation if both are fixnum, FPU if both are Flonum.
> 
> What if IMUL is performed in FMAC, IDIV in FDIV,... Int<->FP routing is
> based on calculation capability. (Even the CDC 6600 performed int × in the
> FP × unit; this is not in Thornton's book, but came via conversation with a
> 6600 logic designer at Asilomar some time ago. All they had to do to get
> FP × to perform int × was disable 1 gate.)
> 

Then you have a mess...

So, one probably needs to sort it out before EX in any case.


>> But, a proper dynamic language allows mixing fixnum and flonum with 
>> the result being implicitly converted to flonum, but from the FPU's 
>> POV, this would effectively require two chained FADD operations (one 
>> for the Fixnum to Flonum conversion, one for the FADD itself).
> 
> That is a LANGUAGE problem not an ISA problem. SNOBOL allowed one to add
> a string to an integer and the string would be converted to int before.....
> 

If you have dynamic types in hardware in this way, then effectively the 
type-system mechanics switch from being a language issue to a hardware issue.
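
To make the mixed-type case concrete, here is a toy model of a tagged add. It is a sketch under made-up assumptions: values are (tag, payload) pairs, the operation names ("ALU.ADD", "FPU.CVT", "FPU.FADD") are hypothetical labels, and fixnum overflow is ignored. It only shows that the mixed case costs a conversion plus the add itself:

```python
# Illustrative model only: values are (tag, payload) pairs, not real hardware.
FIXNUM, FLONUM = "fixnum", "flonum"

def tagged_add(a, b):
    """Add two tagged values; return (result, list of FU operations used)."""
    (ta, va), (tb, vb) = a, b
    ops = []
    if ta == FIXNUM and tb == FIXNUM:
        # Both integer: a single ALU operation (overflow is ignored here).
        ops.append("ALU.ADD")
        return (FIXNUM, va + vb), ops
    # Mixed or all-flonum case: convert any fixnum operand first.
    if ta == FIXNUM:
        va = float(va)
        ops.append("FPU.CVT")
    if tb == FIXNUM:
        vb = float(vb)
        ops.append("FPU.CVT")
    ops.append("FPU.FADD")
    return (FLONUM, va + vb), ops
```

For example, tagged_add((FIXNUM, 2), (FLONUM, 0.5)) yields a flonum 2.5 after two chained FPU steps, whereas the all-fixnum case needs one ALU step.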


One may also end up with, say, a CPU that can run Scheme or JavaScript 
or similar, but likely couldn't run C without significant hassles.



>> Many other cases could get hairy, but to have any real benefit, the 
>> CPU would need to be able to deal with them. In cases where the 
>> compiler deals with everything, the type-tags become mostly moot (or 
>> potentially detrimental).
> 
> You are arguing that the added complexity would somehow pay for itself.
> I can't see it paying for itself.
> 

One either goes all in, or abandons the idea entirely.
There isn't really a middle option in this scenario (otherwise one just 
ends up with something that is bad at everything).

I was not saying it could work but, in a way, pointing out the issues 
that would likely make this unworkable.


Though, that said, there could be possible merit in a CPU core that 
could run a language like ECMAScript at roughly C-like speeds, even if 
it was basically unusable for pretty much anything else.

Though, for ECMAScript, one could also make a case for taking the 
SpiderMonkey option and largely abandoning the use of an integer ALU 
(instead running all of the integer math through the FPU; which could be 
modified to support bitwise integer operations and similar as well).
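
A minimal sketch of why this is plausible: a double represents every integer up to 2^53 exactly, and ECMAScript defines its bitwise operators by first converting each operand to a 32-bit integer (the ToInt32 operation). The function names below are mine, and this models only the language semantics, not any engine internals or FPU design:

```python
import math

def to_int32(x):
    """ECMAScript-style ToInt32 applied to a Python float."""
    if not math.isfinite(x):
        return 0                      # NaN and infinities map to 0
    n = int(x) % (1 << 32)            # truncate toward zero, then wrap mod 2^32
    return n - (1 << 32) if n >= (1 << 31) else n

def js_or(a, b):
    """Bitwise OR where both operands and the result are doubles."""
    return float(to_int32(a) | to_int32(b))
```

So an "integer" pipeline built on doubles would still need this wrap-to-32-bits step for the bitwise operators, which is presumably the sort of modification to the FPU being suggested.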


>> But, then, there is another issue:
>>    C code expects C type semantics to be respected, say:
>>      Signed int overflow wraps at 32 bits (sign extending);
> maybe
>>      Unsigned int overflow wraps at 32 bits (zero extending);
> maybe

I am dealing with some code that has a bad habit of breaking if integer 
overflows don't happen in the expected ways (say, the ROTT engine is 
pretty bad about this one...).
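
For reference, the wrapping behavior such code relies on can be modeled as below. Note that ISO C only guarantees the unsigned case (arithmetic modulo 2^N); signed overflow is undefined behavior in the standard, even though a lot of older code assumes two's complement wraparound. The helper names are made up for this sketch:

```python
MASK32 = 0xFFFFFFFF

def wrap_u32(x):
    """Wrap to 32 bits, zero-extended (unsigned int behavior)."""
    return x & MASK32

def wrap_s32(x):
    """Wrap to 32 bits, sign-extended (what such code expects of signed int)."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x
```

Here wrap_s32(0x7FFFFFFF + 1) gives -2147483648 and wrap_u32(0xFFFFFFFF + 1) gives 0, which is what code of this sort expects to see after an overflowing add.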

When I first started working on my ROTT port, there was also a lot of 
wackiness where the engine would go out of bounds, then behavior would 
depend on what other things in memory it encountered when it did so.


I have mostly managed to fix up all the out-of-bounds issues, but this 
isn't enough to keep the demos from desyncing (a similar issue applies 
with my Doom port).

Apparently, other engines like ZDoom and similar needed to do a bit of 
"heavy lifting" to get the demos from all of the various WAD versions to 
play without desync; as Doom was also dependent on the behavior of 
out-of-bounds memory accesses, these had to be turned into in-bounds 
accesses (to larger memory objects), with the memory contents of the 
out-of-bounds accesses being faked.
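
The general idea can be sketched as follows. This is only an illustration of the technique as described, not how ZDoom or any other port actually implements it; the class name and the fake tail values are invented, and a real port would need to reproduce whatever the original binary happened to have in memory past the end of the array:

```python
class PaddedTable:
    """Array that answers a limited range of out-of-bounds reads from a fake tail."""

    def __init__(self, data, fake_tail):
        self.size = len(data)                      # logical (original) size
        self.cells = list(data) + list(fake_tail)  # enlarged backing store

    def read(self, index):
        # Reads past the logical end land in the faked region instead of
        # in unrelated memory; anything beyond that is still an error.
        if not 0 <= index < len(self.cells):
            raise IndexError(f"index {index} is outside even the padded region")
        return self.cells[index]
```

With PaddedTable([1, 2, 3], [99, 98]), reading index 4 returns the faked value 98 deterministically, so playback no longer depends on what else happens to be in memory.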

Of course, the other option is just to "fix" the out-of-bounds accesses, 
and live with a port where the demo playback desyncs.



Meanwhile, Quake entirely avoided this issue:
The demo playback is based on recording the location and orientation of 
the player and any enemies at every point in time, rather than on 
recording and replaying the original sequence of keyboard inputs (and 
assuming that everything always happens exactly the same way each time).
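
A toy comparison of the two approaches, assuming a one-dimensional "game" where a hypothetical speed parameter stands in for any change in engine behavior. This is not either engine's actual demo format, just the distinction between replaying inputs and replaying recorded state:

```python
def step(pos, key, speed):
    """Toy simulation: move left or right by `speed` units per tick."""
    return pos + {"L": -speed, "R": speed}.get(key, 0)

def record(keys, speed):
    """Play through `keys` once; return (input demo, state demo)."""
    pos, states = 0, []
    for k in keys:
        pos = step(pos, k, speed)
        states.append(pos)
    return list(keys), states

def replay_inputs(input_demo, speed):
    """Input-style playback: re-run the simulation from the recorded keys."""
    pos, out = 0, []
    for k in input_demo:
        pos = step(pos, k, speed)
        out.append(pos)
    return out

def replay_states(state_demo):
    """State-style playback: just show the recorded positions."""
    return list(state_demo)
```

If the demo is recorded with speed 2 and the inputs are replayed with speed 3, the positions diverge (a desync), while the state-based playback shows the same positions regardless of how the simulation has changed.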


Then again, these sorts of issues are not unique to these games. I have 
watched more than a few speed-runs that use glitches either to leave the 
playable parts of the map, or that use convoluted sequences of actions to 
corrupt memory in such a way as to achieve a desired effect (such as 
triggering a warp to the end of the game).

Like, during normal gameplay, these games are seemingly just sorta 
corrupting memory all over the place but, for the most part, no one 
========== REMAINDER OF ARTICLE TRUNCATED ==========