<vue615$2b0tt$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Misc (semi OT): Well, distractions...
Date: Thu, 24 Apr 2025 15:10:29 -0500
Organization: A noiseless patient Spider
Lines: 474
Message-ID: <vue615$2b0tt$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 24 Apr 2025 22:13:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="b27e0e55457513abd1cd51bd79dfe873";
	logging-data="2458557"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18jBwp6m6eGYUYyW+s9uxT5/2jL0WByClc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:5g7JNaZz7hzsku8BnIT73ZJQwes=
Content-Language: en-US
Bytes: 21460

So, recently, have been distracted on other stuff.
My ISA project has mostly stabilized, and the main things remaining 
there are basically going right up the side of a mountain in terms of 
difficulty curve. Like, as quickly as "low hanging fruit" tasks get 
done, whatever is left becomes progressively more difficult (and 
pushing into "more than just a toy" areas will likely require far more 
work than I can pull off in a reasonable timeframe).


I was also starting to miss tinkering around with 3D engine stuff some, 
so, yeah...


But, a recent line of fiddling has gone in an odd direction.
   I felt a need for a basic script language for some tasks;
   I wanted to keep code footprint low;
     My past script interpreters had a lot more code footprint.
   I didn't want any external dependencies.
     Using something like Lua would mean a dependency on a library.
     Despite claims of being "small", Lua is still quite large...
       Like, it is "small" if compared with Python or V8 or something.
       But, still like around twice as much code as the Doom engine...


So, I went for a BASIC variant:
It is possible to implement a BASIC interpreter with fairly little code.
Or, at least, for an 80s style dialect.
There was some stuff in early BASICs that did not make sense for use in 
a 3D engine, so I left it out.

The core dialect was a fairly limited one:
   i=0
   label:
   i=i+1
   if i<10 goto label  //THEN is optional with GOTO

So, no real loops or other block-structured control-flow, as these would 
essentially require a more complex interpreter.

I ended up making 'THEN' the optional keyword here, rather than 'GOTO' 
as "IF cond GOTO label" is unambiguous, whereas "IF cond THEN label" 
requires figuring out whether it is a label or a command.

For now, if one wants a block structured IF/THEN/ELSE/ENDIF, they need 
to build it themselves with GOTO.



There is GOSUB/RETURN, but this is essentially a GOTO that remembers 
the return path on a small internal stack. So RETURN effectively goes 
to the line following the location where GOSUB was called.
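
As a rough sketch of that mechanism (not the actual interpreter code; `Interp`, `do_gosub`, `do_return`, and the `MAX_CALLS` limit are all made-up names and numbers for illustration), GOSUB/RETURN could look something like this in C:

```c
#include <assert.h>

#define MAX_CALLS 64   /* assumed depth limit for the small internal stack */

/* Hypothetical interpreter state: current line plus a return stack. */
typedef struct {
    int pc;                    /* index of the line being executed */
    int ret_stack[MAX_CALLS];  /* saved return lines */
    int ret_top;               /* number of entries in use */
} Interp;

/* GOSUB: remember the line after the call, then jump like a GOTO. */
static void do_gosub(Interp *it, int target_line)
{
    assert(it->ret_top < MAX_CALLS);
    it->ret_stack[it->ret_top++] = it->pc + 1;
    it->pc = target_line;
}

/* RETURN: resume at the line following the matching GOSUB.
   Returns 0 if there is no pending GOSUB to return to. */
static int do_return(Interp *it)
{
    if (it->ret_top == 0)
        return 0;
    it->pc = it->ret_stack[--it->ret_top];
    return 1;
}
```

Nested calls just push more entries, so each RETURN unwinds to the most recent GOSUB.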

General interpreter structure works by reading in the code, and breaking 
it into lines and tokens.
Interpreter then walks code line-by-line, directly driving logic based 
on the tokens it encounters.
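
To give an idea of what "directly driving logic based on the tokens" can look like, here is a minimal sketch in C. It is an assumption-heavy toy, not the real interpreter: tokens are supplied already split, keywords are assumed lowercase, only global integer variables exist, and expressions are limited to "a" or "a OP b" with OP being one of + - <. All names (`TokLine`, `Env`, `run`, and so on) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOK  8    /* tokens per line; unused slots are NULL */
#define MAX_VARS 16

typedef struct { const char *tok[MAX_TOK]; } TokLine;
typedef struct { const char *name; int value; } Var;
typedef struct { Var vars[MAX_VARS]; int nvars; } Env;

/* Find a global variable, creating it (as 0) on first use. */
static int *var_ref(Env *env, const char *name)
{
    for (int i = 0; i < env->nvars; i++)
        if (strcmp(env->vars[i].name, name) == 0)
            return &env->vars[i].value;
    assert(env->nvars < MAX_VARS);
    env->vars[env->nvars].name = name;
    env->vars[env->nvars].value = 0;
    return &env->vars[env->nvars++].value;
}

/* A token is either an integer literal or a variable name. */
static int operand(Env *env, const char *t)
{
    return isdigit((unsigned char)t[0]) ? atoi(t) : *var_ref(env, t);
}

/* Evaluate "a" or "a OP b" starting at t[0]. */
static int eval(Env *env, const char *const *t)
{
    int a = operand(env, t[0]);
    if (!t[1] || !t[2])
        return a;
    int b = operand(env, t[2]);
    switch (t[1][0]) {
    case '+': return a + b;
    case '-': return a - b;
    case '<': return a < b;
    default:  return a;
    }
}

/* A label is a line of the form: name ':' */
static int find_label(const TokLine *prog, int n, const char *name)
{
    for (int i = 0; i < n; i++) {
        const char *const *t = prog[i].tok;
        if (t[0] && t[1] && strcmp(t[1], ":") == 0 && strcmp(t[0], name) == 0)
            return i;
    }
    return -1;
}

/* Walk the program line by line, dispatching on the leading tokens. */
static void run(const TokLine *prog, int n, Env *env)
{
    int pc = 0;
    while (pc >= 0 && pc < n) {   /* an unknown label (-1) stops the run */
        const char *const *t = prog[pc].tok;
        int next = pc + 1;
        if (t[0] && t[1] && strcmp(t[0], "if") == 0) {
            /* IF a OP b GOTO label (THEN omitted) */
            if (eval(env, t + 1) && t[4] && t[5] && strcmp(t[4], "goto") == 0)
                next = find_label(prog, n, t[5]);
        } else if (t[0] && t[1] && t[2] && strcmp(t[1], "=") == 0) {
            *var_ref(env, t[0]) = eval(env, t + 2);
        }
        /* label lines and anything unrecognized just fall through */
        pc = next;
    }
}
```

Feeding it the four-line counting loop from the core dialect example leaves `i` at 10. Note that there is no parse tree anywhere, which is where the small footprint comes from, and also why block structure is awkward to add.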

Not really efficient, but smaller code footprint.
I was torn between "tokenize then interpret", or "leave code as big blob 
of ASCII text and then parse each line one at a time". I ended up going 
with pre-tokenization as it added a few perks, is slightly less slow, 
and possibly even helped overall regarding code footprint (while the 
initial loading step is more code, the rest of the parsing logic is less 
code).
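
For illustration, a pre-tokenization step along these lines could be sketched as below. This is a guess at the general shape only; the fixed-size `Line` struct, the limits, and the helper names are assumptions, and the real tokenizer presumably also handles strings, floats, and multi-character operators.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_TOKENS    32
#define MAX_TOKEN_LEN 32

/* Hypothetical pre-tokenized line: a fixed array of short strings. */
typedef struct {
    char tok[MAX_TOKENS][MAX_TOKEN_LEN];
    int count;
} Line;

/* Split one source line into tokens: names/numbers (runs of alphanumerics
   and '_') and single-character punctuation. A "//" comment ends the line.
   Overlong tokens are truncated. Returns the token count. */
static int tokenize_line(const char *src, Line *out)
{
    assert(src && out);
    out->count = 0;
    while (*src && out->count < MAX_TOKENS) {
        if (isspace((unsigned char)*src)) { src++; continue; }
        if (src[0] == '/' && src[1] == '/') break;

        char *dst = out->tok[out->count];
        int n = 0;
        if (isalnum((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)*src) || *src == '_') {
                if (n < MAX_TOKEN_LEN - 1) dst[n++] = *src;
                src++;
            }
        } else {
            dst[n++] = *src++;
        }
        dst[n] = '\0';
        out->count++;
    }
    return out->count;
}

/* A line of the form "name:" declares a label. */
static int line_is_label(const Line *ln)
{
    return ln->count == 2 && strcmp(ln->tok[1], ":") == 0;
}
```

Once every line is in this form, checks like "is this a label?" or "is the second token '='?" become simple string compares on a token index, rather than re-scanning raw text on every pass through a loop.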



Initially, I was using global-only scoping, but it doesn't take long to 
realize that only having global variables raises problems:
Every piece of code needs to use its own variable names to not stomp 
code elsewhere;
Recursive code is essentially impossible.

Most lazy solution to this:
   Add dynamic scoping.
     TEMP i=3
   Creates 'i' only within the scope of the current GOSUB.
   RETURN happens, variable goes away.
   Inner scopes mask outer scopes, without interfering;
   This allows recursion, and some amount of sanity regarding variables.

Dynamic scoping is, however, not the direction that the other BASICs 
went. For example, QBASIC had gone over to block-structuring and lexical 
scoping. However, for these, would likely need a "real" parser, and not 
just something that works by reading lines one at a time with an option 
of GOTO.

The other option would have been a LOCAL keyword being used for dynamic 
variables.


But, then I realized I wanted return values in some cases, so:
   x = GOSUB label
   ...
   label:
   ...
   RETURN expr

Works, but is unorthodox (also breaks the function/subroutine 
distinction that is present in most other BASICs; but mostly absent in C 
family languages).

However, the GOSUB here is not a true expression, so:
   x = (GOSUB label) + 1
Is not valid.


Well, and then another bit of wonk:
   t = GOSUB label x=3, y=4

Which works by creating x and y within the callee's frame, so can pass 
arguments. Also: was the laziest option given the existing 
implementation. But, is kinda weird...

Namely, caller doesn't need to fetch the callee function or know 
anything about its argument list, as it is merely binding each variable 
within the callee's dynamic environment frame.

Dynamic scoping was less code in effect because it was merely a stack, 
and call/return marks off the stack position (along with the internal 
line number and similar).
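
A minimal sketch of the "merely a stack" idea, in C. The names (`DynEnv`, `bind_temp`, `enter_call`, `leave_call`) and the fixed limits are assumptions for illustration; values are plain doubles here, where the real interpreter presumably has a richer value type.

```c
#include <assert.h>
#include <string.h>

#define MAX_BINDINGS 256
#define MAX_FRAMES   64

typedef struct { const char *name; double value; } Binding;

typedef struct {
    Binding stack[MAX_BINDINGS];  /* all live variables, newest last */
    int top;
    int frame_mark[MAX_FRAMES];   /* value of 'top' at each GOSUB */
    int return_line[MAX_FRAMES];  /* where each RETURN should resume */
    int depth;
} DynEnv;

/* TEMP name=value (or a GOSUB argument): push a new binding, which
   shadows any older binding with the same name. */
static void bind_temp(DynEnv *e, const char *name, double value)
{
    assert(e->top < MAX_BINDINGS);
    e->stack[e->top].name = name;
    e->stack[e->top].value = value;
    e->top++;
}

/* Lookup searches newest to oldest, so inner scopes mask outer ones.
   Returns NULL if the name is not bound anywhere. */
static double *lookup(DynEnv *e, const char *name)
{
    for (int i = e->top - 1; i >= 0; i--)
        if (strcmp(e->stack[i].name, name) == 0)
            return &e->stack[i].value;
    return NULL;
}

/* GOSUB: mark the stack position and remember where to resume. */
static void enter_call(DynEnv *e, int return_line)
{
    assert(e->depth < MAX_FRAMES);
    e->frame_mark[e->depth] = e->top;
    e->return_line[e->depth] = return_line;
    e->depth++;
}

/* RETURN: drop everything bound since the matching GOSUB and
   hand back the line to resume at. */
static int leave_call(DynEnv *e)
{
    assert(e->depth > 0);
    e->depth--;
    e->top = e->frame_mark[e->depth];
    return e->return_line[e->depth];
}
```

Under this scheme, something like "t = GOSUB label x=3, y=4" would amount to `enter_call()` followed by one `bind_temp()` per argument, which is why the caller never needs to look at the callee's argument list.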



Well, and added vectors:
   v0 = (vec 1,2,3)
Which, possibly, don't really fit in with BASIC.

But, interpreter is still around 1500 lines of C, and would likely have 
been bigger if I did this stuff "properly".


Though, now I am starting to second guess myself and wonder if it might 
have ended up being less code at this point to implement a small version 
of a language similar to Emacs Lisp.

Well, since at some point "parse tokens into S-Expressions and then walk 
the S-Expression lists" becomes less code than "read line, break into 
tokens, and walk over tokens and dispatch logic based on said tokens" 
once one moves past a certain level of triviality.
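
To make the comparison a bit more concrete, here is roughly how small the S-Expression route can be. This is a toy sketch in C under heavy assumptions (only numbers and the operators + and *, atoms truncated to 15 characters, no error reporting), with hypothetical names throughout; it is not a claim about what an actual Lisp-like replacement would look like.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

typedef struct Node {
    int is_list;
    char atom[16];        /* atom text, when !is_list */
    struct Node *head;    /* first child, when is_list */
    struct Node *next;    /* next sibling in the enclosing list */
} Node;

/* Parse one expression at *s, advancing *s past it.
   Returns NULL at end of input or at a closing paren. */
static Node *parse(const char **s)
{
    assert(s && *s);
    while (isspace((unsigned char)**s)) (*s)++;
    if (**s == '\0' || **s == ')') return NULL;
    Node *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    if (**s == '(') {
        (*s)++;
        n->is_list = 1;
        Node **tail = &n->head;
        while ((*tail = parse(s)) != NULL)
            tail = &(*tail)->next;
        if (**s == ')') (*s)++;
    } else {
        size_t k = 0;
        while (**s && !isspace((unsigned char)**s) && **s != '(' && **s != ')') {
            if (k < sizeof n->atom - 1) n->atom[k++] = **s;
            (*s)++;
        }
    }
    return n;
}

/* Walk the tree: atoms are numbers, lists are (op arg...). */
static double eval(const Node *n)
{
    if (!n) return 0;
    if (!n->is_list) return atof(n->atom);
    const Node *op = n->head;
    if (!op || op->is_list) return 0;
    double acc = (op->atom[0] == '*') ? 1 : 0;
    for (const Node *arg = op->next; arg; arg = arg->next) {
        double v = eval(arg);
        if (op->atom[0] == '+') acc += v;
        else if (op->atom[0] == '*') acc *= v;
    }
    return acc;
}

static void free_tree(Node *n)
{
    while (n) {
        Node *next = n->next;
        if (n->is_list) free_tree(n->head);
        free(n);
        n = next;
    }
}
```

The point being that nesting, and thus true expressions and block structure, falls out of the recursion for free, whereas the line-and-token approach needs special handling for each case.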

And, with "vec", the difference between "(vec 1,2,3)" and "(vec 1 2 3)" is 
not exactly large. I had also been half tempted to use [1,2,3] syntax, 
as it was going the route (like in OpenSCAD) that arrays and vectors are 
basically the same thing.

In this case, the interpreter infers the operation. If you do:
   vec3=vec1+vec2
and vec1 and vec2 are arrays of the same length, it does a vector 
operation.
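
A sketch of that kind of dispatch in C, assuming a simple tagged value type. The `Value` layout, the `MAX_ELEMS` cap, and the choice to report a mismatch by returning 0 are all assumptions; what the actual interpreter does for mismatched lengths is not stated here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ELEMS 8   /* assumed cap on array length for this sketch */

typedef enum { VAL_NUM, VAL_ARRAY } ValKind;

/* Hypothetical tagged value: either a scalar or a short array. */
typedef struct {
    ValKind kind;
    double num;              /* used when kind == VAL_NUM */
    double elem[MAX_ELEMS];  /* used when kind == VAL_ARRAY */
    size_t len;
} Value;

/* '+' on two values. Scalars add as usual; two arrays of the same
   length add element-wise (the vector case). Returns 1 on success,
   0 if the operand kinds or lengths do not match. */
static int value_add(const Value *a, const Value *b, Value *out)
{
    assert(a && b && out);
    if (a->kind == VAL_NUM && b->kind == VAL_NUM) {
        out->kind = VAL_NUM;
        out->num = a->num + b->num;
        out->len = 0;
        return 1;
    }
    if (a->kind == VAL_ARRAY && b->kind == VAL_ARRAY && a->len == b->len) {
        out->kind = VAL_ARRAY;
        out->num = 0;
        out->len = a->len;
        for (size_t i = 0; i < a->len; i++)
            out->elem[i] = a->elem[i] + b->elem[i];
        return 1;
    }
    return 0;
}
```

Since the check happens at the operator, there is no separate vector type to declare; an array of the right length simply behaves as a vector.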

....


Well, and I end up trying to implement an 80s style BASIC, and within a 
week it already starts mutating into some sort of weird hybrid of 80s 
style BASIC and Emacs Lisp...

Well, and to add to this, I had also considered adding CSG operations, 
to allow using it in a vaguely similar way to OpenSCAD, for 3D modeling 
uses.
   box1=(csgaabb (vec -10,-10,0),(vec 10,10,4))
   box2=(csgaabb (vec -1,-1,4),(vec 1,1,40))
   base1=(csgunion box1,box2)
   ...

But, again, this wouldn't be that different from, say:
   (let box1 (csgaabb (vec -10 -10 0) (vec 10 10 4)))
   (let box2 (csgaabb (vec -1 -1 4) (vec 1 1 40)))
   (let base1 (csgunion box1 box2))

Though, with a Lisp variant, would have likely ended up with more proper 
functions by this point:
   (defun func (x y) (+ x y))
....

But, errm...


Though, can note that in G-Code, there was a trick that can be used to 
encode block-structuring without the need to break from line-by-line 
========== REMAINDER OF ARTICLE TRUNCATED ==========