<vufvl0$3v2nq$1@dont-email.me>

Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Robert Finch <robfi680@gmail.com>
Newsgroups: comp.arch
Subject: Re: Misc (semi OT): Well, distractions...
Date: Fri, 25 Apr 2025 08:36:48 -0400
Organization: A noiseless patient Spider
Lines: 243
Message-ID: <vufvl0$3v2nq$1@dont-email.me>
References: <vue615$2b0tt$1@dont-email.me>
 <3e78dbf087fceab8acc676da77f7f36f@www.novabbs.org>
 <vuev2e$33r49$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 25 Apr 2025 14:36:49 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="ae59ea9238ec9489f54ffee451ba800f";
	logging-data="4164346"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+unpamNffs9I5hEHaENoVxuKZnUTfg5Zs="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:JXaBOk7iwND/AbtmvuCmCwdBFTg=
Content-Language: en-US
In-Reply-To: <vuev2e$33r49$1@dont-email.me>
Bytes: 10595

On 2025-04-24 11:17 p.m., BGB wrote:
> On 4/24/2025 6:00 PM, MitchAlsup1 wrote:
>> On Thu, 24 Apr 2025 20:10:29 +0000, BGB wrote:
>>
>> ----------
>>>
>>>
>>> But, a recent line of fiddling has gone in an odd direction.
>>>    I felt a need for a basic script language for some tasks;
>>
>> I have been doing something similar--except I wrote my script
>> translator in eXcel.
>>
> 
> Errm?...
> 
>> I use it to read *.h files and spit out *.c files that translate
>> type mathfunction(type arguments) into a series of lines Brian's
>> compiler interprets as transcendental instructions in My 66000
>> ISA.
>>
>> So, code contains the prototype::
>>
>> extern type_r recognized_name( type_a1 name );
>>
>> and my scripter punts out::
>>
>> type_r recognized_spelling( type_a name )
>> {
>>      register type_a R1 __asm__("R1") = name;
>>      __asm__("instruction_spelling\t%0,%1" : "=r" (R1) : "r" (R1) );
>>      return R1;
>> }
>>
>> So when user codes (in visibility of math.h)
>>
>>      y = sinpi( x );
>>
>> compiler spits out:
>>
>>      SINPI    Ry,Rx
> 
> OK.
> 
> 
> 
> In my case, the script interpreter is written in C.
>    Code needed to get the core interpreter working: Around 1000 lines;
>    Code needed after adding more stuff, around 1500 lines.
> Though, not counting the ~600 lines needed mostly for the dynamic
> type-system and similar.
> 
> A vaguely similar design was implemented inside the TestKern shell, but 
> was written to make use of BGBCC extensions. For this case, needed to 
> write something that would also work in MSVC and GCC. Core design for 
> the dialect was still similar though (and chose BASIC as a base partly 
> because I already knew I could get something that was fairly usable with 
> comparably small code).
> 
> 
> Language sort of looks like:
>    //comment, contents entirely ignored by parser
>    rem stuff  //also comment, but subject to token rules
>    x=a+b      //basic assignment
>    let x=a+b  //also assignment, creates vars in global scope
>    temp y=a+b //similar to let, but dynamically scoped
>    x=a*b+c*d  //does compound statements with a normal-ish precedence.
>    x=12345    //integer literal, decimal
>    x=0x1234   //hexadecimal
>    x="string"  //string, uses C style escapes
>    dim a(128)  //creates a global array
>    if x<10 goto label
>    label:
>    goto label   //goto
>    gosub label  //subroutine call to label
>    return       //return from most recent gosub
>    end          //script terminates
>    print stuff  //print stuff to console
>    x=arr(i)     //load from array
>    arr(i)=x     //store to array
> 
> Atypical stuff:
>    Dynamically typed;
>      Traditional BASIC used suffixes to encode type.
>      With no-suffix typically for a default numeric type.
>      QBasic and Visual Basic using static types.
>    Dynamically scoped;
>      Like Emacs Lisp.
>      Callee can see variables in the caller;
>      Variables can be created that do not affect caller.
>    ...
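
A minimal C sketch of the dynamic scoping described above, using a stack of name/value bindings. This is not code from the interpreter under discussion: every name here (bind_temp, assign, lookup, scope_enter, scope_leave) is hypothetical, and what a plain assignment does to an unbound name is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; a sketch, not the interpreter's code. */
#define MAX_BINDINGS 256

typedef struct { const char *name; long value; } Binding;

static Binding bind_stack[MAX_BINDINGS];
static int bind_top = 0;            /* number of live bindings */

/* Find the innermost binding of a name, or NULL if it is unbound. */
static Binding *lookup(const char *name) {
    for (int i = bind_top - 1; i >= 0; i--)
        if (strcmp(bind_stack[i].name, name) == 0)
            return &bind_stack[i];
    return NULL;
}

/* "temp x=v": push a new binding that shadows any outer one. */
static void bind_temp(const char *name, long value) {
    assert(bind_top < MAX_BINDINGS);
    bind_stack[bind_top].name = name;
    bind_stack[bind_top].value = value;
    bind_top++;
}

/* "x=v": update the innermost visible binding; creating one when the
   name is unbound is an assumption of this sketch. */
static void assign(const char *name, long value) {
    Binding *b = lookup(name);
    if (b) b->value = value;
    else bind_temp(name, value);
}

/* On gosub, remember the stack height; on return, drop the callee's temps. */
static int scope_enter(void) { return bind_top; }
static void scope_leave(int mark) { bind_top = mark; }
```

Because lookup walks the whole stack from the top, a callee sees the caller's variables, while anything it pushes with bind_temp disappears at scope_leave and so does not affect the caller.
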
> 
> Atypical syntax:
>    x = gosub label a=3, b=4  //gosub with return values and parameters.
>    return expr    //return with expression
>    v=(vec 1,2,3)  //vector type
>    m=(vec (vec 1,0,0),(vec 0,1,0),(vec 0,0,1)) //poor man's matrix
>    ...
> 
> Precedence:
>    Literal values;
>    Unary operators (+, -, !, ~)
>    *, /, %
>    +, -
>    &, |, ^
>    <<, >>
>    == (=), !=, <, >, <=, =>
>    &&, ||
> 
> No assignment or comma operators; assignment is a statement.
> Precedence rules differ here from C.
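
The precedence list above can be sketched as a small precedence-climbing evaluator in C. This is an illustration under assumptions, not the parser from the post: it handles integer literals only, all names (eval, parse_expr, op_level, and so on) are hypothetical, ">=" is accepted alongside the listed "=>", and "&&" / "||" evaluate both sides rather than short-circuiting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char *src;   /* cursor into the expression text */

static void skip_ws(void) { while (*src == ' ') src++; }

/* Binding level of a binary operator (higher binds tighter), 0 if unknown.
   Levels follow the list in the post, tightest first. */
static int op_level(const char *op) {
    static const struct { const char *op; int level; } table[] = {
        {"*", 6}, {"/", 6}, {"%", 6},
        {"+", 5}, {"-", 5},
        {"&", 4}, {"|", 4}, {"^", 4},
        {"<<", 3}, {">>", 3},
        {"==", 2}, {"=", 2}, {"!=", 2}, {"<", 2}, {">", 2},
        {"<=", 2}, {"=>", 2}, {">=", 2},
        {"&&", 1}, {"||", 1},
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].op, op) == 0) return table[i].level;
    return 0;
}

/* Peek the operator at the cursor: try two characters, then one. */
static int peek_op(char *out) {
    skip_ws();
    for (size_t n = 2; n >= 1; n--) {
        if (strlen(src) < n) continue;
        memcpy(out, src, n);
        out[n] = '\0';
        int level = op_level(out);
        if (level) return level;
    }
    return 0;
}

static long apply(const char *op, long a, long b) {
    switch (op[0]) {
    case '*': return a * b;
    case '/': return a / b;   /* no divide-by-zero check in this sketch */
    case '%': return a % b;
    case '+': return a + b;
    case '-': return a - b;
    case '^': return a ^ b;
    case '&': return op[1] ? (a && b) : (a & b);
    case '|': return op[1] ? (a || b) : (a | b);
    case '<': return op[1] == '<' ? a << b : op[1] == '=' ? a <= b : a < b;
    case '>': return op[1] == '>' ? a >> b : op[1] == '=' ? a >= b : a > b;
    case '=': return op[1] == '>' ? a >= b : a == b;
    case '!': return a != b;
    }
    return 0;
}

static long parse_expr(int min_level);

/* Literals, parentheses, and the unary operators +, -, !, ~. */
static long parse_unary(void) {
    skip_ws();
    switch (*src) {
    case '+': src++; return  parse_unary();
    case '-': src++; return -parse_unary();
    case '!': src++; return !parse_unary();
    case '~': src++; return ~parse_unary();
    case '(': {
        src++;
        long v = parse_expr(1);
        skip_ws();
        if (*src == ')') src++;
        return v;
    }
    }
    char *end;
    long v = strtol(src, &end, 0);   /* base 0 accepts decimal and 0x hex */
    src = end;
    return v;
}

/* Consume binary operators whose level is at least min_level. */
static long parse_expr(int min_level) {
    long lhs = parse_unary();
    char op[3];
    int level;
    while ((level = peek_op(op)) >= min_level) {
        src += strlen(op);
        long rhs = parse_expr(level + 1);   /* left-associative */
        lhs = apply(op, lhs, rhs);
    }
    return lhs;
}

long eval(const char *text) { src = text; return parse_expr(1); }
```

With this table, eval("1<<2&3") groups as 1<<(2&3), whereas C would group it as (1<<2)&3, which is one concrete place where the rules differ from C.
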
> 
> Unlike a C style tokenizer, any combination of operator symbols will be 
> parsed as a single operator, regardless of whether or not such an 
> operator exists (this shaves a big chunk of code off the tokenizer logic).
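
A rough C sketch of that tokenizer rule: any run of operator characters becomes one token, with no table of valid operators consulted. The character set and the function names (is_op_char, next_token) are assumptions for illustration, not taken from the interpreter.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Characters that may form part of an operator (an assumed set). */
static int is_op_char(int c) {
    return c != 0 && strchr("+-*/%&|^~!<>=", c) != NULL;
}

static int is_word_char(int c) {
    return isalnum(c) || c == '_';
}

/* Copy the next token from *cursor into buf (capacity cap, at least 2)
   and advance the cursor. Returns the token length, 0 at end of input.
   A token longer than the buffer is split in this sketch. */
size_t next_token(const char **cursor, char *buf, size_t cap) {
    const char *s = *cursor;
    size_t n = 0;
    assert(cap >= 2);
    while (isspace((unsigned char)*s)) s++;
    if (is_op_char((unsigned char)*s)) {
        /* Any run of operator characters is a single token. */
        while (is_op_char((unsigned char)*s) && n + 1 < cap) buf[n++] = *s++;
    } else if (is_word_char((unsigned char)*s)) {
        while (is_word_char((unsigned char)*s) && n + 1 < cap) buf[n++] = *s++;
    } else if (*s) {
        buf[n++] = *s++;   /* punctuation such as ( ) , : */
    }
    buf[n] = '\0';
    *cursor = s;
    return n;
}
```

In this sketch "x<=>y" tokenizes as x, <=>, y even though no "<=>" operator exists; rejecting it is left to a later stage. The flip side is that "y+-1" yields a single "+-" token, so a binary operator followed by a unary one would need whitespace between them.
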
> 
> For now, the language lacks any ability to define proper functions
> in-language, so the only functions that exist are built-in.
> 
> 
> For the first time in a very long time, this interpreter has an "eval" 
> command in the console. Though, one needs to use parentheses to eval an 
> expression as (unlike JS or similar) statements and expressions are 
> different and one may not have an implicit expression in a statement 
> context. For my first major script language (JS based), there was an 
> eval. However, with the design of my later BS2 language, eval was no 
> longer viable.
> 
> 
> Where, there is a split between design choices that make sense for a 
> light-duty script language, and one meant for "serious work" (more 
> features, better performance, etc). Sometimes, one might climb the 
> ladder of the language being better for implementation tasks, while 
> ignoring things that are useful for light-duty scripting tasks (trying 
> to make a language that does both but maybe ultimately does neither task 
> particularly well).
> 
> So, say, the fate that befell my original BGBScript language, was that 
> the VM became increasingly heavyweight (more code, more complex, ...) 
> and less well suited for implementation tasks (as it tried to take on 
> work that might have otherwise been left to C). BGBScript2 had 
> essentially turned into a Java like language, not as good at 
> implementing stuff as "just write everything in C", yet no longer great 
> for scripting either (namely, Java-style code structure is not 
> particularly amenable to interactive use of "eval"; nor is a 
> statically-typed language particularly amenable to "hot patching" live 
> code, etc...).
> 
> Like, when a scripting VM expands to 300 or 500 kLOC, using it for 
> scripting a project is no longer as attractive of an option. A partial 
> fork of this VM still survives though, I just now call it "BGBCC" and am 
> using it mostly as a C compiler for my custom ISA project.
> 
> Though, from what I can see, modern JavaScript seems headed down a 
> similar path.
> 
> A similar issue seems common in many long lived script VM projects. They 
> get faster and more powerful, all while losing the properties that made 
> them useful for their original use cases.
> 
> 
> 
> Granted, the other option is to effectively "roll the clock backwards", 
> and revert a language to a simpler form.
> 
> Judging by the past, could probably do another JavaScript style VM in 
> around 10k LOC or so. Maybe less if the design priority is keeping code 
> small. Besides the block structuring, there are "gotcha" things like 
> break/continue handling that one needs to deal with. Naive AST-walking 
> interpreters don't deal well with non-local control transfers (like 
> break/continue/goto).
> 
> So, say, if the minimum becomes:
>    Parse language to AST;
>    Flatten AST into some sort of linear IR;
>    Interpret linear IR.
> Then this would set a lower limit on the size of the interpreter.
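
The last stage of that pipeline can be sketched as a tiny stack-machine interpreter in C. The opcode set, the Insn layout, and the hand-flattened loop_demo program are all assumptions made for illustration; the point is only that, once the AST is flattened, break/continue/goto reduce to plain jumps.

```c
#include <assert.h>

/* Hypothetical opcode set for a flattened (linear) IR. */
enum { OP_PUSH, OP_LOAD, OP_STORE, OP_ADD, OP_LT, OP_JMP, OP_JZ, OP_HALT };

typedef struct { int op; long arg; } Insn;

#define STACK_MAX 64

/* Run a linear program; vars[] holds numbered variable slots. */
void run(const Insn *code, long *vars) {
    long stack[STACK_MAX];
    int sp = 0;
    for (int pc = 0; ; pc++) {
        const Insn *in = &code[pc];
        switch (in->op) {
        case OP_PUSH:  assert(sp < STACK_MAX); stack[sp++] = in->arg; break;
        case OP_LOAD:  assert(sp < STACK_MAX); stack[sp++] = vars[in->arg]; break;
        case OP_STORE: assert(sp > 0); vars[in->arg] = stack[--sp]; break;
        case OP_ADD:   assert(sp > 1); sp--; stack[sp - 1] += stack[sp]; break;
        case OP_LT:    assert(sp > 1); sp--;
                       stack[sp - 1] = stack[sp - 1] < stack[sp]; break;
        case OP_JMP:   pc = (int)in->arg - 1; break;  /* -1: loop adds 1 */
        case OP_JZ:    assert(sp > 0);
                       if (stack[--sp] == 0) pc = (int)in->arg - 1;
                       break;
        case OP_HALT:  return;
        }
    }
}

/* Flattened form of:  i = 0; while (1) { if (!(i < 5)) break; i = i + 1; }
   The "break" is just a jump to the instruction after the loop. */
static const Insn loop_demo[] = {
    /*  0 */ {OP_PUSH, 0},
    /*  1 */ {OP_STORE, 0},
    /*  2 */ {OP_LOAD, 0},      /* loop head */
    /*  3 */ {OP_PUSH, 5},
    /*  4 */ {OP_LT, 0},
    /*  5 */ {OP_JZ, 11},       /* break: leave the loop */
    /*  6 */ {OP_LOAD, 0},
    /*  7 */ {OP_PUSH, 1},
    /*  8 */ {OP_ADD, 0},
    /*  9 */ {OP_STORE, 0},
    /* 10 */ {OP_JMP, 2},       /* back to the loop head */
    /* 11 */ {OP_HALT, 0},
};
```

An AST walker would have to unwind nested recursive calls to implement the same break, which is the difficulty with non-local control transfers noted above.
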
========== REMAINDER OF ARTICLE TRUNCATED ==========