From: Ruvim <ruvim.pinka@gmail.com>
Newsgroups: comp.lang.forth
Subject: Recognizers and postponing (was: Progressing Matthias Trute's
 recognizer proposal)
Date: Mon, 29 Jun 2020 15:33:41 +0300
Organization: A noiseless patient Spider
Lines: 181
Message-ID: <rdcn35$sd2$1@dont-email.me>
References: <ramao9$3og$1@dont-email.me> <rcsns8$nr$1@dont-email.me>
 <rct7vk$58l$1@dont-email.me> <ILadnVcdXuAmgW7DnZ2dnUU78T-dnZ2d@supernews.com>
 <rcv76n$oth$1@dont-email.me> <OO2dncAEzI33uW7DnZ2dnUU78V2dnZ2d@supernews.com>
 <rcveot$9j3$1@dont-email.me> <2020Jun28.173040@mips.complang.tuwien.ac.at>
In-Reply-To: <2020Jun28.173040@mips.complang.tuwien.ac.at>

On 2020-06-28 18:30, Anton Ertl wrote:
> Ruvim <ruvim.pinka@gmail.com> writes:
>> By the Standard, in the glossary entry for POSTPONE
>> https://forth-standard.org/standard/core/POSTPONE
>>
>>    Skip leading space delimiters. Parse _name_ delimited by a space.
>>    Find _name_. Append the compilation semantics of _name_
>>    to the current definition.
>>
>>
>> If the Recognizers word set is provided, this specification can be
>> updated to something like the following:
>>
>>    Skip leading space delimiters. Parse _lexeme_ delimited by a space.
>>    Recognize _lexeme_. Append the compilation semantics for _lexeme_
>>    to the current definition.
> 
> Sounds ok (apart from the bikeshedded terminology).
> 
>> It should append compilation semantics for *the same* _lexeme_ that was
>> parsed.
>>
>> Let _lexeme_ be "foo{".   So it should append compilation semantics for
>> lexeme "foo{".   Not for lexeme "foo{ bar }" or anything else.
>>
>>
>> The same for string literals.
>> A code like
>>    POSTPONE "foo bar"
>> should be incorrect. It should not be an exception to the general rule.
> 
> That does not make sense.  Your "recognize _lexeme_" invokes the
> string recognizer, which parses ' bar"'.

Definitions of the terms matter - https://git.io/JfhaI
So it does make sense.


to *recognize* a /lexeme/: to determine the interpretation semantics and 
the compilation semantics for the /lexeme/ in the current dynamic context.

*recognizer*: a Forth definition that tries to recognize a lexeme, 
producing a fully qualified token on success.


Determining the semantics for a given lexeme cannot involve parsing the 
input buffer. Determining semantics can't have any side effects in 
principle.

If some recognizer violates this specification, that "recognizer" is 
simply outside the scope of the specification.
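
As an illustration of what "no parsing, no side effects" means in practice, here is a minimal sketch of the core of a number recognizer. It is only an assumption about how such a word could look: the name lexeme>unsigned? is hypothetical, and the sketch does not follow the API of any particular recognizer proposal. It looks only at the string it is given (and at BASE) and never touches the input buffer or >IN.

```forth
\ Hypothetical helper: try to read the lexeme as an unsigned
\ single-cell number in the current BASE.
\ Depends only on its arguments; does not parse the input buffer.
: lexeme>unsigned? ( c-addr u -- x true | false )
  dup 0= if 2drop false exit then     \ an empty lexeme is not a number
  0 0 2swap >number                   ( ud c-addr' u' )
  nip 0= if                           \ was every character consumed?
    drop true                         \ yes: keep the low cell, report success
  else
    2drop false                       \ no: discard the partial result
  then
;
```

Because the result is determined by the lexeme alone, calling the word twice on the same string gives the same answer, and POSTPONE could use it without consuming more than one lexeme.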


If you want to add further parsing, the proper way is a separate step 
after "parse lexeme delimited by a space". But the problem of postponing 
can also be solved in another way.





> It's also more useful.
> 
> : foo  POSTPONE "foo bar" POSTPONE type ; immediate
> : bar foo ;
> bar \ prints "foo bar"

Actually, it's more useful to have a proper way to compile fragments of 
code.

Phrases like

   POSTPONE foo POSTPONE bar POSTPONE baz

are a pain.
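
To make the complaint concrete, here is a small sketch in standard Forth. The words square, and area are hypothetical names chosen for illustration; every word of the postponed fragment needs its own POSTPONE.

```forth
\ A macro that compiles the fragment "dup *" into the current definition.
: square,  ( compilation: -- ; run-time: n -- n*n )
  postpone dup  postpone *
; immediate

: area  ( n -- n*n )  square, ;

7 area .    \ prints 49
```

With a fragment-quoting construct the body of square, would simply contain the fragment itself, written once, between the opening and closing brackets.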

Also, if you need to compile a single string literal or a number, you 
don't need POSTPONE at all.

   "foo bar" slit,
or
   123 lit,

are enough.
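
For readers whose system lacks these two words, a minimal sketch follows. It assumes that lit, and slit, can be defined on top of the standard LITERAL and SLITERAL (the latter is from the String word set); systems that already provide them, such as Gforth, keep their own definitions. The example uses S" in interpretation state (File word set) instead of the "foo bar" recognizer syntax, so that it does not depend on a string recognizer. The words answer and greeting are hypothetical names.

```forth
\ Fallback definitions, used only if the system does not already
\ provide these names (they are common, but not standard words).
[undefined] lit,  [if] : lit,  ( x -- )        postpone literal  ; [then]
[undefined] slit, [if] : slit, ( c-addr u -- ) postpone sliteral ; [then]

\ Compile a number and a string literal without POSTPONE-ing a lexeme.
: answer   ( -- n )         [ 42 lit, ] ;
: greeting ( -- c-addr u )  [ s" foo bar" slit, ] ;

answer .          \ prints 42
greeting type     \ prints foo bar
```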





> certainly works in Gforth.
> It also makes more sense to those who know the string recognizer.

But it makes far less sense when you don't know whether a recognizer 
does parsing or not. I have provided several examples of a recognizer 
for string literals that doesn't do parsing. I use a "recognizer" for 
the "p{ ... }p" construct that does parsing (and even compilation). Some 
"recognizer" may even parse when interpreting and avoid parsing when 
compiling, or vice versa (and I have used one like that too).

NB: any tool will be used in every unexpected way that is possible.
So we should carefully restrict the permitted ways, to guarantee that 
all the various parts can work together.


And it's a very good practical rule that POSTPONE always extracts 
exactly one space-delimited lexeme from the input buffer and nothing more.

Let POSTPONE be a last resort for backward compatibility. Let's not 
touch it.



>> If we want to postpone (compile) fragments of code, the right way is
>> something like c{ ... }c construct that properly works for *any* code.
>>
>>    : postpone-my-fancy-code
>>      c{ "foo bar" ( c-addr u ) type }c
>>
>>      c{
>>          foo{ bar }  [defined] x [if] x [then]
>>      }c
>>    ;
>>
>>    : foo [ c{ "test passed" }c ] type ; foo
> 
> Gforth has ]] ... [[, and
> 
> : foo ]] "foo bar" type [[ ; immediate
> : bar foo ;
> bar
> 
> works exactly like the version using POSTPONE above.
> 
> It would take extra code to make it work differently.

Yes. But it is worth it.


The following implementation of the "]] ... [[" construct covers most 
practical cases of your implementation and does not use the "postpone 
action" from a token descriptor at all! Do we really need this action?


   : xt, compile, ;

   : compile-compile-token? ( k*x td -- true | k*x td false )
     td-nt over = if drop name>compile swap lit, xt, true exit then
     token>xt? if lit, 'xt, xt, true exit then
     postpone [: compile-token postpone ;] 'xt, xt, true
     \ There is a possibility to return false
     \ for some implementation defined cases
   ;
   : next-lexeme ( -- c-addr u|0 )
     begin parse-name ?ET ( addr ) refill ?E0 drop again
   ;
   : ]]
     begin next-lexeme 2dup "[[" equals 0= while
       recognize-any dup ?nf compile-compile-token?
       0= -32 and throw
     repeat 2drop
   ;


Where

   compile-token ( i*x k*x td -- j*x )
   \ perform the compilation semantics that are determined
   \ by the given fully qualified token ( k*x td )

   recognize-any ( c-addr u -- k*x td | 0 )
   \ recognize a lexeme using the recognizer
   \ that the Forth text interpreter currently uses

========== REMAINDER OF ARTICLE TRUNCATED ==========