Path: ...!goblin3!goblin1!goblin.stu.neva.ru!eternal-september.org!feeder.eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: Ruvim <ruvim.pinka@gmail.com>
Newsgroups: comp.lang.forth
Subject: Re: Progressing Matthias Trute's recognizer proposal
Date: Wed, 24 Jun 2020 14:51:56 +0300
Organization: A noiseless patient Spider
Lines: 132
Message-ID: <rcveot$9j3$1@dont-email.me>
References: <ramao9$3og$1@dont-email.me> <rcsns8$nr$1@dont-email.me>
 <rct7vk$58l$1@dont-email.me> <ILadnVcdXuAmgW7DnZ2dnUU78T-dnZ2d@supernews.com>
 <rcv76n$oth$1@dont-email.me> <OO2dncAEzI33uW7DnZ2dnUU78V2dnZ2d@supernews.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 24 Jun 2020 11:51:57 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="3ddf03bd41b3e580802497323c2356f8";
	logging-data="9827"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19qwiwglClYTuYyBeC40w9D"
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101
 Thunderbird/68.9.0
Cancel-Lock: sha1:ArzWC16xRtugohYXPMEKZqklWks=
In-Reply-To: <OO2dncAEzI33uW7DnZ2dnUU78V2dnZ2d@supernews.com>
Content-Language: en-US
Bytes: 5612

On 2020-06-24 13:01, aph@littlepinkcloud.invalid wrote:
> Ruvim <ruvim.pinka@gmail.com> wrote:
>> On 2020-06-24 12:28, aph@littlepinkcloud.invalid wrote:
>>> Alex McDonald <alex@rivadpm.com> wrote:
>>>>    A recognizer that requires the string to be part of SOURCE
>>>> (for instance, if it refers to or modifies >IN) must document this
>>>> requirement, otherwise the string can come from anywhere.
>>>
>>> I take it, then, that there's been no progress on making recognizers
>>> can work with e.g. LOCATE without side effects? I haven't been keeping
>>> up.
>>
>> Actually, any recognizer with side effects can be converted into a
>> recognizer without side effect.
>>
>> So the corresponding permission of side effects isn't necessary.
>>
>> I showed an example of recognizer for string literals:
>> https://git.io/JfpKX
> 
> How does it work? There's no way AFAICS to tell a recognizer that you
> don't want any side effects, you just want to know if that recognizer
> would, given the chance, recognize a word. So how do you call it?


1. Side effects of recognizers are either permitted or prohibited by the 
specification.

If they are prohibited and a system or program provides a recognizer
with side effects, then this system or program is simply non-standard,
and we don't have to consider non-standard programs at all.

So it is the specification that says whether a recognizer may have
side effects or not.

On the other hand, there is no need to permit side effects for
recognizers.


2. A "recognizable?" word was suggested in the Recognizer API v4 comments:
https://vee.gg/gVXmN

In the API that I currently suggest, a "recognizable-by" word can be
implemented as:

   : recognizable-by ( c-addr u xt-recognizer -- flag )
     [: execute dup 0<> throw ;] catch if drop 2drop true then
   ;

This assumes that recognizers are prohibited from having any side
effects and that their results are located on the data and
floating-point stacks, so THROW discards these results.
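A minimal usage sketch follows. The recognizer "recognize-hash" is purely hypothetical and exists only for illustration: it accepts any lexeme that starts with "#" and returns TRUE as a stand-in for a real non-zero token. The definition of "recognizable-by" is repeated so that the snippet loads on its own; it assumes a system that provides quotations ([: ... ;]).

```forth
\ Repeated from the text above so that this snippet is self-contained.
: recognizable-by ( c-addr u xt-recognizer -- flag )
  [: execute dup 0<> throw ;] catch if drop 2drop true then
;

\ Hypothetical recognizer (an assumption, not part of any proposal):
\ accepts a lexeme that begins with "#". On success it leaves the
\ string and TRUE (a stand-in for a real token); otherwise it leaves 0.
: recognize-hash ( c-addr u -- c-addr u true | 0 )
  dup 0= if 2drop 0 exit then          \ empty lexeme: not recognized
  over c@ [char] # = if true else 2drop 0 then
;

s" #42" ' recognize-hash recognizable-by .   \ prints -1 (recognized)
s" foo" ' recognize-hash recognizable-by .   \ prints 0  (not recognized)
```

Whatever the recognizer leaves on the stacks is discarded by THROW in the recognized case, so the caller only ever sees the flag.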




> "NB: POSTPONE is not applicable to a string literal containing
> blanks." is a bit worrying.

Standard POSTPONE is applicable to a Forth word only.

Recognizers *can* extend it to any space-delimited lexeme, and even to 
multiple such lexemes, or to a lexeme delimited by anything else.

But I think we should stop at a space-delimited lexeme and go no further.

For example, we have a construct
   foo{ bar }
that works when compiling.

If we know that recognizers don't have side effects (and don't change or 
read SOURCE), we know that
   POSTPONE foo{
works correctly and appends the compilation semantics for "foo{" to the
current definition.

But if recognizers may do additional parsing, you don't know in advance 
what is correct:
   POSTPONE foo{
or
   POSTPONE foo{ bar }

And what are the compilation semantics for "foo{" in the latter case?


By the Standard, in the glossary entry for POSTPONE
https://forth-standard.org/standard/core/POSTPONE

   Skip leading space delimiters. Parse _name_ delimited by a space.
   Find _name_. Append the compilation semantics of _name_
   to the current definition.
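For comparison, here is a small sketch of how this rule already plays out with a standard parsing word. POSTPONE consumes only the space-delimited name S" and nothing after it; the rest of the string is parsed later, when the appended compilation semantics are performed. The names my-s" and greet are made up for this illustration.

```forth
\ POSTPONE parses just the name  s"  here, not any following text.
: my-s"  ( "ccc<quote>" -- )  postpone s" ; immediate

\ The text  hello"  is parsed only now, while GREET is being compiled
\ and the compilation semantics of  s"  are performed.
: greet ( -- )  my-s" hello" type ;

greet   \ prints hello
```

The unit that POSTPONE operates on is the single name; any further parsing belongs to the semantics of that name, not to POSTPONE itself.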


If the Recognizers word set is provided, this specification can be 
updated to something like the following:

   Skip leading space delimiters. Parse _lexeme_ delimited by a space.
   Recognize _lexeme_. Append the compilation semantics for _lexeme_
   to the current definition.

It should append the compilation semantics for *the same* _lexeme_ that
was parsed.

Let _lexeme_ be "foo{". Then it should append the compilation semantics
for the lexeme "foo{", not for the lexeme "foo{ bar }" or anything else.


The same applies to string literals.
Code like
   POSTPONE "foo bar"
should be incorrect. It should not be an exception to the general rule.


If we want to postpone (compile) fragments of code, the right way is
something like a c{ ... }c construct that works properly for *any* code.

   : postpone-my-fancy-code
     c{ "foo bar" ( c-addr u ) type }c

     c{
         foo{ bar }  [defined] x [if] x [then]
     }c
   ;

   : foo [ c{ "test passed" }c ] type ; foo


https://github.com/ruv/forth-on-forth/blob/master/c-state.readme.txt


--
Ruvim