From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: portable or not? Volatile strings
Date: Wed, 14 Aug 2024 07:03:52 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Aug14.090352@mips.complang.tuwien.ac.at>
References: <nnd$3d18fe02$76aace5d@9eae9618ab09b239> <v9cfp9$364en$1@dont-email.me> <nnd$05375ad3$6e018fac@ef3f66902c87c893> <30e35d436c018c00a16f763d0c59ac34@www.novabbs.com>

mhx@iae.nl (mhx) writes:
>Having a built-in string type denoted
>by quotes might be a good thing. It is hard to find a argument
>against that when numbers are already a comparable exception that
>nobody has found a way to get rid of.

Mitch Bradley introduced parsing words for numbers, which would make it
possible to get rid of the number recognizers.
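
To illustrate the parsing-word idea, here is a minimal sketch of such a
word in standard Forth.  The name H# follows Open Firmware usage, but
the definition is my own illustration, not Mitch Bradley's
implementation; it is STATE-smart and does no error checking, so
unconvertible characters are silently ignored:

\ Sketch only: parse the next word as a hex number, ignoring BASE.
: h# ( "hexnum" -- u )  \ when compiling, compiles u as a literal
  base @ >r hex                   \ save BASE, switch to hex
  parse-name 0 0 2swap >number    \ convert: ud c-addr2 u2
  2drop drop                      \ keep only the low cell of ud
  r> base !                       \ restore BASE
  state @ if postpone literal then ; immediate

With BASE set to decimal, "h# ff ." prints 255, both interpreted and
inside a colon definition.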

But yes, the mainstream is moving in the opposite direction: introducing
recognizers not just for numbers but also for strings, along with a
mechanism for user-defined recognizers.

>And maybe we can then even
>tick strings and numbers like everything else?

I explored this idea in a EuroForth 2016 paper, in particular in
Section 4.1.

@InProceedings{ertl-recognizers16,
  author =       {M. Anton Ertl},
  title =        {Recognizers: Arguments and Design Decisions},
  crossref =     {euroforth16},
  pages =        {58--63},
  url =          {https://www.complang.tuwien.ac.at/papers/ertl-recognizers16.pdf},
  video =	 {https://wiki.forth-ev.de/lib/exe/fetch.php/events:recognizers.mp4},
  OPTnote =      {not refereed},
  abstract =     {The Forth text interpreter processes words and
                  numbers.  Currently the set of words can be extended
                  by programmers, but not the recognized numbers.
                  User-defined recognizers allow to extend the
                  number-recognizer part, too.  This paper shows the
                  benefits of recognizers and discusses
                  counterarguments.  It also discusses several design
                  decisions: Whether to define temporary words, or a
                  set of interpretation, compilation, and postponing
                  actions; and whether to hook the recognizers inside
                  \code{find} or in the text interpreter.}
}

@Proceedings{euroforth16,
  title = 	 {32nd EuroForth Conference},
  booktitle = 	 {32nd EuroForth Conference},
  year = 	 {2016},
  key =		 {EuroForth'16},
  url =          {https://www.complang.tuwien.ac.at/anton/euroforth/ef16/papers/proceedings.pdf}
}

The discussion has moved on since then, and getting the xt of a
recognized thing is not something that anyone (including me) found
worth the cost, so AFAIK nobody has implemented the temporary-word
approach.

However, if we adopt Gerry Jackson's attitude and make every transient
region permanent, creating a new permanent word (in a separate
section) for every parsed number, string, etc. is fine, and ticking
that word is fine, too.  For most programs, the space taken by the
recognized words is proportional to the size of the source code, which
is acceptable on desktops with GBs of RAM.  However, programs that use
EVALUATE a lot will need more recognized-word storage.  A contrived
example is:

: foo 1000000000 0 ?do s" 123" evaluate drop loop ; foo

You might imagine optimizing this by using only one definition for all
these occurrences of "123", so here's another example where that
optimization would not work:

: bar 1000000000 0 ?do i 0 <# #s #> evaluate drop loop ; bar

Of course, if you make all transient regions permanent, the <# #s #>
alone already consumes a lot of memory in BAR, and the recognized words
just increase this by a constant factor.

While I have some sympathy for the idea of permanent "transient"
data (e.g., in Gforth the interpretation semantics of S" produce a
string that lives until the end of the session), I don't think that
this would find consensus in the standards committee.

And if you really want an execution token for a string or number,
writing

[: s" foo" ;] [: 123 ;]

inside a colon definition (or the same with :NONAME ... ; outside a
colon definition) seems easy enough.
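
As a sketch of the :NONAME variant outside a colon definition (the names
STR-XT and NUM-XT are made up for this example):

\ Each :NONAME leaves an xt on the stack; CONSTANT gives it a name.
:noname s" foo" ; constant str-xt   \ xt of a word that pushes c-addr u
:noname 123 ;     constant num-xt   \ xt of a word that pushes 123

str-xt execute type   \ prints foo
num-xt execute .      \ prints 123

Because S" is compiled here, the string lives in the definition, so
there is no transient buffer to worry about.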

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
   EuroForth 2024: https://euro.theforth.net