Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!news.swapon.de!news.in-chemnitz.de!news2.arglkargh.de!news.karotte.org!news.space.net!news.muc.de!.POSTED.news.muc.de!not-for-mail
From: Alan Mackenzie <acm@muc.de>
Newsgroups: comp.theory
Subject: Re: neos Universal Compiler
Date: Thu, 20 Mar 2025 11:16:35 -0000 (UTC)
Organization: muc.de e.V.
Message-ID: <vrgtej$a85$1@news.muc.de>
References: <UX6BP.512735$Kb9a.408485@fx16.ams4> <vr3ir8$39jtg$1@dont-email.me> <3KgBP.513160$Kb9a.94584@fx16.ams4> <vr6948$1jnlo$1@dont-email.me> <KWEBP.424367$NN2a.82900@fx15.ams4> <vr8pi1$3r8lv$1@dont-email.me> <NrYBP.191725$C8m7.108955@fx11.ams4> <vrbu8h$2km30$1@dont-email.me> <S7fCP.743699$nb1.627959@fx01.ams4> <vre64b$lcu3$1@dont-email.me> <vre88p$7kb$1@news.muc.de> <vreh3f$v5ih$1@dont-email.me>
Injection-Date: Thu, 20 Mar 2025 11:16:35 -0000 (UTC)
Injection-Info: news.muc.de; posting-host="news.muc.de:2001:608:1000::2";
logging-data="10501"; mail-complaints-to="news-admin@muc.de"
User-Agent: tin/2.6.4-20241224 ("Helmsdale") (FreeBSD/14.2-RELEASE-p1 (amd64))
Mikko <mikko.levanto@iki.fi> wrote:
> On 2025-03-19 11:02:49 +0000, Alan Mackenzie said:
>> Mikko <mikko.levanto@iki.fi> wrote:
>>> On 2025-03-18 14:08:50 +0000, Mr Flibble said:
>>>> On Tue, 18 Mar 2025 15:59:45 +0200, Mikko wrote:
>> [ .... ]
>>>>> Is there a neosBNF schema that describes the tokens of FORTRAN 66 or
>>>>> Algol 60?
>>>> Not yet.
>>> The definition of string literal of Algol 60 would be a good example
>>> of something that cannot be defined with a regular expression and is
>>> therefore impossible or at least complicated with an ordinary tokenizer.
>> Would you please be more specific about just what in an Algol 60 string
>> literal prevents a regexp from parsing it. Not for any special reason,
>> just that I'm curious. Maybe an example of such a string would be
>> interesting. Thanks!
> Algol 60 has different characters for opening and closing quotes (something
> like U+2018 and U+2019 in Unicode) ....
Most current languages, including C, have different openers and closers
for comments, which is surely analogous.
> .... and allows any number of nested quotes.
Ah OK. Regular expressions can't parse arbitrarily nested structures.
But Backus-Naur Form can express them, and a push-down automaton can
process them.
Are you sure about ordinary tokenizers not being able to handle such
arbitrarily nested things in a non-complicated way?
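
To make the question concrete, here is a minimal sketch in Python of the
kind of hand-written scanner I have in mind. It assumes, as you suggest,
that U+2018 and U+2019 stand in for Algol 60's string quotes; the name
scan_string is made up for illustration and comes from no real tokenizer.
Since there is only one kind of bracket, a depth counter does the job of
the push-down automaton's stack.

```python
# Assumption: U+2018 / U+2019 stand in for Algol 60's opening and
# closing string quotes.
OPEN, CLOSE = "\u2018", "\u2019"

def scan_string(text, start=0):
    """Return the index just past the string literal beginning at `start`.

    Nested quotes are handled with a depth counter, which is all the
    "stack" needed when there is a single opener/closer pair.
    """
    if text[start:start + 1] != OPEN:
        raise ValueError("no opening quote at start position")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == OPEN:
            depth += 1          # one level deeper
        elif ch == CLOSE:
            depth -= 1          # one level back out
            if depth == 0:      # the outermost quote has closed
                return i + 1
    raise ValueError("unterminated string literal")
```

Called on a source line that starts with a nested literal, scan_string
returns the position where the rest of the line begins, so the caller
can slice off the token and carry on with ordinary tokenizing.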
> --
> Mikko
--
Alan Mackenzie (Nuremberg, Germany).