Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!news.swapon.de!news.in-chemnitz.de!news2.arglkargh.de!news.karotte.org!news.space.net!news.muc.de!.POSTED.news.muc.de!not-for-mail
From: Alan Mackenzie <acm@muc.de>
Newsgroups: comp.theory
Subject: Re: neos Universal Compiler
Date: Thu, 20 Mar 2025 11:16:35 -0000 (UTC)
Organization: muc.de e.V.
Message-ID: <vrgtej$a85$1@news.muc.de>
References: <UX6BP.512735$Kb9a.408485@fx16.ams4> <vr3ir8$39jtg$1@dont-email.me> <3KgBP.513160$Kb9a.94584@fx16.ams4> <vr6948$1jnlo$1@dont-email.me> <KWEBP.424367$NN2a.82900@fx15.ams4> <vr8pi1$3r8lv$1@dont-email.me> <NrYBP.191725$C8m7.108955@fx11.ams4> <vrbu8h$2km30$1@dont-email.me> <S7fCP.743699$nb1.627959@fx01.ams4> <vre64b$lcu3$1@dont-email.me> <vre88p$7kb$1@news.muc.de> <vreh3f$v5ih$1@dont-email.me>
Injection-Date: Thu, 20 Mar 2025 11:16:35 -0000 (UTC)
Injection-Info: news.muc.de; posting-host="news.muc.de:2001:608:1000::2";
	logging-data="10501"; mail-complaints-to="news-admin@muc.de"
User-Agent: tin/2.6.4-20241224 ("Helmsdale") (FreeBSD/14.2-RELEASE-p1 (amd64))

Mikko <mikko.levanto@iki.fi> wrote:
> On 2025-03-19 11:02:49 +0000, Alan Mackenzie said:

>> Mikko <mikko.levanto@iki.fi> wrote:
>>> On 2025-03-18 14:08:50 +0000, Mr Flibble said:

>>>> On Tue, 18 Mar 2025 15:59:45 +0200, Mikko wrote:

>> [ .... ]

>>>>> Is there a neosBNF schema that describes the tokens of FORTRAN 66 or
>>>>> Algol 60?

>>>> Not yet.

>>> The definition of string literal of Algol 60 would be a good example
>>> of something that cannot be defined with a regular expression and is
>>> therefore impossible or at least complicated with an ordinary tokenizer.

>> Would you please be more specific about just what in an Algol 60 string
>> literal prevents a regexp from parsing it.  Not for any special reason,
>> just that I'm curious.  Maybe an example of such a string would be
>> interesting.  Thanks!

> Algol 60 has different characters for opening and closing quotes (something
> like U+2018 and U+2019 in Unicode) ....

Most current languages, including C, have different openers and closers
for comments, which is surely analogous.

> .... and allows any number of nested quotes.

Ah OK.  Regular expressions can't parse arbitrarily nested structures.
But Backus-Naur Form can express them, and a push-down automaton can
process them.

Are you sure about ordinary tokenizers not being able to handle such
arbitrarily nested things in a non-complicated way?
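
To illustrate the question, here is a minimal sketch of how a hand-written
scanner might cope with nested quotes by keeping a depth counter.  It is an
assumption-laden toy, not anything taken from neos or from a real Algol 60
implementation: the name scan_string is hypothetical, and U+2018/U+2019
merely stand in for Algol 60's opening and closing string quotes.

```python
# Assumed stand-ins for Algol 60's distinct opening and closing string quotes.
OPEN, CLOSE = "\u2018", "\u2019"


def scan_string(text, start=0):
    """Return the index just past the string literal opening at text[start].

    Nesting is tracked with a single integer, which is all the "stack"
    this particular construct needs.  Raises ValueError if text[start] is
    not an opening quote or if the literal is never closed.
    """
    if start >= len(text) or text[start] != OPEN:
        raise ValueError("no opening quote at position %d" % start)
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == OPEN:
            depth += 1          # one level deeper
        elif ch == CLOSE:
            depth -= 1          # one level back out
            if depth == 0:      # the outermost quote has just closed
                return i + 1
    raise ValueError("unterminated string literal")


if __name__ == "__main__":
    src = "\u2018outer \u2018inner\u2019 text\u2019 := x"
    end = scan_string(src)
    print(src[:end])            # the complete literal, nested quotes included
```

The counter does the job of the push-down automaton's stack here, since
there is only one kind of bracket to match.  Whether a tokenizer with such
a hook still counts as "ordinary" is, of course, the point under discussion.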

> -- 
> Mikko

-- 
Alan Mackenzie (Nuremberg, Germany).