Article <vhtrb4$1tms9$1@dont-email.me>

Path: ...!weretis.net!feeder9.news.weretis.net!news.quux.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Ed Morton <mortonspam@gmail.com>
Newsgroups: comp.unix.shell,comp.unix.programmer,comp.lang.misc
Subject: Re: Command Languages Versus Programming Languages
Date: Sat, 23 Nov 2024 18:17:41 -0600
Organization: A noiseless patient Spider
Lines: 97
Message-ID: <vhtrb4$1tms9$1@dont-email.me>
References: <uu54la$3su5b$6@dont-email.me> <87edbtz43p.fsf@tudado.org>
 <0d2cnVzOmbD6f4z7nZ2dnZfqnPudnZ2d@brightview.co.uk>
 <uusur7$2hm6p$1@dont-email.me> <vdf096$2c9hb$8@dont-email.me>
 <87a5fdj7f2.fsf@doppelsaurus.mobileactivedefense.com>
 <ve83q2$33dfe$1@dont-email.me> <vgsbrv$sko5$1@dont-email.me>
 <vgtslt$16754$1@dont-email.me> <86frnmmxp7.fsf@red.stonehenge.com>
 <vhk65t$o5i$1@dont-email.me> <vhki79$2pho$1@dont-email.me>
 <vhl0m3$5mu9$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 24 Nov 2024 01:17:44 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8ae094dd2eff3c26a0f410117ac0acbd";
	logging-data="2022281"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19h1gPbb/k9ak2iUuw6IEXM"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:8wZZCU6pfSuVc/gANXhxY2jcFo4=
X-Antivirus: Avast (VPS 241123-4, 11/23/2024), Outbound message
X-Antivirus-Status: Clean
In-Reply-To: <vhl0m3$5mu9$1@dont-email.me>
Content-Language: en-US
Bytes: 5392

On 11/20/2024 9:53 AM, Janis Papanagnou wrote:
> On 20.11.2024 12:46, Ed Morton wrote:
>>
>> Definitely. The most relevant statement about regexps is this:
>>
>>> Some people, when confronted with a problem, think "I know, I'll use
>>> regular expressions." Now they have two problems.
> 
> (Worth a scribbling on a WC wall.)
> 
>>
>> Obviously regexps are very useful and commonplace but if you find you
>> have to use some online site or other tools to help you write/understand
>> one or just generally need more than a couple of minutes to
>> write/understand it then it's time to back off and figure out a better
>> way to write your code for the sake of whoever has to read it 6 months
>> later (and usually for robustness too as it's hard to be sure all rainy
>> day cases are handled correctly in a lengthy and/or complicated regexp).
> 
> Regexps are nothing for newbies.
> 
> The inherent fine thing with Regexps is that you can incrementally
> compose them[*].[**]
> 
> It seems you haven't found a sensible way to work with them?
> (And I'm really astonished about that since I know you worked with
> Regexps for years if not decades.)

I have no problem working with regexps. I just don't write lengthy or
complicated ones, only brief, simple BREs or EREs, and I don't
restrict myself to trying to solve a problem with a single regexp.

> In those cases where Regexps *are* the tool for a specific task -
> I don't expect you to use them where they are inappropriate?! -

Right, I don't, but I see many people using them for tasks that could be 
done more clearly and robustly if not done with a single regexp.

> what would be the better solution[***] then?

It all depends on the problem. For example, if you need to match an 
input string that must contain each of a, b, and c in any order then you 
could do that in awk with this regexp or similar:

     awk '/(a.*(b.*c|c.*b))|(b.*(a.*c|c.*a))|(c.*(a.*b|b.*a))/'

or you could do it with this condition composed of regexp segments:

     awk '/a/ && /b/ && /c/'

I would prefer the second solution as it's more concise and easier to 
enhance (try adding "and d" to both).
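
As a rough sketch of that comparison, the two forms can be run side by
side on some made-up input. The function names and the sample strings
below are only illustrative assumptions, not part of anyone's real code:

```shell
# Hypothetical helper names; the input lines are invented for illustration.

# Single-regexp form: one alternative group per leading letter.
has_abc_single() { awk '/(a.*(b.*c|c.*b))|(b.*(a.*c|c.*a))|(c.*(a.*b|b.*a))/'; }

# Condition built from regexp segments.
has_abc() { awk '/a/ && /b/ && /c/'; }

# Adding "and d" to the segmented form is one more term.
has_abcd() { awk '/a/ && /b/ && /c/ && /d/'; }

printf '%s\n' 'cab' 'abd' 'xbyczad' 'dcba' | has_abc_single
# prints cab, xbyczad and dcba, one per line

printf '%s\n' 'cab' 'abd' 'xbyczad' 'dcba' | has_abc
# prints the same three lines

printf '%s\n' 'cab' 'abd' 'xbyczad' 'dcba' | has_abcd
# prints xbyczad and dcba
```

Doing the same with a single regexp means covering all 24 orderings of
the four letters, which is where the first form stops being practical.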

As another example, someone on StackOverflow recently said they had 
written the following regexp to isolate the last string before a set of 
parens in a line that contains multiple such strings, some of them 
nested, and they said it works in python:

^(?:^[^(]+\([^)]+\) \(([^(]+)\([^)]+\)\))|[^(]+\(([^(]+)\([^)]+\),\s([^\(]+)\([^)]+\)\s\([^\)]+\)\)|(?:(?:.*?)\((.*?)\(.*?\)\))|(?:[^(]+\(([^)]+)\))$

I personally wouldn't consider anything remotely as lengthy or
complicated as that regexp, despite their assurances that it works. I'd
use this any-awk script or similar instead:

{
     rec = $0
     while ( match(rec, /\([^()]*)/) ) {
         tgt = substr($0,RSTART+1,RLENGTH-2)
         rec = substr(rec,1,RSTART-1) RS substr(rec,RSTART+1,RLENGTH-2) RS substr(rec,RSTART+RLENGTH)
     }
     gsub(/ *\([^()]*) */, "", tgt)
     print tgt
}

It's a bit more code but, unlike that regexp, anyone assigned to 
maintain this code in future can tell what it does with just a little 
thought (and maybe adding a debugging print in the loop if they aren't 
very familiar with awk), can then be sure it does what is required and 
nothing else, and could easily maintain/enhance it if necessary.
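
For anyone who wants to try it, here is a minimal sketch of the same
script wrapped in a shell function with an optional debugging print in
the loop. The function name, the debug argument, and the sample input
lines are assumptions made up for illustration. The closing paren in
the two regexps is written escaped here, which matches the same text,
and writing to "/dev/stderr" assumes an awk or system that provides it:

```shell
# Hypothetical wrapper: pass 1 as the first argument to turn on tracing.
last_before_parens() {
    awk -v debug="${1:-0}" '
    {
        rec = $0
        # Repeatedly find the leftmost paren pair that has no parens inside.
        while ( match(rec, /\([^()]*\)/) ) {
            # Take the text between those parens from the untouched record.
            tgt = substr($0, RSTART+1, RLENGTH-2)
            if ( debug ) {
                printf "match at %d: [%s]\n", RSTART, tgt > "/dev/stderr"
            }
            # Swap both parens for RS so offsets stay aligned with $0
            # and the enclosing pair becomes matchable next time round.
            rec = substr(rec,1,RSTART-1) RS substr(rec,RSTART+1,RLENGTH-2) RS substr(rec,RSTART+RLENGTH)
        }
        # Drop any nested paren groups left inside the final target.
        gsub(/ *\([^()]*\) */, "", tgt)
        print tgt
    }'
}

printf '%s\n' 'foo (bar (baz))' | last_before_parens 1
# stderr shows the matches at 10 ([baz]) and 5 ([bar (baz)]); stdout prints: bar
```

Replacing the parens with RS rather than deleting them is what lets the
positions found in rec be reused directly against $0.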

     Ed.

> 
> Janis
> 
> [*] Like the corresponding FSMs.
> 
> [**] And you can also decompose them if they are merged in a huge
> expression, too large for you to grasp it. (BTW, I'm doing such
> decompositions also with other expressions in program code that
> are too bulky.)
> 
> [***] Can you answer the question that another poster failed to do?
>