<v8itr3$2soau$1@dont-email.me>


Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Ed Morton <mortonspam@gmail.com>
Newsgroups: comp.unix.shell
Subject: Re: Globbing versus regular expressions
Date: Fri, 2 Aug 2024 10:26:23 -0500
Organization: A noiseless patient Spider
Lines: 58
Message-ID: <v8itr3$2soau$1@dont-email.me>
References: <87wmlf2pq9.fsf@axel-reichert.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 02 Aug 2024 17:26:28 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="e300bf01ee3e759629451260d2adad72";
	logging-data="3039582"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/g43o+ZPW3v962//QiEbgy"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:AUHrTlurLWrNs7pyYkLfa4Bq/MM=
X-Antivirus: Avast (VPS 240802-2, 8/2/2024), Outbound message
Content-Language: en-US
In-Reply-To: <87wmlf2pq9.fsf@axel-reichert.de>
X-Antivirus-Status: Clean
Bytes: 3649

On 7/21/2024 2:01 AM, Axel Reichert wrote:
> Hi all,
> 
> a colleague (new to command line wizardry) seemed puzzled by the
> existence of both globbing for file names (shell) and regular
> expressions for strings (many other command line tools).
> 
> Since I am familiar with both mechanisms for decades, I never thought
> about this "redundancy", but now I think he has a point, even more so if
> you are using the "dired" file manager in Emacs, which further blurs the
> distinction between mangling text and working on files.
> 
> Since regexes are (at quick glance) a superset of globs, why not
> consistently use the former for both file names and strings? The
> few additional keystrokes (.* instead of *) are IMHO easily compensated
> for by the more powerful capabilities of regexes.
> 
> A little reading on Wikipedia showed that both came into popular usage
> in the early 70s. So why was globbing not dropped and regexes used
> throughout? It seems that ksh93 supports "regex globbing". bash has
> "extended globbing", but this seems a clumsy, bolted-on solution. Are
> there shells out there which follow a regex-only approach (or this would
> be non-POSIX)?
> 
> Happy for any further insights (technical or historical) shed on this
> topic!
> 
> Axel

Bear in mind that a shell is an environment from which to manipulate 
(create/destroy) files and processes and sequence calls to other tools, 
not a tool to manipulate text, while regular expressions are part of the 
functionality for manipulating text in tools designed to do that. So 
while globbing (as used by shell pattern matching against file names) 
and regexps (as used by text processing tools for pattern matching 
against text strings) appear similar in functionality, they have 
different applications.

As a simple example, the `.` regexp metacharacter matches any character, 
but it's very common in shell to have file names that start with a `.` 
(e.g. `.profile`) or contain a `.` (e.g. `stuff.txt`) so it's more 
convenient for `.` to be literal in shell pattern matching (and have `?` 
mean "any character" instead) than it is in text pattern matching.
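
To make that concrete, here is a rough sketch of how `.` and `?` behave in 
a shell pattern compared with a regexp. The scratch directory and the three 
file names are invented purely for illustration, and it assumes a POSIX sh 
with `mktemp` and `grep` available:

```shell
# Sketch only: the scratch directory and file names are invented for illustration.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
touch a.txt abtxt axtxt

# Shell patterns: `.` is literal, `?` matches any single character.
set -- a.t*; glob_dot=$#      # expands to a.txt only
set -- a?t*; glob_qmark=$#    # expands to a.txt, abtxt and axtxt

# Regexps: `.` matches any single character, `\.` is a literal dot.
re_dot=$(printf '%s\n' * | grep -c '^a.t')        # counts all three names
re_escaped=$(printf '%s\n' * | grep -c '^a\.t')   # counts a.txt only

printf '%s\n' "glob a.t*: $glob_dot   glob a?t*: $glob_qmark"
printf '%s\n' "regexp ^a.t: $re_dot   regexp ^a\.t: $re_escaped"
```

The unescaped `.` in the glob selects one file, while the unescaped `.` in 
the regexp selects all three, which is exactly the kind of surprise you 
don't want when naming files on a command line.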

It'd be annoying and more error-prone if every time you wanted to list 
files that start with `s` and end in `.txt` you had to write:

     ls ^s.*\.txt$

instead of just:

     ls s*.txt
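
Since no common shell accepts the regexp form directly, the closest you can 
get today is to list the names and filter them as text. The following is 
only a sketch under that assumption, with made-up file names in a scratch 
directory, and it ignores file names containing newlines:

```shell
# Sketch only: scratch directory and file names are made up for illustration.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
touch s1.txt stuff.txt s.txt.bak other.txt sxtxt

# Glob: the shell expands the pattern before ls ever runs.
by_glob=$(ls s*.txt)

# Regexp: list every name, then filter the list as text with grep.
by_regexp=$(ls | grep '^s.*\.txt$')

printf 'glob:\n%s\nregexp:\n%s\n' "$by_glob" "$by_regexp"
```

Both select the same two names here, but the regexp version needs anchors, 
an escaped dot, quoting to protect it from the shell, and an extra process.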

So, it's just horses-for-courses - globbing patterns make filename 
matching simplest most of the time while regexps make text matching 
simplest most of the time.

     Ed.