Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.unix.shell,comp.unix.programmer,comp.lang.misc
Subject: Re: Command Languages Versus Programming Languages
Date: Fri, 22 Nov 2024 17:17:46 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vhqebq$c71$1@reader2.panix.com>
References: <uu54la$3su5b$6@dont-email.me> <874j40sk01.fsf@doppelsaurus.mobileactivedefense.com> <vhq11q$nq7$1@reader2.panix.com> <877c8vtgx6.fsf@doppelsaurus.mobileactivedefense.com>
Injection-Date: Fri, 22 Nov 2024 17:17:46 -0000 (UTC)
Injection-Info: reader2.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="12513"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 4796
Lines: 117

In article <877c8vtgx6.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:
>cross@spitfire.i.gajendra.net (Dan Cross) writes:
>> Rainer Weikusat <rweikusat@talktalk.net> wrote:
>>>cross@spitfire.i.gajendra.net (Dan Cross) writes:
>>>> [snip]
>>>> It's also not exactly right. `[0-9]+` would match one or more
>>>> characters; this possibly matches 0 (ie, if `p` pointed to
>>>> something that wasn't a digit).
>>>
>>>The regex won't match any digits if there aren't any. In this case, the
>>>match will fail. I didn't include the code for handling that because it
>>>seemed pretty pointless for the example.
>>
>> That's rather the point though, isn't it?
>> The program snippet
>> (modulo the promotion to signed int via the "usual arithmetic
>> conversions" before the subtraction and comparison giving you
>> unexpected values; nothing to do with whether `char` is signed
>> or not) is a snippet that advances a pointer while it points to
>> a digit, starting at the current pointer position; that is, it
>> just increments a pointer over a run of digits.
>
>That's the core part of matching something equivalent to the regex [0-9]+
>and the only part of it which is at least remotely interesting.

Not really, no. The interesting thing in this case appears to
be knowing whether or not the match succeeded, but you omitted
that part.

>> But that's not the same as a regex matcher, which has a semantic
>> notion of success or failure. I could run your snippet against
>> a string such as, say, "ZZZZZZ" and it would "succeed" just as
>> it would against an empty string or a string of one or more
>> digits.
>
>Why do you believe that p being equivalent to the starting position
>would be considered a "successful match", considering that this
>obviously doesn't make any sense?

Because absent any surrounding context, there's no indication
that the source is even saved. You'll note that I did mention
that as a means to differentiate later on, but that's not the
snippet you posted.

>[...]
>
>> By the way, something that _would_ match `^[0-9]+$` might be:
>
>[too much code]
>
>Something which would match [0-9]+ in its first argument (if any) would
>be:
>
>#include "string.h"
>#include "stdlib.h"
>
>int main(int argc, char **argv)
>{
>    char *p;
>    unsigned c;
>
>    p = argv[1];
>    if (!p) exit(1);
>    while (c = *p, c && c - '0' > 10) ++p;
>    if (!c) exit(1);
>    return 0;
>}
>
>but that's 14 lines of text, 13 of which have absolutely no relation to
>the problem of recognizing a digit.

This is wrong in many ways. Did you actually test that program?

First of all, why `"string.h"` and not `<string.h>`?
Ok, that's not technically an error, but it's certainly
unconventional, and raises questions that are ultimately a
distraction.

Second, suppose that `argc==0` (yes, this can happen under
POSIX).

Third, the loop: why `> 10`? Don't you mean `< 10`? You are
trying to match digits, not non-digits.

Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
at the end, but `!c` there means you've reached the end of the
string; which should be success.

Fifth and finally, you `return 0;` which is EXIT_SUCCESS, in the
failure case.

Compare:

#include <regex.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
        regex_t reprog;
        int ret;

        if (argc != 2) {
                fprintf(stderr, "Usage: regexp pattern\n");
                return(EXIT_FAILURE);
        }
        (void)regcomp(&reprog, "^[0-9]+$", REG_EXTENDED | REG_NOSUB);
        ret = regexec(&reprog, argv[1], 0, NULL, 0);
        regfree(&reprog);

        return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

This is only marginally longer, but is correct.

        - Dan C.
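
For comparison, the pointer-advancing loop discussed above can be
given the notion of success or failure it lacks by saving the start
position and checking where the scan stopped. What follows is only a
minimal sketch under assumptions: the function name `all_digits` is
hypothetical and does not appear anywhere in the thread, and it
targets the anchored pattern `^[0-9]+$` over plain ASCII digits.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from the thread: returns true only if s is
 * a non-empty string made up entirely of ASCII digits, roughly the
 * hand-written counterpart of the anchored pattern ^[0-9]+$.
 */
bool
all_digits(const char *s)
{
        const char *p;

        if (s == NULL)
                return false;
        p = s;
        /* Advance over the run of digits starting at s. */
        while (*p >= '0' && *p <= '9')
                p++;
        /*
         * Success needs both conditions: the pointer moved (at least
         * one digit was seen) and it stopped on the terminator
         * (nothing but digits in the string).
         */
        return p != s && *p == '\0';
}
```

With this, "ZZZZZZ" and the empty string are both rejected because
the pointer never moves, and a string such as "12a" is rejected
because the scan stops short of the terminator.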