Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: James Harris <james.harris.1@gmail.com>
Newsgroups: comp.lang.misc
Subject: Escapes (was String-Based Macro Systems)
Date: Fri, 3 May 2024 10:01:57 +0100
Organization: A noiseless patient Spider
Lines: 97
Message-ID: <v12966$euk9$1@dont-email.me>
References: <uvcqn3$2pju0$1@dont-email.me> <pan$5d552$88135a24$abac79c8$4af1fb67@invalid.invalid>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 03 May 2024 11:01:59 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="dd321ca2c5d88d785ae2cfb2ce34d280"; logging-data="490121"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/H2Cjrrov6v9a/oqvAL59Sy4+a/fiFEm4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:zE7P+Db3IVBv75AbThx37itoA6A=
In-Reply-To: <pan$5d552$88135a24$abac79c8$4af1fb67@invalid.invalid>
Content-Language: en-GB
Bytes: 4760

On 13/04/2024 06:09, Blue-Maned_Hawk wrote:
> Lawrence D'Oliveiro wrote:
>
>> The big difference with m4 is that it does away with these special
>> symbols; the mere occurrence of a name matching a defined macro (or an
>> argument of the macro currently being expanded) is sufficient to trigger
>> substitution. Do you think this is a good idea?
>>
>> There are all kinds of pitfalls with such macro systems. The original
>> Macrogenerator could not cope with substitutions containing unpaired “<
>> ... >” quote symbols, and even GNU m4 lacks something as simple as a
>> backslash-style “escape next single character, whatever it is”. While m4
>> lets you switch the quoting symbols, it still insists that they occur in
>> pairs.
>>
>> Would adding such an escape character be useful?
>
> Yes, of course.
>
> Whenever a system has a system to escape symbols, there are two ways to go
> about it: either the symbol is magic by default, and the escape makes it
> normal, or the symbol is normal by default, and the escape makes it magic.
>
> Having both of the systems at once is generally confusing, because it
> makes it difficult to remember which symbols are which. It's more
> practical to have all of them be one or the other.
>
> One could say that having the symbols only become magic upon escapement is
> better, because it clearly indicates when a symbol has magic properties.
> This is analogous to the logic used to defend sigils, a form of
> disambiguation repeatedly found to be pointless because names already do
> that disambiguation. Therefore, the correct choice is magic by default.

Interesting points, though I am not sure how you got to that conclusion
(or what you mean by "the logic used to defend sigils"). In particular,
magic characters are sometimes used in contexts in which there are no
"names" with which to do any disambiguation. For example, the regular
expression to match "parts" and "party" might be

   "part[sy]"

I presume you would take that as magic-by-default, so any occurrence of a
magic symbol that is meant literally needs to be escaped, as in a[b]
appearing as

   "a\[b\]"

Alternatively, if magic symbols were prefixed with ~ then the above two
strings would appear as

   "part~[sy~]"
   "a[b]"

Is that really worse?

>
> One fallacious argument i've heard used to justify magic by default is
> that it means that the treatment of the escape symbol itself is consistent
> with all the other symbols in that it's magic by default unless escaped by
> itself. I consider this fallacious because in a system where magic must
> be explicit, the escape symbol would be the _only_ exception, and it would
> be _impossible_ to make any others—what i'd say is a worthwhile sacrifice.

Indeed.
In C, backslash does double duty

   \n - backslash /gives/ significance to n
   \" - backslash /removes/ the significance of the double quote

That inconsistency does seem odd.

>
> Either way, figuring out the solution to the problem of “Magic: by
> default or by request?” is almost certainly a lower priority than the
> majority of other problems.

It's an important issue nonetheless. And aren't there two contexts, as
follows?

(1) A string which has to be converted by the compiler into binary
encodings, e.g.

   "Hello\nWorld"

(2) A string which /after any conversions/ means something to a function
which processes it, e.g.

   "Hello\:space:world"

where "\:space:" is meant to indicate whitespace to some program which
processes the string and implies that the backslash has to remain in the
encoded string.

-- 
James Harris
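
For anyone who wants to experiment with the two conventions and the two
contexts discussed above, here is a minimal sketch. It uses Python as a
stand-in, since its string literals treat \n and \" much as C does. The
names tilde_to_regex and expand_markers are hypothetical, as is the rule
that "~~" means a literal tilde; none of this comes from m4 or from any
existing tool.

```python
import re


def tilde_to_regex(pattern: str) -> str:
    """Translate a 'magic on request' pattern, in which a symbol is only
    magic when prefixed with ~, into ordinary regex syntax, in which
    symbols are magic by default and literals must be escaped."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "~" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Assumption: "~~" stands for a literal tilde; any other
            # character after ~ is passed through with its magic meaning.
            out.append(re.escape("~") if nxt == "~" else nxt)
            i += 2
        else:
            # Unprefixed characters are literal, so escape them.
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def expand_markers(text: str) -> str:
    # Hypothetical consumer for context (2): it sees the backslash at
    # run time and treats the marker \:space: as a single space.
    return text.replace("\\:space:", " ")


if __name__ == "__main__":
    print(tilde_to_regex("part~[sy~]"))  # part[sy]
    print(tilde_to_regex("a[b]"))        # a\[b\]

    # Context (1): the language itself turns \n into one character.
    print(len("Hello\nWorld"))           # 11

    # Context (2): the backslash must survive into the encoded string,
    # so in this notation it has to be doubled in the source text.
    print(expand_markers("Hello\\:space:world"))  # Hello world
```

Note that context (2) shows the cost of sharing one escape character
between the two layers: the backslash meant for the consuming function
has to be protected from the layer that processes the literal.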