Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: text in programming languages, Unicode in strings
Date: Sat, 25 May 2024 16:17:02 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 82
Message-ID: <2024May25.181702@mips.complang.tuwien.ac.at>
References: <v0s17o$2okf4$2@dont-email.me> <v2anov$11l1$2@gal.iecc.com>
 <2024May19.175249@mips.complang.tuwien.ac.at> <v2df6i$3ghp4$1@dont-email.me>
 <v2dju2$11ed$1@gal.iecc.com> <9a6583437121418f0b8446fd6d979461@www.novabbs.org>
 <v2e85u$3l2k7$1@dont-email.me> <T4K2O.67485$29Ia.13673@fx13.iad>
 <v2fri6$2fe8$1@dont-email.me> <2024May20.182336@mips.complang.tuwien.ac.at>
 <v2g7ou$50do$1@dont-email.me> <2024May21.175126@mips.complang.tuwien.ac.at>
 <v2imur$n3i5$1@dont-email.me>
Injection-Date: Sat, 25 May 2024 18:53:27 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="e60003ca0f70adde93fb938bfd23851e";
 logging-data="3119156"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1/vi7xSiLmr2seUhcoZo5eh"
Cancel-Lock: sha1:PLffU//hUNvExOeHG2yN/iP+WJI=
X-newsreader: xrn 10.11
Bytes: 4716

"Stephen Fuld" <SFuld@alumni.cmu.edu.invalid> writes:
>Anton Ertl wrote:
>
>> "Stephen Fuld" <SFuld@alumni.cmu.edu.invalid> writes:
[...]
>> So the question is what locale the posters in this newsgroup use.  I
>> typically use the C.utf8 locale, because then ls sorts directories as
>> Thompson intended, but it does not show thousands separators.
>> If I want to show thousands separators, I usually do it with
>>
>> LC_NUMERIC=prog <command>
>>
>> or (on machines where I have not installed the prog locale):
>>
>> LC_NUMERIC=en_US <command>
>>
>> Let's see how that works for some programs:
>>
>> [c8:~:105615] LC_NUMERIC=prog perf stat true
>>
>>  Performance counter stats for 'true':
>>
>>               0.17 msec task-clock                #    0.376 CPUs utilized
>>                  0      context-switches          #    0.000 K/sec
>>                  0      cpu-migrations            #    0.000 K/sec
>>                 42      page-faults               #    0.242 M/sec
>>            470_561      cycles                    #    2.716 GHz
>>              5_214      stalled-cycles-frontend   #    1.11% frontend cycles idle
>>             28_375      stalled-cycles-backend    #    6.03% backend cycles idle
>>            515_987      instructions              #    1.10  insn per cycle
>>                                                   #    0.05  stalled cycles per insn
>>            103_096      branches                  #  595.157 M/sec
>>              4_973      branch-misses             #    4.82% of all branches
>>
>>        0.000460708 seconds time elapsed
>>
>>        0.000522000 seconds user
>>        0.000000000 seconds sys
>
>
>I would be happier with that (using an underscore for the thousands
>separator) than with no separation.

The underscore is due to my "prog" locale
<https://www.complang.tuwien.ac.at/anton/locale-prog/>.  If you use
LC_NUMERIC=en_US, you get "," as thousands separator; if you use,
e.g., LC_NUMERIC=de_AT, you get ".".  If you use LC_NUMERIC=C, you get
nothing.  All assuming you have these locales installed.

>> > I vaguely remember that in COBOL (which was defined before
>> > locales were a thing), if you specified "Decimal is Comma" (I may
>> > have the syntax wrong), then the decimal separator became the
>> > comma.
>>
>> In the source code?
>
>
>I had to look this up, as it has been far too long, but yes.
>
>https://www.ibm.com/docs/en/cobol-zos/6.3?topic=section-decimal-point-is-comma-clause

Interesting.  Can be seen as another unneeded feature that later
programming languages did not include.

>No, if you wanted to use this, you added the "Decimal point is comma"
>statement in the configuration section.
>Note that this is obsolete, as COBOL now supports some version of
>locales.

But that's a different feature: decimal-point-is-comma is for the
source code, while the locale is for the input and output of the
resulting program.  If you compile a C program with LC_NUMERIC=de_AT,
the decimal separator in the C code is still ".", not what comes from
the locale.  But if you then run a printf or scanf with the "'" in the
conversion specifier (and you are on a Unix system), you get the
output according to the locale, and the input is scanned according to
the locale.

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
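
Editorial sketch, not part of the thread: the printf behaviour described
in the last paragraph of the post can be tried with a few lines of C.
The helper names use_environment_numeric_locale, format_grouped_long and
format_grouped_double are hypothetical, chosen only for this
illustration.  Two assumptions worth stating: a C program starts in the
"C" locale and only picks up LC_NUMERIC from the environment after
calling setlocale(), and the "'" flag is a POSIX extension rather than
ISO C, so it may be ignored or rejected on non-Unix systems.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: adopt the numeric locale named in the environment
   (LC_ALL, LC_NUMERIC, LANG).  Returns 1 on success, 0 if that locale is
   not available, in which case the current locale stays in effect. */
int use_environment_numeric_locale(void)
{
    return setlocale(LC_NUMERIC, "") != NULL;
}

/* Hypothetical helper: format n into buf, asking for digit grouping with
   the "'" flag (POSIX extension).  The grouping character comes from the
   current LC_NUMERIC locale; in the "C" locale there is none. */
int format_grouped_long(char *buf, size_t size, long n)
{
    return snprintf(buf, size, "%'ld", n);
}

/* Hypothetical helper: same idea for a floating-point value.  The
   decimal separator printed here follows the run-time locale, whereas
   a literal such as 1234567.891 in the source always uses ".". */
int format_grouped_double(char *buf, size_t size, double x)
{
    return snprintf(buf, size, "%'.3f", x);
}
```

To try it, call use_environment_numeric_locale() at the start of main(),
print the buffers filled by the two formatting helpers, and run the
program under different settings such as LC_NUMERIC=C, LC_NUMERIC=en_US
or LC_NUMERIC=de_AT, assuming those locales are installed.  The source
text of the program is identical in all runs; only the output changes.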