From: Terje Mathisen
Newsgroups: comp.arch
Subject: Re: text in programming languages, Unicode in strings
Date: Mon, 20 May 2024 21:33:15 +0200

Anton Ertl wrote:
> jgd@cix.co.uk (John Dallman) writes:
>> In article <2024May20.145316@mips.complang.tuwien.ac.at>,
>> anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
>>
>>> I am not convinced that the locale-specific input is a good idea,
>>> though.
>>
>> You look pretty silly if your input function can't read the products of
>> your output function, and figuring out what separators have been used
>> automatically is not foolproof.
>
> Yes and yes. Especially given the "," vs. "." roles in various
> locales.
>
> But OTOH, not being able to read or, worse, misinterpreting the output
> produced by someone else just because that output was produced under a
> different locale is pretty silly, too.
>
> For reserved words and builtin names of programming languages, the
> solution has been to make them independent of the locale and ignore
> Algol 60 and Algol 68 for programming, which suggested something else.
>
> We already do the same for the decimal separator in the usual output
> functions (it uses "."), we should introduce thousands separators that
> are also locale-independent.

Yeah, this is one of those misfeatures with no good solution.

Hydro, with operations in 130 countries (factories in 70+ of them), had
lots of issues with programs that insisted on producing output (or
reading input) in whatever their current locale/country specified, vs.
those that would always use US (or even worse: Norwegian) rules.

I've personally written Perl scripts to parse/inspect financial
consolidation reports, figure out the locale rules used and then convert
to the company standard.

Terje

-- 
- "almost all programming can be viewed as an exercise in caching"
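
To make the "figure out the locale rules used" step concrete, here is a minimal sketch in Python. It is not the Perl scripts mentioned above; the function names (vote, detect_decimal_separator, normalize) and the heuristic itself are assumptions made for illustration. The idea: a single number such as "1,234" is ambiguous, but a whole column of numbers from one report usually contains at least one token that gives the convention away, e.g. "1.234,50" or "12,5".

```python
from decimal import Decimal

# Assumption: these characters only ever group digits, never mark decimals.
GROUP_ONLY = " \u00a0\u202f'"


def _strip_group_only(token: str) -> str:
    """Drop spaces, no-break spaces and apostrophes used for digit grouping."""
    for ch in GROUP_ONLY:
        token = token.replace(ch, "")
    return token.strip()


def vote(token: str):
    """Return "." or "," if this token reveals the decimal separator, else None."""
    s = _strip_group_only(token)
    last_dot, last_comma = s.rfind("."), s.rfind(",")
    if last_dot >= 0 and last_comma >= 0:
        # Both present: the one that comes last is the decimal separator.
        return "." if last_dot > last_comma else ","
    if last_dot < 0 and last_comma < 0:
        return None  # plain integer, no evidence
    sep = "." if last_dot >= 0 else ","
    other = "," if sep == "." else "."
    if s.count(sep) > 1:
        return other  # a repeated separator can only be grouping
    digits_after = len(s) - s.rfind(sep) - 1
    # Exactly three trailing digits ("1,234") could be either role.
    return None if digits_after == 3 else sep


def detect_decimal_separator(tokens) -> str:
    """Decide the decimal separator for a whole report column, or raise."""
    votes = {v for v in map(vote, tokens) if v is not None}
    if len(votes) != 1:
        raise ValueError("no evidence" if not votes else "conflicting evidence")
    return votes.pop()


def normalize(token: str, decimal_sep: str) -> Decimal:
    """Convert one token to a locale-independent Decimal."""
    group_sep = "," if decimal_sep == "." else "."
    s = _strip_group_only(token).replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(s)
```

Usage would be to call detect_decimal_separator() once per report column and then normalize() on every token with the result. Note that the sketch refuses to guess: a column containing only values like "1,234" raises ValueError, which is the "not foolproof" case from the quoted text, and it does not check that groups are really three digits wide.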