Path: ...!eternal-september.org!feeder3.eternal-september.org!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.unix.shell
Subject: Re: Sorting problem with Unix sort(1) with UTF-8 punctuation characters - locale issue
Date: Wed, 19 Feb 2025 22:35:43 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vp5mbv$sao$2@reader2.panix.com>
References: <vp4f6o$288ui$1@dont-email.me>
Injection-Date: Wed, 19 Feb 2025 22:35:43 -0000 (UTC)
Injection-Info: reader2.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="29016"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 2299
Lines: 43

In article <vp4f6o$288ui$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>I've been sorting punctuation characters on one Unix system and it
>did not produce the expected result. Switching to another system did
>it as expected.
>
>The test program (it contains non-ASCII middle-dot characters) was
>
>sort -t $'\t' <<EOT

Do you really have the '$' there?

- Dan C.

>One hypothesis was that it's some locale issue. So I've copied the
>LC_* settings to the newer system and disabled them one by one.
>Strangely, the one that was responsible for the effect was LC_TIME!
>
>On the correct sorting system it was defined as
> LC_TIME=de_DE.UTF-8@isodate
>and the one that worked improperly had
> LC_TIME=de_DE.UTF-8
>
>Now I'm puzzled in many ways...
>If anything, I'd expected LC_COLLATE to have an effect on sorting.
>Then there's no locale with @isodate on that sort-defunct system.
>And clearing that LC_TIME locale or removing the "@isodate" part
>did not change anything; it needs that setting to a non-existing
>locale file to work correctly on the otherwise not correctly
>sorting system.
>
>Does anyone have an idea what's going on here?
>
>I'm reluctant to globally set LC_TIME=de_DE.UTF-8@isodate
>(since there is no file with that name in the locale directories).
>
>Thanks.
>
>Janis
>
>[*] Lines with additional other contents than the depicted payload
>were sorted correctly.