Path: ...!weretis.net!feeder8.news.weretis.net!news.szaf.org!inka.de!mips.inka.de!.POSTED.localhost!not-for-mail
From: Christian Weisgerber <naddy@mips.inka.de>
Newsgroups: comp.unix.shell
Subject: Re: Sorting problem with Unix sort(1) with UTF-8 punctuation characters - locale issue
Date: Wed, 19 Feb 2025 20:22:45 -0000 (UTC)
Message-ID: <slrnvrcfcl.3e0.naddy@lorvorc.mips.inka.de>
References: <vp4f6o$288ui$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 19 Feb 2025 20:22:45 -0000 (UTC)
Injection-Info: lorvorc.mips.inka.de; posting-host="localhost:::1"; logging-data="3521"; mail-complaints-to="usenet@mips.inka.de"
User-Agent: slrn/1.0.3 (FreeBSD)
Bytes: 2317
Lines: 32

On 2025-02-19, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> If anything, I'd expected LC_COLLATE to have an effect on sorting.
> Then there's no locale with @isodate on that sort-defunct system.
> And clearing that LC_TIME locale or removing the "@isodate" part
> did not change anything; it needs that setting to a non-existing
> locale file to work correctly on the otherwise not correctly
> sorting system.

My working hypothesis would be that setting LC_TIME to a nonexistent
locale causes an error that invalidates the _whole_ locale setting
and causes a fallback to a default setting, likely the "C" locale.

You can check that sorting with LC_ALL=C or an invalid value like
LC_ALL=foobar will produce your "correct" result.

A corollary from this would be that your "sort-defunct" system uses
a different collation order than your "correctly" sorting system
for the de_DE.UTF-8 locale.

On the FreeBSD 14-STABLE system I'm typing this on, sorting your
example data with my typical C.UTF-8 locale produces your expected
result, sorting with de_DE.UTF-8 (or en_US.UTF-8) produces a different
order.
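
The check suggested above could be sketched roughly as follows. This
is only an illustration under a few assumptions: a POSIX shell with
mktemp available, "foobar" not being an installed locale, and the three
quoted sample lines standing in for the real data. The temporary file
and the variable names are made up for the example.

```shell
# Test data: the three sample lines quoted in this thread
# (assumption: they are representative of the real input).
data=$(mktemp)
cat > "$data" <<'EOF'
>····**·······**················< abc1
>···········**······**··········< efg2
>·**·························**·< hij3
EOF

# Byte-wise collation in the "C" locale.
c_sorted=$(env LC_ALL=C sort "$data")

# "foobar" is assumed NOT to exist as a locale on this system.
# Any diagnostics about the unknown locale are discarded.
bad_sorted=$(env LC_ALL=foobar sort "$data" 2>/dev/null)

# If the hypothesis holds, an invalid locale behaves like "C".
if [ "$c_sorted" = "$bad_sorted" ]; then
    echo "invalid locale sorts like C"
else
    echo "invalid locale sorts differently from C"
fi

# For comparison: the order under a named locale. If de_DE.UTF-8 is
# not installed, this will most likely match the C result as well.
env LC_ALL=de_DE.UTF-8 sort "$data"

rm -f "$data"
```

If the first message is printed on both systems but the de_DE.UTF-8
output differs between them, that would be consistent with the two
systems shipping different collation data for that locale.
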
> >····**·······**················< abc1
> >···········**······**··········< efg2
> >·**·························**·< hij3

Also, I have no idea what could be considered the "correct" sorting
order for this.

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de