Path: ...!weretis.net!feeder8.news.weretis.net!news.szaf.org!inka.de!mips.inka.de!.POSTED.localhost!not-for-mail
From: Christian Weisgerber <naddy@mips.inka.de>
Newsgroups: comp.unix.shell
Subject: Re: Sorting problem with Unix sort(1) with UTF-8 punctuation
 characters - locale issue
Date: Wed, 19 Feb 2025 20:22:45 -0000 (UTC)
Message-ID: <slrnvrcfcl.3e0.naddy@lorvorc.mips.inka.de>
References: <vp4f6o$288ui$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 19 Feb 2025 20:22:45 -0000 (UTC)
Injection-Info: lorvorc.mips.inka.de; posting-host="localhost:::1";
	logging-data="3521"; mail-complaints-to="usenet@mips.inka.de"
User-Agent: slrn/1.0.3 (FreeBSD)
Bytes: 2317
Lines: 32

On 2025-02-19, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> If anything, I'd expected LC_COLLATE to have an effect on sorting.
> Then there's no locale with @isodate on that sort-defunct system.
> And clearing that LC_TIME locale or removing the "@isodate" part
> did not change anything; it needs that setting to a non-existing
> locale file to work correctly on the otherwise not correctly
> sorting system.

My working hypothesis would be that setting LC_TIME to a nonexistent
locale causes an error that invalidates the _whole_ locale setting
and causes a fallback to a default setting, likely the "C" locale.
You can check that sorting with LC_ALL=C or an invalid value like
LC_ALL=foobar will produce your "correct" result.
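
A minimal sketch of that check, assuming a POSIX shell: sort the same
input once with LC_ALL=C and once with a bogus locale name, then compare
the two results. The sample lines and variable names are made up for
illustration; substitute your real data. I go through env(1) so that
only sort sees the bogus value and the shell itself doesn't complain.

```shell
# Hypothetical sample input; replace with the real data.
input=$(printf '%s\n' 'b' 'A' 'a' 'B')

# Sort once in the "C" locale and once with a locale name that
# (presumably) does not exist on the system.
c_sorted=$(printf '%s\n' "$input" | env LC_ALL=C sort)
bad_sorted=$(printf '%s\n' "$input" | env LC_ALL=foobar sort)

if [ "$c_sorted" = "$bad_sorted" ]; then
    echo "same order: the invalid locale sorts like C"
else
    echo "different order: no fallback to C here"
fi
```

If both runs agree, that is consistent with the fallback hypothesis,
though it doesn't prove which default the system actually fell back to.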

A corollary of this would be that your "sort-defunct" system uses
a different collation order for the de_DE.UTF-8 locale than your
"correctly" sorting system does.

On the FreeBSD 14-STABLE system I'm typing this on, sorting your
example data with my typical C.UTF-8 locale produces your expected
result, whereas sorting with de_DE.UTF-8 (or en_US.UTF-8) produces a
different order.
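
To compare this across systems, something along these lines might help.
It is only a sketch: the input is a made-up stand-in for your data, and
the locale names are the ones mentioned in this thread, which may or may
not be installed on a given machine.

```shell
# Hypothetical sample input; replace with the real data.
input=$(printf '%s\n' 'b' 'A' 'a' 'B')

# Print the sort order under each locale of interest.
for loc in C C.UTF-8 de_DE.UTF-8 en_US.UTF-8; do
    echo "== $loc"
    printf '%s\n' "$input" | env LC_ALL="$loc" sort
done
```

Check "locale -a" first: a locale that isn't installed would not give
you its own order here and would presumably just sort like "C".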

> >····**·······**················<	abc1
> >···········**······**··········<	efg2
> >·**·························**·<	hij3

Also, I have no idea what could be considered the "correct" sorting
order for this.

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de