Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "Craig A. Berry" <craigberry@nospam.mac.com>
Newsgroups: comp.os.vms
Subject: Re: C and C++, promotion, stabilization, migrationFor embedded
Date: Fri, 23 Aug 2024 20:10:44 -0500
Organization: A noiseless patient Spider
Lines: 38
Message-ID: <vabbuk$13sj5$1@dont-email.me>
References: <v9kske$uqhh$2@dont-email.me> <va04hl$2viks$2@dont-email.me>
 <va08j7$30gmu$1@dont-email.me> <va22kr$3ce14$2@dont-email.me>
 <vaaa1j$v7rp$1@dont-email.me> <vab69p$1356i$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 24 Aug 2024 03:10:45 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="f44277d56155ccb1467d09cf89364eaa";
	logging-data="1176165"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18oCA8oITmYlvqOX78QmRQC0IsYEyFBydg="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:/zhI2F1lk6PlTYzcnYMyFvkdCkE=
In-Reply-To: <vab69p$1356i$1@dont-email.me>
Content-Language: en-US
Bytes: 2924


On 8/23/24 6:34 PM, Arne Vajhøj wrote:
> On 8/23/2024 11:32 AM, Stephen Hoffman wrote:
>> OpenVMS has ~no concept of languages, either. Yeah, the C and C++ I18N 
>> giblets, Java and its own little world, maybe using the existing and 
>> older ICU or maybe you ported a newer ICU, and the deprecated Terminal 
>> Fallback Facility (TFF) and National (Replacement)  Character Set 
>> (NCS) giblets, sure. All of which make things more interesting for 
>> apps that want or need to deal with the UTF-8 and post-ASCII world.
> 
> Regarding UTF-8 support, my take is that:
> 
> UTF-8 in file names, in usernames, in logicals, in identifiers and in
> programs/scripts: not really needed.
> 
> UTF-8 in file content and in databases: very much needed.
> 
> And support for the latter falls into 3 groups:
> * JVM languages (Java, Groovy etc.) and I believe Python - do
>    support Unicode and can read/write using any encoding including UTF-8

Perl has excellent Unicode support.  It even comes with piconv, an iconv
application implemented in Perl that can do conversions.

I'm told Unicode support in Python 2.x was pretty shaky but Python 3 is
a lot better.
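
For what it's worth, a minimal Python 3 sketch of that kind of
conversion looks something like the following. The transcode() helper
name is made up for illustration; it only uses the built-in
bytes.decode() and str.encode() methods, and it assumes the caller
already knows the source encoding.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the src encoding and re-encode them as dst.

    Hypothetical helper for illustration only. Raises UnicodeDecodeError
    or UnicodeEncodeError if the data does not fit either encoding.
    """
    text = data.decode(src)   # bytes -> str (Unicode code points)
    return text.encode(dst)   # str -> bytes in the target encoding


# "Vajhøj" as ISO-8859-1 bytes: the ø is the single byte 0xF8.
latin1_bytes = b"Vajh\xf8j"
utf8_bytes = transcode(latin1_bytes, "iso-8859-1", "utf-8")
print(utf8_bytes)  # the ø becomes the two bytes 0xC3 0xB8
```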

> * C, C++, PHP - developer keeps track of what encoding a byte
>    sequence is in, but it is possible to explicitly convert encodings
>    (C/C++ has wchar_t but it is neither much used nor UTF-8 friendly
>    AFAIK)
> * the traditional native languages - very little support except what
>    can be done by calling C functions

and what can be done calling C functions is limited by the data in
SYS$I18N_ICONV, which is about 20 years out of date.  There have been a
dozen major releases of the Unicode standard in that time.
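
As a rough analogy for why the age of the conversion data matters, here
is a Python 3 sketch (nothing VMS-specific, and can_encode() is a
made-up helper) showing that a converter can only handle characters its
tables know about. The euro sign has no mapping in ISO-8859-1 but does
in the later ISO-8859-15.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text has a mapping in charset.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


euro = "\u20ac"  # EURO SIGN

for charset in ("iso-8859-1", "iso-8859-15", "utf-8"):
    print(charset, can_encode(euro, charset))
```

The same idea applies to any table-driven converter: if the tables
predate a character, the conversion either fails or substitutes
something else.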