<vpcidf$3unkp$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.lang.c
Subject: Re: Simple string conversion from UCS2 to ISO8859-1
Date: Sat, 22 Feb 2025 14:11:11 +0100
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <vpcidf$3unkp$1@dont-email.me>
References: <vp9oml$3a0k5$1@dont-email.me>
 <7bf2c66d1f1ef9e92c00f44320bb998f3cea2183@i2pn2.org>
 <vp9sb4$3a0k4$5@dont-email.me> <vp9tnr$3dca2$1@dont-email.me>
 <87frk7m6h5.fsf@nosuchdomain.example.com> <vpav4f$3jdl6$1@dont-email.me>
 <vpccfn$3to51$1@dont-email.me> <vpceto$3uccs$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 22 Feb 2025 14:11:14 +0100 (CET)
Injection-Info: dont-email.me; posting-host="c7fa0a28977b5f488f5523ebf65c845d";
	logging-data="4152985"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+1vAO1MOh2oaZNXcTDYbZUtZhaMC4Tsk8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:fQPIpPyTctT5I6oD/sRvF2HlyKQ=
In-Reply-To: <vpceto$3uccs$1@dont-email.me>
Content-Language: en-GB
Bytes: 3150

On 22/02/2025 13:11, Janis Papanagnou wrote:
> On 22.02.2025 12:29, David Brown wrote:
>>
>> As the OP explained in a reply to one of my posts, he is getting data in
>> UCS-2 format from SMS messages from a modem.  [...]
> 
> (Yes. I wrote: "have got clear after a subsequent post".)
> 
>>
>> Whether Latin-1 or Latin-9 is better will depend on his application.
> 
> (Was also my stance upthread; "If that is possible for your context")
> 
>> The
>> additional characters in Latin-9, with the exception of the Euro symbol,
>> are pretty obscure
> 
> ISTR they are some language specific symbols, so probably less obscure
> to someone from those countries.
> 

The point (as I said below) is that adding these letters (š, ž, œ) makes
very little difference to anyone, because they are not enough to let
people write their language properly.  Sure, someone writing Czech might
have regular use of the letter ž - but with Latin-9 they can't write the
letters ť, ř, ď or several other Czech letters.  So it provides little
benefit to most people who have those letters in their alphabet.

If you want to let people write their languages properly (something I
strongly support), you need much fuller Unicode support.  Unless you are
working specifically with Sami, Finnish or Estonian, the only benefit of
moving from Latin-1 to Latin-9 is the Euro symbol.
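
To make the difference concrete, here is a minimal sketch of both
conversions.  It assumes the UCS-2 data has already been unpacked into
host-order uint16_t code units, and that unrepresentable characters are
simply replaced with '?'.  The function names are made up for
illustration; they are not from the OP's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one UCS-2 code unit to ISO 8859-1 (Latin-1).
   Returns the byte value, or -1 if the character is not representable. */
static int ucs2_to_latin1(uint16_t u)
{
    return u <= 0xFF ? (int)u : -1;
}

/* Map one UCS-2 code unit to ISO 8859-15 (Latin-9).
   Latin-9 differs from Latin-1 in exactly eight positions. */
static int ucs2_to_latin9(uint16_t u)
{
    switch (u) {
    case 0x20AC: return 0xA4;   /* Euro sign */
    case 0x0160: return 0xA6;   /* S with caron */
    case 0x0161: return 0xA8;   /* s with caron */
    case 0x017D: return 0xB4;   /* Z with caron */
    case 0x017E: return 0xB8;   /* z with caron */
    case 0x0152: return 0xBC;   /* OE ligature */
    case 0x0153: return 0xBD;   /* oe ligature */
    case 0x0178: return 0xBE;   /* Y with diaeresis */
    /* The Latin-1 characters displaced from those eight positions
       have no Latin-9 encoding. */
    case 0x00A4: case 0x00A6: case 0x00A8: case 0x00B4:
    case 0x00B8: case 0x00BC: case 0x00BD: case 0x00BE:
        return -1;
    default:
        return u <= 0xFF ? (int)u : -1;
    }
}

/* Convert n code units from src into n bytes in dst using the given
   mapping.  Unrepresentable characters become '?'.
   Returns the number of characters that had to be replaced. */
size_t ucs2_convert(const uint16_t *src, size_t n, unsigned char *dst,
                    int (*map)(uint16_t))
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        int b = map(src[i]);
        if (b < 0) {
            b = '?';
            replaced++;
        }
        dst[i] = (unsigned char)b;
    }
    return replaced;
}
```

Feeding the Czech word "žluťoučký" through this loses three letters
(ž, ť, č) with the Latin-1 mapping and still loses two (ť, č) with the
Latin-9 mapping.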


>> - it's unlikely that you'd need them and not need a
>> good deal more other characters (i.e., supporting much more of Unicode).
> 
>> As for why not use UTF-8, the answer is clearly simplicity.
> 
> This was not my point (someone else suggested that). 

<snip>

> You should address that to the other poster. :-)
> 

I was making a single reply that covered both parts - I know you didn't 
write the bits you quoted from further up-thread.