Path: news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.lang.c
Subject: Re: Simple string conversion from UCS2 to ISO8859-1
Date: Sat, 22 Feb 2025 14:18:03 +0100
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <vpciqb$3unkp$3@dont-email.me>
References: <vp9oml$3a0k5$1@dont-email.me>
 <87bjuvm68v.fsf@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 22 Feb 2025 14:18:04 +0100 (CET)
Injection-Info: dont-email.me; posting-host="c7fa0a28977b5f488f5523ebf65c845d";
	logging-data="4152985"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19FKpHgn3vPgBK/f8CjR4AkJBJWrKV5Ddg="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:1J42ilAbqGKV/awpRtEZBEAuhlo=
In-Reply-To: <87bjuvm68v.fsf@nosuchdomain.example.com>
Content-Language: en-GB

On 21/02/2025 20:45, Keith Thompson wrote:
> pozz <pozzugno@gmail.com> writes:
>> I want to write a simple function that converts UCS2 string into ISO8859-1:
>>
>> void ucs2_to_iso8859p1(char *ucs2, size_t size);
>>
>> ucs2 is a string of type "00480065006C006C006F" for "Hello". I'm
>> passing size because ucs2 isn't null terminated.
> 
> Is the UCS-2 really represented as a sequence of ASCII hex digits?
> 
> In actual UCS-2, each character is 2 bytes.  The representation for
> "Hello" would be 10 bytes, either "\0H\0e\0l\0l\0o" or
> "H\0e\0l\0l\0o\0", depending on endianness.  (UCS-2 is a subset of
> UTF-16; the latter uses longer sequences to represent characters
> outside the Basic Multilingual Plane.)
> 

My understanding here is that the OP is getting the UCS-2 encoded string 
in from a modem, almost certainly on a serial line.  The UCS-2 encoded 
data is itself a binary sequence of 16-bit code units, and the modem 
firmware is sending each of those as four ASCII hex digits.  This is a very common way 
to handle transmission of binary data in such systems - there is no need 
for escapes or other complications to delimit the binary data.  I would 
expect that the entire incoming message will be comma-separated fields 
with the time and date, sender's telephone number, and so on, as well as 
the text itself as this long hex string.
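
If that reading is right, the conversion is just "take four hex digits, 
build one 16-bit code unit, keep it if it fits in a byte".  A minimal 
sketch under those assumptions follows.  The names ucs2hex_to_latin1 and 
hex_value, the separate output buffer, and the choice of '?' for code 
units above 0xFF are mine for illustration, not part of the OP's 
interface, and hex_value assumes an ASCII-compatible execution character 
set.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one.
 * Assumes 'A'..'F' and 'a'..'f' are contiguous (true for ASCII). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decode `size` hex digits (four per UCS-2 code unit, most significant
 * digit first, as in "0048" for 'H') into ISO 8859-1 bytes in `out`.
 * Code units above 0xFF have no ISO 8859-1 equivalent and become '?'.
 * `out` is null terminated on success.
 * Returns the number of characters written (excluding the terminator),
 * or (size_t)-1 if the input is malformed or `out` is too small.
 */
size_t ucs2hex_to_latin1(const char *hex, size_t size,
                         char *out, size_t out_size)
{
    if (size % 4 != 0 || out_size < size / 4 + 1)
        return (size_t)-1;

    size_t n = 0;
    for (size_t i = 0; i < size; i += 4) {
        unsigned unit = 0;
        for (size_t j = 0; j < 4; j++) {
            int v = hex_value(hex[i + j]);
            if (v < 0)
                return (size_t)-1;
            unit = (unit << 4) | (unsigned)v;
        }
        /* ISO 8859-1 is exactly the first 256 Unicode code points. */
        out[n++] = (unit <= 0xFF) ? (char)(unsigned char)unit : '?';
    }
    out[n] = '\0';
    return n;
}
```

Called with the OP's "00480065006C006C006F" and size 20, this writes 
"Hello" and returns 5.  Whether '?' is the right substitute for 
unrepresentable characters depends on the application.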