Article <v1ns43$2260p$1@dont-email.me>

Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Sat, 11 May 2024 15:33:55 +0200
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <v1ns43$2260p$1@dont-email.me>
References: <v0s17o$2okf4$2@dont-email.me>
 <4e0557bec2acda4df76f1ed01ebcbdf6@www.novabbs.org>
 <v1ep4i$1ptf$1@gal.iecc.com> <v1eruj$3o1r8$1@dont-email.me>
 <v1h8l6$1ttd$1@gal.iecc.com> <v1kifk$17qh0$1@dont-email.me>
 <2024May10.182047@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 11 May 2024 15:33:56 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="e065d27d964d081eeac047b5b066e87e";
	logging-data="2168857"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+AagMqc+30UBLk1vbo/tO97uurik8wk3I="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Bz7mF3YYluh0V9m2awgfm6vIXK0=
In-Reply-To: <2024May10.182047@mips.complang.tuwien.ac.at>
Content-Language: en-GB
Bytes: 2566

On 10/05/2024 18:20, Anton Ertl wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> UTF-32 is fine for internal use, however - using whatever endianness
>> your processor prefers.  The trick is never to let it leave the one
>> computer in any encoding other than UTF-8.
> 
> An unnecessary complication.
> 
> 1) I only came up with the following use cases where you need to deal
> with individual non-ASCII characters: Palindrome checkers and anagram
> programs; I remember somebody mentioning a third use (which I promptly
> forgot), but anyway, there are few cases.
> 
> 2) But even for those few cases, UTF-32 is not good enough, because a
> code point is not a character.
> 

I agree that it is usually unnecessary to convert to UTF-32.  I am
merely saying that /if/ you feel you want to expand the code points,
then UTF-32 is fine for the purpose.  You should not have to worry
about endianness, because the UTF-32 data should never leave your
computer, so native endianness is all you need.
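
As a rough illustration of that boundary rule, here is a minimal Python
sketch.  It assumes a platform where one of the array typecodes "I" or
"L" is 32 bits wide; the names NATIVE_UTF32, utf8_to_utf32 and
utf32_to_utf8 are made up for this example.  UTF-8 comes in, the code
points live in a native-endian 32-bit array while in memory, and only
UTF-8 goes back out.

```python
import sys
from array import array

# Codec matching this machine's byte order; the -le/-be variants write no BOM.
NATIVE_UTF32 = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"

# Assumption: at least one of these typecodes is exactly 32 bits wide here.
TYPECODE = next(c for c in "IL" if array(c).itemsize == 4)


def utf8_to_utf32(data: bytes) -> array:
    """Decode UTF-8 from outside into native-endian 32-bit code points."""
    text = data.decode("utf-8")
    units = array(TYPECODE)
    units.frombytes(text.encode(NATIVE_UTF32))
    return units


def utf32_to_utf8(units: array) -> bytes:
    """Re-encode as UTF-8 before anything leaves the machine."""
    return units.tobytes().decode(NATIVE_UTF32).encode("utf-8")


if __name__ == "__main__":
    wire = "na\u00efve \u20ac".encode("utf-8")
    units = utf8_to_utf32(wire)
    print([hex(u) for u in units])       # one entry per code point
    print(utf32_to_utf8(units) == wire)  # round trip back to the same UTF-8
```

Because the array never leaves the process, no BOM and no byte-order
negotiation are needed; the only encoding anything else sees is UTF-8.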

People sometimes say they want to expand to code points so that they
can see the length of the string in characters, index characters, or
splice and join strings more easily.  I don't think these are
particularly useful in practice, but UTF-32 is fine for those who want it.
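
A small Python sketch of what expanding to code points does and does not
give you.  Python's str is already indexed by code point, so it stands
in for a UTF-32 array here; the describe helper is invented for this
example.  It also shows the point quoted above that a code point is not
a character: the same visible text can have different code point counts.

```python
import unicodedata

# "café" written with 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "cafe\u0301"
# The same text with the precomposed U+00E9 instead.
composed = unicodedata.normalize("NFC", decomposed)


def describe(s: str) -> dict:
    """Report sizes in UTF-8 bytes and in code points, plus the last code point."""
    return {
        "utf8_bytes": len(s.encode("utf-8")),
        "code_points": len(s),
        "last": hex(ord(s[-1])),
    }


print(describe(decomposed))  # {'utf8_bytes': 6, 'code_points': 5, 'last': '0x301'}
print(describe(composed))    # {'utf8_bytes': 5, 'code_points': 4, 'last': '0xe9'}
```

Indexing and length by code point are constant-time and simple, but the
last "character" of the decomposed form is a bare combining accent, so
code point indexing alone does not give user-visible characters.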