Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: python text, Byte Addressability And Beyond
Date: Thu, 30 May 2024 12:47:35 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 18
Message-ID: <2024May30.144735@mips.complang.tuwien.ac.at>
References: <v0s17o$2okf4$2@dont-email.me> <2024May10.182047@mips.complang.tuwien.ac.at> <v1ns43$2260p$1@dont-email.me> <2024May11.173149@mips.complang.tuwien.ac.at> <v1ossl$1ps0$1@gal.iecc.com> <2024May12.074045@mips.complang.tuwien.ac.at> <v30mgo$3min8$3@dont-email.me> <2024May27.082033@mips.complang.tuwien.ac.at> <v31d6h$3u595$2@dont-email.me> <2024May29.102003@mips.complang.tuwien.ac.at> <v38pn8$1gsj2$9@dont-email.me>
Injection-Date: Thu, 30 May 2024 14:50:09 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="46db1f935b2b8b941d470a80861909b8";
	logging-data="1792926"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/G74R1S9ZX4TY4tWNhVhd6"
Cancel-Lock: sha1:hNMNkuQjxRWwWyTFtTbXjK1IHxE=
X-newsreader: xrn 10.11
Bytes: 2053

Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>On Wed, 29 May 2024 08:20:03 GMT, Anton Ertl wrote:
>
>> In UTF-32 a character is a sequence of (32-bit) code units.
>> In UTF-8  a character is a sequence of  (8-bit) code units.
>
>The point being, there is a 1:1 correspondence between the two 
>representations of the same characters/code points. So your claim that use 
>of one is somehow a “mistake” while the other is not, is spurious.

If the data you are working on is provided in files containing UTF-8,
converting it to UTF-32 provides no benefit; it is an unnecessary
complication, and therefore a mistake.
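
As a rough illustration of working on UTF-8 data as it arrives, here is a
minimal Python sketch. The helper names count_code_points and find_all
are hypothetical, invented for this example, and the input is assumed to
be valid UTF-8. It counts code points and searches for a substring on
the raw bytes, without decoding to a fixed-width form first.

```python
def count_code_points(data: bytes) -> int:
    """Count code points in valid UTF-8 without decoding it.

    Continuation bytes have the bit pattern 10xxxxxx; every other
    byte starts a new code point.
    """
    return sum((b & 0xC0) != 0x80 for b in data)


def find_all(data: bytes, needle: bytes) -> list[int]:
    """Return the byte offsets of every occurrence of needle in data.

    Both arguments are assumed to be valid UTF-8. Because UTF-8 is
    self-synchronizing, a byte-level match of a complete encoded
    string cannot start in the middle of another character.
    """
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    text = "naïve café, naïve".encode("utf-8")
    print(count_code_points(text))
    print(find_all(text, "naïve".encode("utf-8")))
```

The offsets returned are byte offsets, which is what is needed to slice
or seek in the original UTF-8 data.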

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>