Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Lawrence D'Oliveiro <ldo@nz.invalid>
Newsgroups: comp.arch
Subject: Re: python text, Byte Addressability And Beyond
Date: Mon, 27 May 2024 01:09:44 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <v30mgo$3min8$3@dont-email.me>
References: <v0s17o$2okf4$2@dont-email.me> <2024May10.182047@mips.complang.tuwien.ac.at> <v1ns43$2260p$1@dont-email.me> <2024May11.173149@mips.complang.tuwien.ac.at> <v1ossl$1ps0$1@gal.iecc.com> <2024May12.074045@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 27 May 2024 03:09:45 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1c25528c6c7e7a3bd6cadc9483b25fb8"; logging-data="3885800"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18qp5z/N3Kw3e8IQtADB6Vr"
User-Agent: Pan/0.158 (Avdiivka; )
Cancel-Lock: sha1:5Q8AU1Scm+Y+aSVHEgXWYFKur9I=
Bytes: 1693

On Sun, 12 May 2024 05:40:45 GMT, Anton Ertl wrote:

> This is a nice demonstration of the unnecessary complexity that the
> codepoint mistake leads to. ...
>
> But if they had decided to just store the data as UTF-8 and use byte
> indexes and lengths in their API, and adjusted the rest of their API
> accordingly, they could have avoided this complexity and
> inefficiency ...

But UTF-8 is just a representation of code points, not characters. So I
don’t understand why one way leads to “unnecessary complexity” and the
other way does not.
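
To make the code-point-versus-character distinction concrete, here is a minimal Python sketch. It is not from the thread: the sample string and the helper name `describe` are illustrative assumptions. It takes one user-perceived character ("é" written as "e" plus a combining accent) and shows that neither code point indexes nor UTF-8 byte indexes address it as a single unit.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: displayed as one character.
decomposed = "e\u0301"
# NFC normalization folds the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)


def describe(s):
    """Return (number of code points, number of UTF-8 bytes) for s."""
    return len(s), len(s.encode("utf-8"))


print(describe(decomposed))  # (2, 3)
print(describe(composed))    # (1, 2)

# A code point index can cut the character apart: the accent is lost.
print(decomposed[:1] == "e")  # True

# A byte index can cut it apart too, here in the middle of U+0301,
# which leaves a sequence that is not even valid UTF-8.
try:
    decomposed.encode("utf-8")[:2].decode("utf-8")
    byte_slice_valid = True
except UnicodeDecodeError:
    byte_slice_valid = False
print(byte_slice_valid)  # False
```

Either indexing scheme needs extra logic (normalization, or grapheme cluster segmentation, which the Python standard library does not provide) before an index can be said to refer to a character.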