Path: ...!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: John Levine <johnl@taugh.com>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Mon, 27 May 2024 02:54:49 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <v30slp$cvt$1@gal.iecc.com>
References: <v0s17o$2okf4$2@dont-email.me> <2024May11.173149@mips.complang.tuwien.ac.at> <v1o7i8$24m7i$1@dont-email.me> <v30mqu$3min8$5@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 27 May 2024 02:54:49 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="13309"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <v0s17o$2okf4$2@dont-email.me> <2024May11.173149@mips.complang.tuwien.ac.at> <v1o7i8$24m7i$1@dont-email.me> <v30mqu$3min8$5@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
Bytes: 2163
Lines: 27

It appears that Lawrence D'Oliveiro <ldo@nz.invalid> said:
>On Sat, 11 May 2024 18:49:12 +0200, David Brown wrote:
>
>> People often think it is easier to do string manipulation - joining,
>> splitting, replacing, etc., - when you have fixed size units per
>> character.
>
>It is easy enough to come up with a fixed-size representation for
>characters in Python (or other suitably powerful language), where
>“character” = “non-combining code point plus all immediately-following
>combining code points”.

I have to ask, how much storage does each of these fixed-size character
things take? How do you know? I've been poking at Unicode for a while
and I don't have the faintest idea, particularly if you include groups
of emoji with ZWJ that are rendered as one image, as in this
ever-increasing list.
Groups can have 9 code points, maybe more:

https://www.unicode.org/emoji/charts/emoji-zwj-sequences.html

-- 
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
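
To make the storage question concrete, here is a minimal Python sketch of the quoted definition ("non-combining code point plus all immediately-following combining code points"). It is an illustration only, not anyone's actual implementation. The helper names `clusters` and `utf8_sizes` are made up for this example, and it assumes "combining" means a non-zero canonical combining class as reported by the standard `unicodedata.combining`. Under that assumption the units are not fixed-size at all, and a ZWJ emoji group is not even kept together, because ZWJ and the emoji themselves all have combining class 0.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def clusters(text):
    """Group text into 'base code point + following combining code points'.

    Assumption: a code point counts as combining when its canonical
    combining class is non-zero. Other readings (for example, general
    category M*) would give slightly different groupings.
    """
    out = []
    for ch in text:
        if out and unicodedata.combining(ch) != 0:
            out[-1] += ch      # attach the mark to the preceding base
        else:
            out.append(ch)     # start a new unit
    return out


def utf8_sizes(text):
    """UTF-8 byte length of each unit returned by clusters()."""
    return [len(c.encode("utf-8")) for c in clusters(text)]


# 'e' + COMBINING ACUTE ACCENT, then a plain 'a'
accented = "e\u0301a"

# MAN, ZWJ, WOMAN, ZWJ, GIRL, ZWJ, BOY: usually rendered as one image
family = ZWJ.join(["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"])

if __name__ == "__main__":
    print(clusters(accented), utf8_sizes(accented))
    print(len(family), "code points ->", len(clusters(family)), "units")
```

With this rule the accented letter comes out as one unit of two code points, while the family emoji comes out as seven separate units rather than one. Getting the whole ZWJ group into a single unit needs something closer to the extended grapheme cluster rules, and then the size of a unit is bounded only by whatever sequences are on the list above.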