Path: ...!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: John Levine <johnl@taugh.com>
Newsgroups: comp.arch
Subject: Re: Unicode in strings
Date: Tue, 28 May 2024 01:25:38 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <v33bqi$1ulm$1@gal.iecc.com>
References: <v0s17o$2okf4$2@dont-email.me> <v31ddp$3u8om$1@dont-email.me> <v3283t$1use$2@gal.iecc.com> <v33apl$9cst$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 28 May 2024 01:25:38 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="64182"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <v0s17o$2okf4$2@dont-email.me> <v31ddp$3u8om$1@dont-email.me> <v3283t$1use$2@gal.iecc.com> <v33apl$9cst$3@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
Bytes: 2151
Lines: 26

According to Lawrence D'Oliveiro <ldo@nz.invalid>:
>On Mon, 27 May 2024 15:16:13 -0000 (UTC), John Levine wrote:
>
>> According to Lawrence D'Oliveiro <ldo@nz.invalid>:
>>>On Wed, 22 May 2024 15:38:51 -0400, Stefan Monnier wrote:
>>>
>>>> I don't know of any language (or even library) that supports the
>>>> notion of "character" for Unicode strings. 🙁
>>>
>>> Surely a “character” (or “grapheme” I think is (one of) the Unicode
>>> terms) is (represented by) a non-combining code point combined with all
>>> the immediately-following combining code points.
>>
>> Take another look at the table I referred to yesterday. When you have
>> ZWJ the rules of what combines with what gets awfully complicated.
>
>ZWJ is classed as “punctuation”, and has no combining class. So it forms a
>“character” or “grapheme” it its own right.

Really, you need to look at that combined emoji table I told you about yesterday.
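
For anyone who wants to see the disagreement concretely, here is a minimal
sketch of the "non-combining code point plus following combining code points"
rule quoted above, applied to a ZWJ emoji sequence. The helper name
naive_clusters and the sample strings are made up for illustration, and this
is not the segmentation algorithm from the Unicode standard; it only uses the
canonical combining class reported by Python's unicodedata module.

```python
import unicodedata

def naive_clusters(text):
    """Group each code point of combining class 0 with the
    nonzero-combining-class code points that immediately follow it.
    (Hypothetical helper: this is the naive rule, not UAX #29.)"""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # attach the mark to the preceding base
        else:
            clusters.append(ch)     # start a new "character"
    return clusters

# "e" + COMBINING ACUTE ACCENT: the naive rule gives one cluster.
accented = "e\u0301"
print(len(naive_clusters(accented)))

# MAN + ZWJ + WOMAN + ZWJ + GIRL, normally rendered as one family emoji.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(len(naive_clusters(family)))

# What the Unicode database says about ZWJ (U+200D):
# general category and canonical combining class.
print(unicodedata.category("\u200D"), unicodedata.combining("\u200D"))
```

Under the naive rule the family sequence falls apart into five pieces, with
each ZWJ standing alone, because ZWJ has combining class 0. Getting a single
user-perceived character out of it takes the extended grapheme cluster rules,
which treat ZWJ between emoji specially.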
-- 
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly