Path: ...!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: John Levine <johnl@taugh.com>
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Mon, 27 May 2024 19:09:51 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <v32lpv$1u25$1@gal.iecc.com>
References: <v0s17o$2okf4$2@dont-email.me> <v31c4r$3u28v$1@dont-email.me> <v327n3$1use$1@gal.iecc.com> <BM25O.40665$HBac.4762@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 27 May 2024 19:09:51 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="63557"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <v0s17o$2okf4$2@dont-email.me> <v31c4r$3u28v$1@dont-email.me> <v327n3$1use$1@gal.iecc.com> <BM25O.40665$HBac.4762@fx15.iad>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
Bytes: 2253
Lines: 25

According to EricP <ThatWouldBeTelling@thevillage.com>:
>John Levine wrote:
>> If you mean an array of pointers to sequences of code points, well
>> sure, but now we're back to variable length encodings. I know that I
>> have no idea how big these fixed size things would have to be and i
>> suspect nobody else does either.
>
>One could have instructions that make it easier to parse the
>variable length UTF-8 sequences into codepoints.

That would be the CU14 instruction on zSeries, to turn UTF-8 into
UTF-32. CU41 goes the other way.

>It would still have to look up whether a codepoint was combining or
>stand alone. I don't see a firm definition whether combining codepoints
>come before or after, after requiring a lookahead parse and so extra
>checks to ensure it doesn't look past the buffer end.

I think they come after but I haven't looked in enough detail.
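
As a rough illustration of the two steps discussed above (and not of the
zSeries instructions themselves), the sketch below decodes UTF-8 into code
points in Python and then attaches each combining mark to the code point
before it. It assumes the Unicode convention that combining marks follow
their base character. The helper name group_combining is made up for this
example. It only looks at the canonical combining class, so marks whose
class is 0 are missed, and it does not attempt the full grapheme cluster
rules.

```python
import unicodedata

def group_combining(data: bytes) -> list[str]:
    """Decode UTF-8 and attach each combining mark to the preceding base."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on malformed input
    clusters: list[str] = []
    for ch in text:
        # combining() returns the canonical combining class; 0 means none.
        if unicodedata.combining(ch) and clusters:
            # The mark follows its base, so we only ever look backwards.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# 'e' + COMBINING ACUTE ACCENT + 'a' (hypothetical sample input)
sample = "e\u0301a".encode("utf-8")
print([[hex(ord(c)) for c in cl] for cl in group_combining(sample)])
# prints [['0x65', '0x301'], ['0x61']]
```

Because the marks trail the base, a cluster is only known to be complete
when the next non-combining code point (or the end of the buffer) is seen,
which is the lookahead concern raised in the quoted text.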
And then you have all of the issues with precomposed characters: do
you normalize as you go or denormalize, or what?

-- 
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
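
On the precomposed-character question, a minimal Python sketch of what the
choice looks like in practice, using the standard unicodedata module. The
sample strings are assumptions picked for illustration: NFC composes a base
plus combining mark into the precomposed form where one exists, and NFD
goes the other way.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The two spellings render alike but compare unequal code point by code point.
print(precomposed == decomposed)                                # False

# Normalizing both sides to the same form makes them comparable.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

Either form works for comparison as long as both inputs use the same one.
NFD makes the strings longer, which matters for the fixed-size question
earlier in the thread.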