From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Wed, 29 May 2024 07:59:21 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024May29.095921@mips.complang.tuwien.ac.at>
References: <v0s17o$2okf4$2@dont-email.me> <v30mqu$3min8$5@dont-email.me> <v30slp$cvt$1@gal.iecc.com> <v31c4r$3u28v$1@dont-email.me> <v327n3$1use$1@gal.iecc.com> <v33bop$9cst$10@dont-email.me>

Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>The fixed-size things are references to objects. Or in a lower-level
>language like C, they could indeed be pointers/indexes into an array of
>code points.

There is no need for UTF-32 for such an approach.  Just let the
pointers/indexes point to the start of the character in UTF-8
representation.

>[...] we still have easy random access, and the length of the
>array is the number of characters.

Both of which are rarely necessary.  But sure, if you need that, the
approach of having an array of pointers to characters in UTF-8
representation works, while converting to UTF-32 does not help at all.

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
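
For illustration only, here is a minimal sketch of the index-array idea
discussed above. It is not code from the thread. It assumes the input is
valid UTF-8 and treats "character" as "code point"; the function names
utf8_char_starts and char_at are hypothetical.

```python
def utf8_char_starts(buf: bytes) -> list:
    """Return the byte offset at which each code point starts.

    Assumption: buf holds valid UTF-8. Continuation bytes have the
    bit pattern 10xxxxxx, so every other byte begins a code point.
    """
    return [i for i, b in enumerate(buf) if (b & 0xC0) != 0x80]


def char_at(buf: bytes, starts: list, n: int) -> str:
    """Decode the n-th code point by slicing between two offsets."""
    begin = starts[n]
    # The last code point runs to the end of the buffer.
    end = starts[n + 1] if n + 1 < len(starts) else len(buf)
    return buf[begin:end].decode("utf-8")


# The text stays in UTF-8; only the offset table has fixed-size entries.
text = "naïve €5 😀".encode("utf-8")
starts = utf8_char_starts(text)

print(len(text), "bytes,", len(starts), "code points")
print(char_at(text, starts, 6))
```

The length of the offset table is the number of code points, and
indexing it gives random access, all without converting the text
itself to UTF-32.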