Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Byte Addressability And Beyond
Date: Fri, 10 May 2024 16:20:47 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 19
Message-ID: <2024May10.182047@mips.complang.tuwien.ac.at>
References: <v0s17o$2okf4$2@dont-email.me> <4e0557bec2acda4df76f1ed01ebcbdf6@www.novabbs.org> <v1ep4i$1ptf$1@gal.iecc.com> <v1eruj$3o1r8$1@dont-email.me> <v1h8l6$1ttd$1@gal.iecc.com> <v1kifk$17qh0$1@dont-email.me>
Injection-Date: Fri, 10 May 2024 18:35:22 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a6b7bebb769b414126acdfa8fcf79bf7"; logging-data="1541228"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+KkSvCOCw5HAn+HO5leWpP"
Cancel-Lock: sha1:gIQCKhicFOxRuQGhsdTmMe2bkXk=
X-newsreader: xrn 10.11
Bytes: 1828

David Brown <david.brown@hesbynett.no> writes:
>UTF-32 is fine for internal use, however - using whatever endianness
>your processor prefers.  The trick is never to let it leave the one
>computer in any encoding other than UTF-8.

An unnecessary complication.

1) I only came up with the following use cases where you need to deal
with individual non-ASCII characters: Palindrome checkers and anagram
programs; I remember somebody mentioning a third use (which I promptly
forgot), but anyway, there are few cases.

2) But even for those few cases, UTF-32 is not good enough, because a
code point is not a character.

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
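
Point 2 of the post can be illustrated with the palindrome use case it mentions. The sketch below is not from the post; it is a minimal Python example, assuming a deliberately simplified notion of "character" (a base code point plus any combining marks that follow it). The helper names `clusters`, `is_palindrome_codepoints` and `is_palindrome_clusters` are hypothetical. Python strings index by code point, so `text[::-1]` behaves like reversing a UTF-32 array.

```python
import unicodedata


def clusters(text):
    """Group each base code point with the combining marks that follow it.

    Simplified stand-in for real grapheme-cluster segmentation (UAX #29):
    it ignores ZWJ emoji sequences, Hangul jamo and similar cases.
    """
    groups = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        if groups and unicodedata.combining(ch):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups


def is_palindrome_codepoints(text):
    # Compares code point by code point, as a UTF-32 array would
    return text == text[::-1]


def is_palindrome_clusters(text):
    # Compares user-perceived characters (under the simplification above)
    groups = clusters(text)
    return groups == groups[::-1]


# "été" written with decomposed accents: e + U+0301, t, e + U+0301
word = "e\u0301te\u0301"

print(len(word), len(clusters(word)))
print(is_palindrome_codepoints(word), is_palindrome_clusters(word))
```

Here the word has five code points but three user-perceived characters, so the code-point check rejects a word that reads as a palindrome. NFC normalization would happen to repair this particular input, but not every base-plus-mark combination has a precomposed form, so a fixed-width code point encoding alone does not settle the problem.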