From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.lang.c
Subject: Re: Simple string conversion from UCS2 to ISO8859-1
Date: Mon, 24 Feb 2025 08:27:19 +0100
Organization: A noiseless patient Spider
Message-ID: <vph70o$vglc$1@dont-email.me>
References: <vp9oml$3a0k5$1@dont-email.me> <7bf2c66d1f1ef9e92c00f44320bb998f3cea2183@i2pn2.org> <vp9sb4$3a0k4$5@dont-email.me> <vpbeke$3m0hp$1@dont-email.me> <vpbjqs$3qgam$1@dont-email.me> <20250222224856.891@kylheku.com>
In-Reply-To: <20250222224856.891@kylheku.com>

On 23.02.2025 08:03, Kaz Kylheku wrote:
> On 2025-02-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>> LF: <any sequence of a single ASCII 0A or 0D, or both>
>>
>> It looks like they accept not only LF, CR, CR-LF, but also LF-CR.
>> Is the latter of any practical relevance?
>
> Because if Unicode people spot the slightest opportunity to add
> pointless complexity to anything, they tend to pounce on it.

Given everything that's been collected in Unicode, they've long passed
the point where one character more or less would matter. ;-)

That said, I think it's good anyway to have one standard instead of
hundreds of individual, specific character sets and "codepage" variants.

>
> Why just specify one line ending convention, when you can require the
> processor of the file to watch out for four different tokens denoting
> the line break?

Well, the history is (partly) understandable. Doesn't that stem from
the early IT days, when printers and their components were controlled
by atomic commands: CR, LF, BS [*]? Sending a text file with CR LF to
the printer would perform the necessary raw printer commands. [**]

I think at some early point in history they should have differentiated
between the two uses and standardized the line ending in files to a
single character. Is it too late now, given that even some RFC protocol
standards specify CR LF as the line-ending sequence?

Janis

[*] I recall a mainframe terminal that echoed the password as sequences
of <PW-char> BS 'X' BS 'O' etc. to keep it "unreadable". Of course you
could see the PW by manually turning the "drum" during the print process.

[**] OTOH I recall there was another control method, where the first
character of a line determined the printer control; sending an ordinary
file to the raw printer produced quite a mess.
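
For illustration, the rule quoted at the top ("any sequence of a single ASCII 0A or 0D, or both") could be handled along the following lines. This is only a sketch of one possible reading of that rule, not code from any specification; the names eol_length and count_line_breaks are hypothetical, and it assumes that a repeated character (LF LF or CR CR) means two line breaks rather than one.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_LF 0x0A
#define ASCII_CR 0x0D

/* Length of the line terminator starting at s[0], looking at no more
 * than n bytes: 0 if there is none, 1 for a lone LF or CR, 2 for
 * CR-LF or LF-CR. (Hypothetical helper, assumed reading of the rule.) */
size_t eol_length(const char *s, size_t n)
{
    if (n == 0 || (s[0] != ASCII_LF && s[0] != ASCII_CR))
        return 0;
    /* A second, different control character completes a two-byte
     * terminator; a repeated one starts a new (empty) line instead. */
    if (n > 1 && (s[1] == ASCII_LF || s[1] == ASCII_CR) && s[1] != s[0])
        return 2;
    return 1;
}

/* Count the line terminators in buf[0..n), pairing greedily from the left. */
size_t count_line_breaks(const char *buf, size_t n)
{
    size_t i = 0, breaks = 0;

    assert(buf != NULL || n == 0);
    while (i < n) {
        size_t len = eol_length(buf + i, n - i);
        if (len > 0) {
            breaks++;
            i += len;   /* skip the whole terminator */
        } else {
            i++;        /* ordinary character */
        }
    }
    return breaks;
}
```

Called on the 11 bytes of "a\nb\rc\r\nd\n\re", count_line_breaks finds four breaks, one for each of the four tokens. Accepting LF-CR has a cost, though: in a file that uses bare LF and bare CR side by side, an LF directly followed by a CR is read as one break, not two, so a reader like this cannot tell those cases apart.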