<vph70o$vglc$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.lang.c
Subject: Re: Simple string conversion from UCS2 to ISO8859-1
Date: Mon, 24 Feb 2025 08:27:19 +0100
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <vph70o$vglc$1@dont-email.me>
References: <vp9oml$3a0k5$1@dont-email.me>
 <7bf2c66d1f1ef9e92c00f44320bb998f3cea2183@i2pn2.org>
 <vp9sb4$3a0k4$5@dont-email.me> <vpbeke$3m0hp$1@dont-email.me>
 <vpbjqs$3qgam$1@dont-email.me> <20250222224856.891@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 24 Feb 2025 08:27:20 +0100 (CET)
Injection-Info: dont-email.me; posting-host="62131a79b8c80445c238c170c558e1d4";
	logging-data="1032876"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/LKKBqDRqBKI2BmmtBr65K"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
Cancel-Lock: sha1:CBO9Yw7eB3GffOTIU0zUXLiHYmE=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <20250222224856.891@kylheku.com>
Bytes: 3074

On 23.02.2025 08:03, Kaz Kylheku wrote:
> On 2025-02-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>> LF:	  	<any sequence of a single ASCII 0A or 0D, or both>
>>
>> It looks like they accept not only LF, CR, CR-LF, but also LF-CR.
>> Is the latter of any practical relevance?
> 
> Because if Unicode people spot the slightest opportunity to add
> pointless complexity to anything, they tend to pounce on it.

Given everything that's been collected in Unicode, they've long passed
the point where one character more or less would matter. ;-)

That said, I still think it's good to have one standard instead of
hundreds of individual, specific character sets and "codepage" variants.

> 
> Why just specify one line ending convention, when you can require the
> processor of the file to watch out for four different tokens denoting
> the line break?
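
For illustration, here is a minimal C sketch of what watching for those
four tokens might look like. The function names are made up for this
example, and the reading of the quoted rule (one 0A, one 0D, or one of
each in either order) is my assumption, not something taken from the
specification text.

```c
#include <stddef.h>

#define CH_LF 0x0A  /* ASCII line feed */
#define CH_CR 0x0D  /* ASCII carriage return */

/* Hypothetical helper: returns the length (0, 1 or 2) of the line-break
   token at the start of s[0..n). Assumed rule: a single LF, a single CR,
   or one of each in either order (CR-LF or LF-CR). */
size_t line_break_len(const char *s, size_t n)
{
    if (n == 0 || (s[0] != CH_LF && s[0] != CH_CR))
        return 0;
    /* A second, *different* break character completes CR-LF or LF-CR. */
    if (n > 1 && (s[1] == CH_LF || s[1] == CH_CR) && s[1] != s[0])
        return 2;
    return 1;
}

/* Counts line breaks in a buffer, treating all four tokens alike. */
size_t count_line_breaks(const char *s, size_t n)
{
    size_t count = 0;
    size_t i = 0;

    while (i < n) {
        size_t len = line_break_len(s + i, n - i);
        if (len > 0) {
            count++;
            i += len;
        } else {
            i++;
        }
    }
    return count;
}
```

Note that the greedy pairing makes LF CR LF count as two breaks (LF-CR,
then LF), whereas a reader that only knows LF and CR-LF would also see
two but split them differently (LF, then CR-LF). That ambiguity is one
practical cost of accepting LF-CR at all.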

Well, the history is (partly) understandable. Doesn't that stem from
the early IT days, when printers and their components were controlled
by atomic commands: CR, LF, BS [*]? Sending a text file with CR LF
line endings to the printer would perform the necessary raw printer
commands.[**]

I think at some early point in history they should have separated
printer control from text storage and standardized the line ending in
files on a single character.

Is it now too late, given that even some RFC protocol standards specify
CR LF as the line-ending sequence?

Janis

[*] I recall a mainframe terminal that echoed each password character
as a sequence  <PW-char> BS 'X' BS 'O'  etc., overprinting it to keep
it "unreadable". Of course you could see the PW by manually turning
the "drum" during the print process.

[**] OTOH I recall there was another control method, where the first
character of a line determined the printer control; sending a file to
the raw printer produces quite a mess.