From: Lawrence D'Oliveiro
Newsgroups: comp.lang.c
Subject: Re: Simple string conversion from UCS2 to ISO8859-1
Date: Sun, 23 Feb 2025 05:53:59 -0000 (UTC)

On Sat, 22 Feb 2025 05:29:14 +0100, Janis Papanagnou wrote:

> It looks like they accept not only LF, CR, CR-LF, but also LF-CR.
> Is the latter of any practical relevance?

Not to answer the question, but just to add to it; from the XML 1.1 spec:

    In addition, XML 1.0 attempts to adapt to the line-end conventions
    of various modern operating systems, but discriminates against the
    conventions used on IBM and IBM-compatible mainframes. As a result,
    XML documents on mainframes are not plain text files according to
    the local conventions. XML 1.0 documents generated on mainframes
    must either violate the local line-end conventions, or employ
    otherwise unnecessary translation phases before parsing and after
    generation. Allowing straightforward interoperability is
    particularly important when data stores are shared between
    mainframe and non-mainframe systems (as opposed to being copied
    from one to the other). Therefore XML 1.1 adds NEL (#x85) to the
    list of line-end characters. For completeness, the Unicode line
    separator character, #x2028, is also supported.
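
For what it's worth, here is a rough sketch of what that could look like for a UCS-2 buffer, the encoding in the subject line. It assumes the end-of-line rules as I read them in XML 1.1 section 2.11: CR LF, CR NEL, a lone CR, NEL and #x2028 each become a single LF. The function name normalize_line_ends_ucs2 is made up for illustration; it is not from the spec or from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rewrite line ends in a UCS-2 buffer in place,
 * following the XML 1.1 end-of-line rules as assumed above.
 * Returns the new length in code units (never more than len).
 */
size_t normalize_line_ends_ucs2(uint16_t *buf, size_t len)
{
    size_t in = 0, out = 0;

    assert(buf != NULL || len == 0);

    while (in < len) {
        uint16_t c = buf[in++];

        if (c == 0x000D) {
            /* CR, CR LF and CR NEL each collapse to one LF. */
            if (in < len && (buf[in] == 0x000A || buf[in] == 0x0085))
                in++;
            c = 0x000A;
        } else if (c == 0x0085 || c == 0x2028) {
            /* Lone NEL and LINE SEPARATOR also become LF. */
            c = 0x000A;
        }
        buf[out++] = c; /* out <= in, so this never overruns unread input */
    }
    return out;
}
```

Note that under these rules LF-CR is not a single line end: the LF stays as it is and the CR that follows is translated separately, so the pair comes out as two line feeds.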