<vobl46$td52$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.lang.c
Subject: Re: Buffer contents well-defined after fgets() reaches EOF ?
Date: Mon, 10 Feb 2025 02:35:01 +0100
Organization: A noiseless patient Spider
Lines: 61
Message-ID: <vobl46$td52$1@dont-email.me>
References: <vo9g74$fu8u$1@dont-email.me> <vo9hlo$g0to$1@dont-email.me>
 <vo9ki6$gib5$1@dont-email.me> <voahgv$mdud$2@dont-email.me>
 <voaovv$ocot$1@dont-email.me> <87y0yepqg1.fsf@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 Feb 2025 02:35:03 +0100 (CET)
Injection-Info: dont-email.me; posting-host="7ae7436750940773d2334410c5fdf1e5";
	logging-data="963746"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/CpmLXfcYnGNYeLRbgU+ZM"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
Cancel-Lock: sha1:O1yOZ36k3ckC6ap4DDId/DeIwNQ=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <87y0yepqg1.fsf@nosuchdomain.example.com>
Bytes: 3989

On 10.02.2025 01:57, Keith Thompson wrote:
> [...]
> 
> Here's (some of) what the C standard says about text streams:
> 
>     A text stream is an ordered sequence of characters composed into
>     lines, each line consisting of zero or more characters plus a
>     terminating new-line character. Whether the last line requires
>     a terminating new-line character is implementation-defined.
> 
> For an implementation that *doesn't* require a new-line on the
> last line, a stream without a trailing new-line is valid.  For an
> implementation that *does* require it, such a stream is invalid,
> and a program that attempts to process it can have undefined behavior.

This is what "C" accepts (or tolerates), yes.

Given that some fancy editors make it possible to suppress (or not
create) the final line ending - bytes are still expensive, it seems -
I suppose it's a sensible requirement for "C" compilers to be tolerant
here.
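
For data files read with fgets(), a missing final new-line can be
detected by looking at the last character of each chunk returned. A
minimal sketch, assuming an implementation that accepts an
unterminated last line and data without embedded NUL bytes; the name
count_lines and the buffer size are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Counts the lines readable from fp via fgets().
 * *last_terminated is set to 1 if the final line ended in '\n'
 * (or if the stream was empty), and to 0 otherwise.
 * Assumption: the data contains no embedded NUL bytes. */
long count_lines(FILE *fp, int *last_terminated)
{
    char buf[256];
    long lines = 0;
    int at_line_start = 1;  /* next chunk begins a new line */

    *last_terminated = 1;
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        if (at_line_start)
            lines++;
        /* fgets() stops after '\n', on a full buffer, or at EOF;
         * only the first case yields a trailing '\n'. */
        at_line_start = (len > 0 && buf[len - 1] == '\n');
        *last_terminated = at_line_start;
    }
    return lines;
}
```

Lines longer than the buffer arrive in several chunks and are counted
once, since only a chunk ending in '\n' starts a new line.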

> 
> Most modern implementations don't require that trailing new-line.
> For example, `echo -n hello > hello.txt` creates a valid text file.
> Of course a C program that deals with text files can impose any
> additional restrictions its author likes.

And  cat alpha.c beta.c > gamma.c  will create inconsistent texts if
there's no line terminator on the last lines of some files.
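
To illustrate: if alpha.c ends in "#endif" without a new-line, plain
concatenation glues the first line of beta.c onto it. A small sketch
of a concatenation that supplies the missing terminator; join_texts is
a hypothetical helper working on in-memory strings, not something cat
offers:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding a followed by b.
 * If a is non-empty and does not end in '\n', one is inserted
 * so that the last line of a and the first line of b stay apart.
 * The caller frees the result; NULL is returned if malloc() fails. */
char *join_texts(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t need_nl = (la > 0 && a[la - 1] != '\n') ? 1 : 0;
    char *out = malloc(la + need_nl + lb + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, a, la);
    if (need_nl)
        out[la] = '\n';
    memcpy(out + la + need_nl, b, lb + 1);  /* includes the '\0' */
    return out;
}
```

With a = "#endif" and b = "#include <stdio.h>\n" the two directives
end up on separate lines instead of being merged into one.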

> 
> The above describes how a text stream looks to a C program.  The
> external representation can be quite different, with transformations
> to map between them. 

(Concerning this thread: in any case, I'm operating on custom data
files in plain text format, so I'm less concerned about how "C"
compilers expect their "C" source.)

> The most common such transformation is
> mapping the DOS/Windows CR-LF line terminator to LF on input, and
> vice versa on output.  Or the external representation might store
> each line as a fixed-length character sequence padded with spaces.

I appreciate that the editor I use keeps data consistent but allows
an explicit change between Unix and DOS text modes (where necessary
or if desired).
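
The CR-LF to LF mapping quoted above can also be done by hand on data
that was read in binary mode. A minimal sketch; crlf_to_lf is a
made-up name, and leaving a lone CR untouched is my assumption here:

```c
#include <stddef.h>

/* Rewrites buf in place so that every CR-LF pair becomes a single LF.
 * A CR that is not followed by LF is kept as it is.
 * Returns the new length, which is never larger than len. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t r;
    size_t w = 0;

    for (r = 0; r < len; r++) {
        if (buf[r] == '\r' && r + 1 < len && buf[r + 1] == '\n')
            continue;       /* drop the CR, the LF is copied next */
        buf[w++] = buf[r];
    }
    return w;
}
```

The function works on a byte count rather than a terminating NUL, so
the caller has to use the returned length afterwards.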

The most extreme context I had worked in was a company that allowed
every employee a free choice of computer technology; that led to
program text files that literally had all the inconsistencies. Since
many files were edited by different folks, there were all sorts of
line terminators mixed even within a single file, and the last lines
were either complete or not. The (some?) IDEs used were tolerant
WRT line terminators and their mixing. Other tools reacted sensitively.
The first thing I did was to write a "C" tool to detect and fix
these sorts of inconsistencies.
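
That tool is not shown here; the following is only a minimal sketch of
what the detection part of such a check might look like. The struct
eol_stats and the function scan_eols are names invented for this
illustration, and the data is assumed to be read in binary mode:

```c
#include <stddef.h>

/* Tally of the line terminators found in a buffer. */
struct eol_stats {
    long lf;                 /* LF not preceded by CR (Unix)  */
    long crlf;               /* CR-LF pairs (DOS/Windows)     */
    long cr;                 /* CR not followed by LF         */
    int  final_unterminated; /* 1 if the last line is cut off */
};

/* Scans len bytes of buf and counts each kind of terminator. */
struct eol_stats scan_eols(const char *buf, size_t len)
{
    struct eol_stats st = {0, 0, 0, 0};
    size_t i = 0;

    while (i < len) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                st.crlf++;
                i += 2;      /* consume both bytes of the pair */
            } else {
                st.cr++;
                i++;
            }
        } else {
            if (buf[i] == '\n')
                st.lf++;
            i++;
        }
    }
    if (len > 0 && buf[len - 1] != '\n' && buf[len - 1] != '\r')
        st.final_unterminated = 1;
    return st;
}
```

A file is mixed if more than one of the three counters is non-zero;
which convention to normalize to is then a separate decision.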

Janis