<votajb$n590$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.lang.c
Subject: Re: Buffer contents well-defined after fgets() reaches EOF ?
Date: Sun, 16 Feb 2025 19:25:46 +0100
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <votajb$n590$1@dont-email.me>
References: <vo9g74$fu8u$1@dont-email.me> <vo9hlo$g0to$1@dont-email.me>
 <vo9khf$ggd4$1@dont-email.me> <vobf3h$sefh$2@dont-email.me>
 <vobjdt$t5ka$1@dont-email.me> <vobkd5$t7np$1@dont-email.me>
 <20250210124911.00006b31@yahoo.com> <86ldu9zxkb.fsf@linuxsc.com>
 <20250214165108.00002984@yahoo.com> <20250214085627.815@kylheku.com>
 <voo6sc$3k640$1@dont-email.me> <20250215192911.0000793d@yahoo.com>
 <20250215225202.179@kylheku.com> <20250216110546.00003fb7@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 16 Feb 2025 19:25:48 +0100 (CET)
Injection-Info: dont-email.me; posting-host="aaf6e5d3524c9b1f9a0562d9d6447128";
	logging-data="759072"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19ln8DfmGHFeZt7zyeoaMjh"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
Cancel-Lock: sha1:V2cCKQ1YzWY3ySV2C08zXaCE34k=
In-Reply-To: <20250216110546.00003fb7@yahoo.com>
Bytes: 3157

On 16.02.2025 10:05, Michael S wrote:
> On Sun, 16 Feb 2025 07:32:23 -0000 (UTC)
> Kaz Kylheku <643-408-1753@kylheku.com> wrote:
> 
>> [...]
>> The strtok function is ill-suited to many situations.  For instance,
>> there are situations in which you do want empty tokens, like CSV, such
>> that ",abc,def," shows four tokens, two of them empty.
>>
>> With the strspn and strcspn building blocks, you can easily whip up a
>> custom tokenizing loop that has the right semantics for the situation.
>>
>> We can also write our loop such that it restores the original
>> character that was overwritten in order to null-terminate the token,
>> simply by adding *end = more.  Thus when the loop ends, the string
>> is restored to its original state.
>>
>> I can understand code like that above without having to look up
>> anything, but if I see strtok or strtok_r code after many years of not
>> working with strtok, I will need a refresher on how exactly they
>> define a token.
> 
> For parsing something important and relatively well-defined, like
> CSV, I'd very seriously consider the option of not using the standard
> str* utilities at all, with the exception of those where coding your
> own requires special expertise, i.e. primarily strtod(). BTW, even
> strtod() can't be blindly relied on for .csv, because it accepts hex
> floats, while a standard CSV parser has to reject them.
> Most likely, avoiding fgets() is also a good idea in this case.
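
[To illustrate the strtod() point: since C99, strtod() also accepts
hexadecimal floats such as "0x1p3", as well as "inf" and "nan" and
leading white space. One possible way to keep strtod() while rejecting
those forms is to check the characters of the field first. This is only
a sketch; parse_decimal_field is a hypothetical name, the accepted
character set is an assumption about what a given CSV consumer wants,
and the code assumes the "C" locale, where the decimal point is '.'.]

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a complete field to double, accepting
   plain decimal notation only.  Returns false for empty fields, hex
   floats, "inf"/"nan", leading white space, trailing junk and range
   errors; *out is written only on success. */
static bool parse_decimal_field(const char *field, double *out)
{
    assert(field != NULL && out != NULL);

    /* Pre-filter: every character must belong to decimal notation.
       This is what rules out 'x', 'p', letters and white space. */
    if (*field == '\0' ||
        field[strspn(field, "+-0123456789.eE")] != '\0')
        return false;

    char *end;
    errno = 0;
    double value = strtod(field, &end);

    /* Require that something was converted, that the whole field was
       consumed, and that no range error was reported. */
    if (end == field || *end != '\0' || errno == ERANGE)
        return false;

    *out = value;
    return true;
}
```

[With this, "1.5" and "-2e3" convert, while "0x1p3", "1e" and the empty
string are rejected.]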

I certainly wouldn't call the CSV format "well-defined". But CSV is
certainly nasty enough that I'd use an existing CSV library rather
than re-invent the wheel in the first place.

Janis