From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.lang.c
Subject: Re: Buffer contents well-defined after fgets() reaches EOF ?
Date: Sun, 16 Feb 2025 19:25:46 +0100
Organization: A noiseless patient Spider
Message-ID: <votajb$n590$1@dont-email.me>
References: <vo9g74$fu8u$1@dont-email.me> <vo9hlo$g0to$1@dont-email.me> <vo9khf$ggd4$1@dont-email.me> <vobf3h$sefh$2@dont-email.me> <vobjdt$t5ka$1@dont-email.me> <vobkd5$t7np$1@dont-email.me> <20250210124911.00006b31@yahoo.com> <86ldu9zxkb.fsf@linuxsc.com> <20250214165108.00002984@yahoo.com> <20250214085627.815@kylheku.com> <voo6sc$3k640$1@dont-email.me> <20250215192911.0000793d@yahoo.com> <20250215225202.179@kylheku.com> <20250216110546.00003fb7@yahoo.com>
In-Reply-To: <20250216110546.00003fb7@yahoo.com>

On 16.02.2025 10:05, Michael S wrote:
> On Sun, 16 Feb 2025 07:32:23 -0000 (UTC)
> Kaz Kylheku <643-408-1753@kylheku.com> wrote:
>
>> [...]
>> The strtok function is ill-suited to many situations. For instance,
>> there are situations in which you do want empty tokens, like CSV, such
>> that ",abc,def," shows four tokens, two of them empty.
>>
>> With the strspn and strcspn building blocks, you can easily whip up a
>> custom tokenizing loop that has the right semantics for the situation.
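
For illustration only (this is not the code from the quoted post, which is elided above): a minimal sketch of a strcspn-based loop that keeps empty tokens, so ",abc,def," yields four fields. The names split_fields and field_fn are hypothetical; the variable name more simply mirrors the quoted text. The sketch steps over exactly one delimiter per field; where empty tokens are not wanted, strspn could be used to skip a whole run of delimiters instead. It does not handle quoted CSV fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one NUL-terminated field at a time. */
typedef void (*field_fn)(const char *field, void *ctx);

/*
 * Split `line` on any character in `delims`, keeping empty fields.
 * Each field is temporarily NUL-terminated in place, then the overwritten
 * character is put back, so `line` is unchanged when the function returns.
 * Returns the number of fields seen.
 */
size_t split_fields(char *line, const char *delims, field_fn fn, void *ctx)
{
    assert(line != NULL && delims != NULL);

    size_t count = 0;
    char *start = line;

    for (;;) {
        char *end = start + strcspn(start, delims); /* delimiter or final NUL */
        char more = *end;                           /* remember what we overwrite */

        *end = '\0';
        if (fn != NULL)
            fn(start, ctx);
        count++;
        *end = more;                                /* restore the original byte */

        if (more == '\0')
            break;
        start = end + 1;                            /* step over one delimiter only */
    }
    return count;
}
```

Passing NULL for fn just counts the fields; a real caller would pass a function that copies or converts each field while it is still NUL-terminated.
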
>>
>> We can also write our loop such that it restores the original
>> character that was overwritten in order to null-terminate the token,
>> simply by adding *end = more. Thus when the loop ends, the string
>> is restored to its original state.
>>
>> I can understand code like that above without having to look up
>> anything, but if I see strtok or strtok_r code after many years of not
>> working with strtok, I will need a refresher on how exactly they
>> define a token.
>
> For parsing of something important and relatively well-defined, like
> CSV, I'd very seriously consider the option of not using standard str*
> utilities at all, with the exception of those where coding your own
> requires special expertise, i.e. primarily strtod(). BTW, even strtod()
> can't be blindly relied on for .csv, because it accepts hex floats,
> while a standard CSV parser has to reject them.
> Most likely, avoiding fgets() is also a good idea in this case.

I certainly wouldn't call the CSV format "well-defined".

But CSV is certainly nasty enough that I'd use an existing CSV library
rather than re-invent the wheel in the first place.

Janis
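
To make the strtod() point above concrete: strtod() accepts hexadecimal floats such as "0x1p4", as well as "inf", "nan" and leading whitespace. A minimal sketch of one way to guard against that, assuming the "C" locale (decimal point is '.') and assuming the field has already been isolated as a NUL-terminated string. The function name parse_plain_decimal is hypothetical, and range errors (ERANGE) are deliberately not checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Accept only plain decimal notation (digits, sign, '.', exponent).
 * The character whitelist rejects hex floats, "inf", "nan" and whitespace
 * before strtod() ever sees them; strtod() then validates the actual syntax.
 * Returns true and stores the value in *out on success.
 */
bool parse_plain_decimal(const char *s, double *out)
{
    assert(s != NULL && out != NULL);

    size_t len = strlen(s);
    if (len == 0 || strspn(s, "0123456789+-.eE") != len)
        return false;               /* empty, or contains a disallowed character */

    char *end;
    double value = strtod(s, &end);
    if (end != s + len)
        return false;               /* strtod() did not consume the whole field */

    *out = value;
    return true;
}
```

With this check, "1.5e3" is accepted while "0x1p4", "inf", " 12" and "12abc" are all rejected.
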