From: Janis Papanagnou
Newsgroups: comp.lang.awk
Subject: [gawk] Handling variants of CSV input data formats
Date: Sun, 25 Aug 2024 08:00:20 +0200

I don't usually use CSV formats myself, but I recently recommended
GNU Awk (given that newer versions support CSV data processing) to a
friend who was looking for a CSV solution.

I was quite astonished when I stumbled across a StackOverflow article [*]
about CSV processing with contemporary versions of GNU Awk and read that
you are restricted to the comma as separator and double quotes to enclose
strings. The workarounds provided at SO were extremely clumsy.

Given that using ',', ';', '|' (or other delimiters), and also various
types of quotes, is just a lexical (not a functional) difference, I wonder
whether it would be sensible to be able to define them, say, through
setting a PROCINFO element?

Janis

[*] https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk