From: Ed Morton
Newsgroups: comp.lang.awk
Subject: Re: [gawk] Handling variants of CSV input data formats
Date: Mon, 26 Aug 2024 19:49:00 -0500

On 8/26/2024 7:54 AM, Janis Papanagnou wrote:
<snip>
> I'd have liked to provide more concrete information here, but I'm at
> the moment even unable to reproduce Awk's behavior as documented in
> its manual; I've tried the following command with various locales
>
>    $ echo 4,321 | LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
>    -| 5,321
>
> but always got just 5 as result.

You need to specifically TELL gawk to use your locale to read input numbers:

$ echo 4,321 | LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
5

$ echo 4,321 | POSIXLY_CORRECT=1 LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
5,321

$ echo 4,321 | LC_ALL=en_DK.utf-8 gawk -N '{ print $1 + 1 }'
5,321

See
https://www.gnu.org/software/gawk/manual/gawk.html#Locale-influences-conversions
for more info on that.

Regards,

     Ed
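
A footnote to the above: if the locale in question isn't installed, or the script has to run under an awk other than gawk, one locale-independent alternative is to rewrite the decimal comma yourself before doing arithmetic. This is only a rough sketch. The add_one function name is made up for illustration, and it assumes the comma is purely a decimal separator (no thousands grouping, and the field is not part of a comma-separated record). Note that the result is printed with a decimal point, not a comma.

```shell
# Hypothetical helper (not from the gawk manual): add 1 to a number that
# uses a comma as its decimal separator, without relying on the locale.
add_one() {
    printf '%s\n' "$1" | LC_ALL=C awk '{
        x = $1
        sub(/,/, ".", x)   # turn the first comma into a decimal point
        print x + 1        # x now converts as an ordinary C-locale number
    }'
}

add_one 4,321
```

Forcing LC_ALL=C here keeps the conversion and the output format the same regardless of the caller's environment, which is the opposite trade-off from -N: you give up locale-aware output in exchange for predictable behavior.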