Path: ...!npeer.as286.net!npeer-ng0.as286.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Ed Morton <mortonspam@gmail.com>
Newsgroups: comp.lang.awk
Subject: Re: Breaking a table of record rows into an array
Date: Wed, 13 Mar 2024 18:45:41 -0500
Organization: A noiseless patient Spider
Lines: 109
Message-ID: <ustdr6$17a4o$1@dont-email.me>
References: <urslg4$18isd$1@toylet.eternal-september.org>
 <ursq3s$19kef$1@dont-email.me> <usnfod$3o4es$1@toylet.eternal-september.org>
 <usqkgn$he7u$2@dont-email.me> <65f17028$0$707$14726298@news.sunsite.dk>
 <87y1am5cfo.fsf@nosuchdomain.example.com> <20240313110839.989@kylheku.com>
 <87h6h96df7.fsf@nosuchdomain.example.com> <20240313143157.115@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 13 Mar 2024 23:45:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="42ac1681d1342478fe121f06fe65a95c";
	logging-data="1288344"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/3aFVg3gxTxH1RnegmyXcV"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:+XtTUerI4Et0AxqIQ7GnSvicoYE=
X-Antivirus: Avast (VPS 240313-12, 3/13/2024), Outbound message
In-Reply-To: <20240313143157.115@kylheku.com>
X-Antivirus-Status: Clean
Content-Language: en-US
Bytes: 6614

On 3/13/2024 4:49 PM, Kaz Kylheku wrote:
> On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> Kaz Kylheku <433-929-6894@kylheku.com> writes:
>>> On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>>>> arnold@freefriends.org (Aharon Robbins) writes:
>>>>> In article <usqkgn$he7u$2@dont-email.me>,
>>>>> Ed Morton  <mortonspam@gmail.com> wrote:
>>>>>> the effect of setting `NF` is
>>>>>> undefined behavior per POSIX and so will do different things in
>>>>>> different awk variants and even in 1 awk variant can behave differently
>>>>>> depending on whether you're setting it to a higher or lower than
>>>>>> original value
>>>>>
>>>>> This is not true. The effect of setting NF was well defined
>>>>> by the original awk book and also in POSIX.
>>>>>
>>>>> Decreasing NF throws away fields. Increasing NF adds the
>>>>> intervening fields with the null string as their values
>>>>> and rebuilds the record.
>>>>
>>>> I don't see that in the POSIX specification.
>>>
>>> The key is this:
>>>
>>>    References to nonexistent fields (that is, fields after $NF), shall
>>>    evaluate to the uninitialized value.
>>>
>>> NF is assignable, and fields after $NF do not exist. Thus if we
>>> have four fields and set NF = 3, then $4 doesn't exist.

That's a bit like the argument from an old episode of the UK comedy TV show 
"Yes, Prime Minister" where the Prime Minister's aide says (paraphrased) "Some 
country has done X, we must do something. War is something, therefore we 
must go to war".

Being able to set NF to 3 does not mean you must delete $4. Why not 
delete $1 or $2 instead? You'd still end up with 3 fields to satisfy the 
value of NF. Lots of things you can do are undefined by POSIX, however 
sensible some of their effects may seem; assigning a value to NF is just 
one more of them.
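
If you want to see what your own awk does when NF is lowered, here is a 
minimal sketch. The function name probe_nf_decrease is just an 
illustrative label, and the result only tells you about the awk you ran 
it with, not about what POSIX requires:

```shell
# Probe: which field, if any, does this awk drop when NF is lowered?
# probe_nf_decrease is a hypothetical helper name, not a standard tool.
probe_nf_decrease() {
    # Four input fields; set NF to 3, then print the record and NF.
    printf 'a b c d\n' | awk '{ NF = 3; print; print NF }'
}

probe_nf_decrease
```

With gawk I'd expect "a b c" followed by "3", i.e. the last field is the 
one dropped and $0 is rebuilt, but that is an observation about one 
implementation.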

You could say that "$0 holds the last record read, you can use $0 in the 
END section, therefore in the END section $0 must contain the value of 
the last record read". Except that's not true. From the gawk manual 
(https://www.gnu.org/software/gawk/manual/html_node/I_002fO-And-BEGIN_002fEND.html#I_002fO-And-BEGIN_002fEND):

----
Most probably due to an oversight, the standard does not say that $0 is 
also preserved, although logically one would think that it should be. In 
fact, all of BWK awk, mawk, and gawk preserve the value of $0 for use in 
END rules. Be aware, however, that some other implementations and many 
older versions of Unix awk do not.
----
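
The same kind of check works for $0 in END. This is only a sketch 
(probe_end_record is a made-up name), and it shows what one awk does, 
not what the standard guarantees:

```shell
# Probe: is $0 still the last record read when the END rule runs?
# probe_end_record is a hypothetical helper name.
probe_end_record() {
    # Two input records; print $0 in brackets from END so that an
    # empty $0 shows up visibly as "[]".
    printf 'first\nlast\n' | awk 'END { print "[" $0 "]" }'
}

probe_end_record
```

An awk that preserves $0 prints "[last]"; one that doesn't would 
typically print "[]".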

>>
>> That describes what happens if NF is modified by assignment, but I don't
>> see that it implies that such an assignment is allowed.
> 
> "The left-hand side of an assignment and the target of increment and
> decrement operators can be one of a variable, an array with index, or a
> field selector."
> 
> NF is described as a variable.  Some unique remarks are made about NF,
> but none deny that it's assignable like any other variable.
> 
>> But I can imagine a hypothetical awk-like language in which assigning to
>> NF has undefined behavior.  My question is, how does the POSIX
>> specification not describe that language?
> 
> That language is failing to support an instance of a variable
> being the left operand of an assignment, which a variable "can be".
> 
> It looks like the violation of a requirement.
> 
>> On the other hand, it also implies that `foo = 42` is valid where `foo`
>> is the name of a user-defined function (gawk disallows it).
> 
> POSIX does say that "[t]he same name shall not be used as both a
> function parameter name and as the name of a function or a special awk
> variable." So foo = 42 isn't valid if foo is already a function.
> 
> Also: "The same name shall not be used both as a variable name with
> global scope and as the name of a function. The same name shall not be
> used within the same scope both as a scalar variable and as an array."
> 
> All that said, the business of the NF tail wagging the $1, $2, ...
> legs of the dog should be the target of at least one clarifying remark,
> and the other defects should also be corrected:
> 
> - In a BEGIN clause NF should be undefined unless any action
>    whatsoever is executed that sets its value: direct assignment,
>    use of getline or assignment to $0.
> 
> - At the start of the execution of an END clause, NF retains
>    its current value (or undefined status, if it was never set);
>    the END clause has no implicit effect on NF.
> 

All of the above claims that POSIX states you can assign a value to NF. 
That may or may not be correct. I expect it is, but I don't care, because 
nothing above or in the POSIX spec states what the IMPACT is of 
assigning a value to NF. As far as I can see, there is absolutely nothing 
in the POSIX spec that says anything like "if you set NF to a higher 
value, fields will be created, and if you set NF to a lower value, fields 
will be removed", but I'd honestly love to be proven wrong and shown the 
section that does define the impact of assigning a higher or lower 
value to NF.
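
For the "higher value" case, a small sketch like this shows what a 
given awk does. The name probe_nf_increase and the choice of "-" as OFS 
are mine, picked only to make any added empty fields visible:

```shell
# Probe: what does this awk do when NF is raised above the field count?
# probe_nf_increase is a hypothetical helper name; OFS is set to "-"
# only so that empty trailing fields are visible in the output.
probe_nf_increase() {
    # Two input fields; set NF to 4, then print the record and NF.
    printf 'a b\n' | awk -v OFS=- '{ NF = 4; print; print NF }'
}

probe_nf_increase
```

With gawk I'd expect "a-b--" followed by "4", i.e. two empty fields 
appended and $0 rebuilt using OFS. Again, that's what an implementation 
does, which is a separate question from what the spec says.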

      Ed.