From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Subject: A feature I'd like to see in GAWK...
Date: Mon, 15 Jul 2024 18:28:31 -0000 (UTC)
Organization: The official candy of the new Millennium

As we know, AWK in general, and GAWK in particular, has several different
ways of getting data into the program.  In addition to the Automatic Input
Loop (the main feature of AWK), there are several variations of "getline".
"getline" can be used with files, or with processes (in 2 different ways!),
or even with network sockets.

But the problem with getline is that using it breaks the Automatic Input
Loop.  You can't use the standard "pattern/action" paradigm if your input is
coming in via "getline".  Yes, there are workarounds and yes we've all
gotten used to it, but it is a shame.  For one thing, you can write your
program as a shell script, and use the shell to pipe in the data from a
process.  But this is ugly.  And not always sufficient.

Now, I have written a GAWK extension to handle this - called "pipeline".
Here is a sample script that uses "pipeline".  Note that the Linux "df"
command has a "-l" option to show you only the local filesystems, but what I
usually want is the non-local ones - that's much more interesting.  The only
way I can figure how to get that is to run "df" twice and compare the output
with and without "-l".

Here is my program (non-local-df):

--- Cut Here ---
@load "pipeline"
@include "abort"
# Note: You can ignore the "abort" stuff.  It is part of my ecosystem, but
# probably not part of yours.
BEGIN {
    testAbort(ARGC > 1,"This program takes no args!!!",1)
    pipeline("in","df -l")
    while (ARGC < 3)
        ARGV[ARGC++] = "-"
}
ENDFILE { if (ARGIND == 1) pipeline("in","df") }
ARGIND == 1 { x[$1]; next }
FNR == 1 || !($1 in x)
--- Cut Here ---

Needless to say, I'd like to see this sort of functionality built-in.

It seems to me that GAWK has been sort of fishing around lately looking for
new worlds to conquer.  Some features have been added lately that seem (to
me anyway) sort of "out of place".  namespaces, MPFR arithmetic (apparently,
now deprecated), persistent memory (nifty idea, though I don't really see
the practicality - and have not gotten around to testing it - i.e.,
compiling up a new enough version to try it).

I think something like the above would be more in line with the sort of
things I'd like to see in GAWK.

-- 
Adderall, pseudoephed, teleprompter
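
For comparison, here is a minimal sketch of the "getline" workaround the post describes, written as a shell function around plain POSIX awk with no extension loaded. The function name only_in_second and the variable names first and second are made up for this illustration; they are not part of GAWK or of the "pipeline" extension. Both commands are read with explicit "cmd | getline" loops inside BEGIN, so the pattern/action rules of the original script have to be rewritten as an if test. That is the loss the post complains about.

```sh
#!/bin/sh
# only_in_second CMD1 CMD2   (hypothetical helper, for illustration only)
# Print the first line of CMD2's output (the header), plus every later
# line whose first field never appears as a first field in CMD1's output.
only_in_second() {
    awk -v first="$1" -v second="$2" '
    BEGIN {
        # Pass 1: remember the first field of every line from CMD1.
        while ((first | getline) > 0)
            seen[$1]
        close(first)

        # Pass 2: the FNR == 1 || !($1 in x) rule, spelled out by hand
        # because the automatic input loop is not doing the reading.
        while ((second | getline) > 0)
            if (++n == 1 || !($1 in seen))
                print
        close(second)
    }'
}
```

Assuming a Linux "df" as in the post, the equivalent call would be: only_in_second "df -l" "df". Commands containing backslashes would need care, since awk processes escape sequences in -v values.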