From: Ed Morton <mortonspam@gmail.com>
Newsgroups: alt.comp.lang.awk,comp.lang.awk
Subject: Re: printing words without newlines?
Date: Thu, 16 May 2024 08:11:35 -0500
Organization: A noiseless patient Spider
Message-ID: <v250m9$1j3gp$1@dont-email.me>
References: <v1pi7c$2b87j$1@dont-email.me>

On 5/11/2024 11:57 PM, David Chmelik wrote:
> I'm learning more AWK basics and wrote function to read file, sort,
> print. I use GNU AWK (gawk) and its sort but printing is harder to get
> working than anything... separate lines work, but when I use printf() or
> set ORS then use print (for words one line) all awk outputs (on FreeBSD
> UNIX 14 and Slackware GNU/Linux 15) is a space (and not even newline
> before shell prompt)...

Your input file probably has DOS line endings, see
https://stackoverflow.com/questions/45772525/why-does-my-tool-output-overwrite-itself-and-how-do-i-fix-it
for what that means and how to deal with them, but basically either run
`dos2unix` on your file before calling awk or add `sub(/\r$/,"")` as I
show below*.

> is this normal (and I made mistake?) or am I
> approaching it wrong? I recall BASIC prints new lines, but as I learned
> basic C and some derivatives, I'm used to newlines only being specified...
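
To see the DOS line ending problem and the `sub()` fix in isolation, here is a minimal sketch. The function name `strip_cr_demo` and the two-line sample are made up for illustration; the input is generated with `printf` so that each line ends in CR LF.

```shell
# Hypothetical demo: feed awk two CRLF-terminated lines and print the
# second field of each on a single output line.
strip_cr_demo() {
    printf '1 all\r\n2 your\r\n' |
    awk '{
        sub(/\r$/, "")        # strip the trailing CR that DOS files carry
        printf "%s ", $2      # words on one line, as in the original script
    }
    END { printf "\n" }'      # finish the output with a newline
}

strip_cr_demo
```

Without the `sub()` call each word would still end in a CR, which sends the terminal cursor back to the start of the line, so every word overwrites the previous one on screen.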
> ------------------------------------------------------------------------
> # print_file_words.awk
> # pass filename to function
> BEGIN { print_file_words("data.txt"); }
>
> # read two-column array from file and sort lines and print
> function print_file_words(file) {
>   # set record separator then use print
>   # ORS=" "

Move the above to a BEGIN section so it is executed once total instead
of once per input line.

>   while(getline<file) arr[$1]=$0

The above would spin off into an infinite loop if getline failed since
in that case it'd return a negative number which would still evaluate to
"true" when tested as a condition. It needs to be:

    while ( (getline < file) > 0 ) arr[$1] = $0

See http://awk.freeshell.org/AllAboutGetline for that and more info on
using getline.

*This is where you'd strip CRs from the end of input lines. Do either of
these, the first uses a non-POSIX extension function gensub() (which
gawk has), the second would work in any awk:

a) while ( (getline < file) > 0 ) arr[$1] = gensub(/\r$/,"",1)
b) while ( (getline < file) > 0 ) { sub(/\r$/,""); arr[$1] = $0 }

>   PROCINFO["sorted_in"]="@ind_num_asc"
>   for(i in arr)
>   {
>     split(arr[i],arr2)
>     # output all words or on one line with ORS
>     print arr2[2]
>     # output all words on one line without needing ORS
>     #printf("%s ",arr2[2])
>   }

Add `print RS` after the loop if you had set ORS to a blank so the
output ends in a newline and therefore is a valid POSIX text file,
otherwise YMMV with what subsequent text processing tools can do with it.

    Ed.

> }
> ------------------------------------------------------------------------
> # sample data.txt
> 2 your
> 1 all
> 3 base
> 5 belong
> 4 are
> 7 us
> 6 to
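
The fixes discussed above (testing `getline` against `> 0`, stripping a trailing CR, and ending the output with a newline) can be combined into one small script. This is only a sketch under stated assumptions, not the script from the thread: it assumes the first column holds integers, and instead of gawk's `PROCINFO["sorted_in"]` it counts from the lowest to the highest key so that it runs in any POSIX awk. The shell wrapper, the `-v file=...` variable, and the names `words`, `lo`, `hi` and `sep` are illustrative.

```shell
print_file_words() {
    # $1: path to a two-column "number word" file; CRLF endings are tolerated
    awk -v file="$1" '
    BEGIN {
        # "> 0" ends the loop on end-of-file (0) and on a read error (-1)
        while ((getline < file) > 0) {
            sub(/\r$/, "")                  # strip a trailing CR, if any
            key = $1 + 0
            words[key] = $2
            if (n == 0 || key < lo) lo = key
            if (n == 0 || key > hi) hi = key
            n++
        }
        close(file)

        # assumption: integer keys, so counting from lo to hi visits them in order
        sep = ""
        for (i = lo; i <= hi; i++) {
            if (i in words) {
                printf "%s%s", sep, words[i]
                sep = " "
            }
        }
        # printf, not print, so nothing but the newline is appended
        printf "\n"
    }'
}

# usage with a throwaway copy of the sample data, written with DOS line endings
sample=$(mktemp)
printf '2 your\r\n1 all\r\n3 base\r\n5 belong\r\n4 are\r\n7 us\r\n6 to\r\n' > "$sample"
print_file_words "$sample"
rm -f "$sample"
```

With gawk, keeping `PROCINFO["sorted_in"]="@ind_num_asc"` and the `for (i in arr)` loop from the original would do the same job without the integer-key assumption.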