From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.misc
Subject: Re: AWK As A Major Systems Programming Language
Date: 18 Aug 2024 11:07:50 GMT
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
  Distribution through any means other than regular usenet
  channels is forbidden. It is forbidden to publish this
  article in the Web, to change URIs of this article into links,
  and to transfer the body without this notice, but quotations
  of parts in other Usenet posts are allowed.

Ben Collver wrote or quoted:
>AWK As A Major Systems Programming Language

A systems programming language, in my book, is one you can
crank out device drivers in and tap into the platform ABI.

>In retrospect, it seems clear (at least to us!) that there are two
>major reasons that all of the previously mentioned languages have
>enjoyed significant popularity. The first is their extensibility. The
>second is namespace management.

That totally makes me think of the "Zen of Python":

|The Zen of Python, by Tim Peters
|
|Beautiful is better than ugly.
|Explicit is better than implicit.
|Simple is better than complex.
|Complex is better than complicated.
|Flat is better than nested.
|Sparse is better than dense.
|Readability counts.
|Special cases aren't special enough to break the rules.
|Although practicality beats purity.
|Errors should never pass silently.
|Unless explicitly silenced.
|In the face of ambiguity, refuse the temptation to guess.
|There should be one-- and preferably only one --obvious way to do it.
|Although that way may not be obvious at first unless you're Dutch.
|Now is better than never.
|Although never is often better than *right* now.
|If the implementation is hard to explain, it's a bad idea.
|If the implementation is easy to explain, it may be a good idea.
|Namespaces are one honking great idea -- let's do more of those!

.

>I have worked for several years in Python. For string manipulation
>and processing records, you still have to write all the manual stuff:
>open the file, read lines in a loop, split them, etc. Awk does all
>this stuff for me.

On the flip side, you can peep it like this: Python's got a
solid set of statement types you can use for everything,
making the code hella readable. Meanwhile, awk's got its bag
of tricks for special cases like file and string processing.

Just compare [1] with [2].

[1]

#!/usr/bin/awk -f

# This AWK script analyzes a simple CSV file containing book information:
# Title,Author,Year,Price

BEGIN {
    FS = ","
    print "Book Analysis Report"
    print "===================="
}

{
    if (NR > 1) {  # Skip header row
        total_price += $4
        if ($3 < min_year || min_year == 0) min_year = $3
        if ($3 > max_year) max_year = $3
        author_count[$2]++
        year_count[$3]++
    }
}

END {
    print "\nTotal number of books:", NR - 1
    print "Average book price: $" sprintf("%.2f", total_price / (NR - 1))
    print "Year range:", min_year, "to", max_year
    print "\nBooks per author:"
    for (author in author_count)
        print author ":", author_count[author]
    print "\nBooks per year:"
    for (year in year_count)
        print year ":", year_count[year]
}

[2]

#!/usr/bin/env python3

import csv
from dataclasses import dataclass
from collections import Counter
from typing import List, Dict, Tuple

@dataclass
class Book:
    title: str
    author: str
    year: int
    price: float

class BookAnalyzer:
    def __init__(self, books: List[Book]):
        self.books = books

    def total_books(self) -> int:
        return len(self.books)

    def average_price(self) -> float:
        return sum(book.price for book in self.books) / len(self.books)

    def year_range(self) -> Tuple[int, int]:
        years = [book.year for book in self.books]
        return min(years), max(years)

    def books_per_author(self) -> Dict[str, int]:
        return Counter(book.author for book in self.books)

    def books_per_year(self) -> Dict[int, int]:
        return Counter(book.year for book in self.books)

def read_csv(filename: str) -> List[Book]:
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        next(reader)  # Skip header row
        return [Book(title, author, int(year), float(price))
                for title, author, year, price in reader]

def print_report(analyzer: BookAnalyzer) -> None:
    print("Book Analysis Report")
    print("====================")
    print(f"\nTotal number of books: {analyzer.total_books()}")
    print(f"Average book price: ${analyzer.average_price():.2f}")
    min_year, max_year = analyzer.year_range()
    print(f"Year range: {min_year} to {max_year}")
    print("\nBooks per author:")
    for author, count in analyzer.books_per_author().items():
        print(f"{author}: {count}")
    print("\nBooks per year:")
    for year, count in analyzer.books_per_year().items():
        print(f"{year}: {count}")

def main() -> None:
    books = read_csv("books.csv")
    analyzer = BookAnalyzer(books)
    print_report(analyzer)

if __name__ == "__main__":
    main()
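

For a closer side-by-side, here is a rough sketch of the "manual stuff" from the quoted passage (read lines in a loop, split them) written out procedurally in Python, so that it lines up with the BEGIN / per-record / END shape of [1]. The function name `report` and the sample rows are made up for illustration and are not from the original scripts. Like the awk version, it splits on bare commas and would trip over quoted fields.

```python
#!/usr/bin/env python3
# Sketch only: a procedural take on the same report, mirroring the awk
# script's BEGIN / per-record / END structure. The name `report` and the
# SAMPLE data are hypothetical. Assumes at least one data row.
import io
from collections import Counter


def report(lines):
    """Return the report text for an iterable of CSV lines (header first)."""
    out = ["Book Analysis Report", "===================="]   # BEGIN part
    total_price = 0.0
    years = []
    author_count = Counter()
    year_count = Counter()
    for nr, line in enumerate(lines, start=1):               # per-record part
        if nr == 1:                                          # skip header row
            continue
        # Manual split, like awk's FS = ","; no quoted-field handling.
        title, author, year, price = line.rstrip("\n").split(",")
        total_price += float(price)
        years.append(int(year))
        author_count[author] += 1
        year_count[int(year)] += 1
    n = len(years)                                           # END part
    out.append(f"\nTotal number of books: {n}")
    out.append(f"Average book price: ${total_price / n:.2f}")
    out.append(f"Year range: {min(years)} to {max(years)}")
    out.append("\nBooks per author:")
    out.extend(f"{a}: {c}" for a, c in author_count.items())
    out.append("\nBooks per year:")
    out.extend(f"{y}: {c}" for y, c in year_count.items())
    return "\n".join(out)


SAMPLE = """Title,Author,Year,Price
Dune,Frank Herbert,1965,9.99
Emma,Jane Austen,1815,5.01
Persuasion,Jane Austen,1817,6.00
"""

if __name__ == "__main__":
    print(report(io.StringIO(SAMPLE)))
```

To run it against a real file, pass an open file object instead of the StringIO, e.g. `report(open("books.csv"))`. The loop, the header skip and the split are exactly the parts awk does implicitly.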