Path: ...!eternal-september.org!feeder3.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.misc
Subject: Re: AWK As A Major Systems Programming Language
Date: 18 Aug 2024 11:07:50 GMT
Organization: Stefan Ram
Lines: 150
Expires: 1 Jul 2025 11:59:58 GMT
Message-ID: <awk-20240818120618@ram.dialup.fu-berlin.de>
References: <slrnvc2fsg.gg1.bencollver@svadhyaya.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de S4GbjtD4tK9JfGVegxdOaAdV6fZ9E5EDUjuEawOyiCIQdo
Cancel-Lock: sha1:ZUEyygIcayv6ExGeWiQ3CII2wkA= sha256:3y4aqSfEZCz0yVLa+I2zT1HR/NfKUiNptO6OVscJ5ZY=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
	Distribution through any means other than regular usenet
	channels is forbidden. It is forbidden to publish this
	article in the Web, to change URIs of this article into links,
        and to transfer the body without this notice, but quotations
        of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
	services to mirror the article in the web. But the article may
	be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Bytes: 6189

Ben Collver <bencollver@tilde.pink> wrote or quoted:
>AWK As A Major Systems Programming Language

  A systems programming language, in my book, is one you can 
  crank out device drivers in and tap into the platform ABI.

>In retrospect, it seems clear (at least to us!) that there are two
>major reasons that all of the previously mentioned languages have
>enjoyed significant popularity. The first is their extensibility. The
>second is namespace management.

  That totally makes me think of the "Zen of Python":

|The Zen of Python, by Tim Peters
|
|Beautiful is better than ugly.
|Explicit is better than implicit.
|Simple is better than complex.
|Complex is better than complicated.
|Flat is better than nested.
|Sparse is better than dense.
|Readability counts.
|Special cases aren't special enough to break the rules.
|Although practicality beats purity.
|Errors should never pass silently.
|Unless explicitly silenced.
|In the face of ambiguity, refuse the temptation to guess.
|There should be one-- and preferably only one --obvious way to do it.
|Although that way may not be obvious at first unless you're Dutch.
|Now is better than never.
|Although never is often better than *right* now.
|If the implementation is hard to explain, it's a bad idea.
|If the implementation is easy to explain, it may be a good idea.
|Namespaces are one honking great idea -- let's do more of those!
..

>I have worked for several years in Python. For string manipulation
>and processing records, you still have to write all the manual stuff:
>open the file, read lines in a loop, split them, etc. Awk does all
>this stuff for me.

  On the flip side, you can peep it like this: Python's got a solid
  set of statement types you can use for everything, making the code
  hella readable. Meanwhile, awk's got its bag of tricks for special
  cases like file and string processing. Just compare [1] with [2].

  [1]

#!/usr/bin/awk -f

# This AWK script analyzes a simple CSV file containing book information:
# Title,Author,Year,Price

BEGIN {
    FS = ","
    print "Book Analysis Report"
    print "===================="
}

{
    if (NR > 1) {  # Skip header row
        total_price += $4
        if ($3 < min_year || min_year == 0) min_year = $3
        if ($3 > max_year) max_year = $3
        
        author_count[$2]++
        year_count[$3]++
    }
}

END {
    print "\nTotal number of books:", NR - 1
    print "Average book price: $" sprintf("%.2f", total_price / (NR - 1))
    print "Year range:", min_year, "to", max_year
    
    print "\nBooks per author:"
    for (author in author_count)
        print author ":", author_count[author]
    
    print "\nBooks per year:"
    for (year in year_count)
        print year ":", year_count[year]
}

  [2]

#!/usr/bin/env python3

import csv
from dataclasses import dataclass
from collections import Counter
from typing import List, Dict, Tuple

@dataclass
class Book:
    title: str
    author: str
    year: int
    price: float

class BookAnalyzer:
    def __init__(self, books: List[Book]):
        self.books = books

    def total_books(self) -> int:
        return len(self.books)

    def average_price(self) -> float:
        return sum(book.price for book in self.books) / len(self.books)

    def year_range(self) -> Tuple[int, int]:
        years = [book.year for book in self.books]
        return min(years), max(years)

    def books_per_author(self) -> Dict[str, int]:
        return Counter(book.author for book in self.books)

    def books_per_year(self) -> Dict[int, int]:
        return Counter(book.year for book in self.books)

def read_csv(filename: str) -> List[Book]:
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        next(reader)  # Skip header row
        return [Book(title, author, int(year), float(price)) 
                for title, author, year, price in reader]

def print_report(analyzer: BookAnalyzer) -> None:
    print("Book Analysis Report")
    print("====================")
    print(f"\nTotal number of books: {analyzer.total_books()}")
    print(f"Average book price: ${analyzer.average_price():.2f}")
    min_year, max_year = analyzer.year_range()
    print(f"Year range: {min_year} to {max_year}")

    print("\nBooks per author:")
    for author, count in analyzer.books_per_author().items():
        print(f"{author}: {count}")

    print("\nBooks per year:")
    for year, count in analyzer.books_per_year().items():
        print(f"{year}: {count}")

def main() -> None:
    books = read_csv("books.csv")
    analyzer = BookAnalyzer(books)
    print_report(analyzer)

if __name__ == "__main__":
    main()