Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: "B. Pym" <Nobody447095@here-nor-there.org>
Newsgroups: comp.lang.lisp,comp.lang.scheme
Subject: Re: A simple Lisp program -- algorithm and speed issues
Date: Fri, 20 Sep 2024 06:44:24 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <vcj5k6$vu13$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Injection-Date: Fri, 20 Sep 2024 08:44:24 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a0826257f8080eff71f9956e8770cd16";
	logging-data="1046563"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+9pi97Cqblx4jBWC4/4RJt"
User-Agent: XanaNews/1.18.1.6
Cancel-Lock: sha1:26IZD4JyZxtIKuqJKis2eW8AUcg=
Bytes: 2412

Hrvoje Niksic wrote:

> Here is an interesting, not entirely academic problem that me and a
> colleague are "wrestling" with.  Say there is a file, containing
> entries like this:
> 
> foo 5
> bar 20
> baz 4
> foo 6
> foobar 23
> foobar 3
> ...
> 
> There are a lot of lines in the file (~10000), but many of the words
> repeat (there are ~500 unique words).  We have endeavored to write a
> program that would sum the occurrences of each word, and display them


I think he means: sum the numbers associated with the words.


> sorted alphabetically, e.g.:
> 
> bar 20
> baz 4
> foo 11
> foobar 26
> ...



The file contains:

foo 5
bar 20
baz 4
foo 6
foobar 23
foobar 3
bar 68
baz 33

Gauche Scheme

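(use gauche.generator)  ; provides generator-for-each

(define (process file)
  (let1 result '()
    (with-input-from-file file
      ;; cut with no slots yields a thunk; each step reads a word, then its number
      (cut generator-for-each
        (lambda (item)
          (ainc! result (symbol->string item) (read)))
        read))
    (sort result string<? car)))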

(process "output.dat")
  ===>
(("bar" . 88) ("baz" . 37) ("foo" . 11) ("foobar" . 26))

Given:

;; (ainc! alist key [val [func [default]]])
;; Update the entry for key in alist in place, or add one if it is missing.
(define-syntax ainc!
  (syntax-rules ()
    [(_ alist key val func default)
     (let* ((k key)                ; evaluate the key expression only once
            (pair (assoc k alist)))
       (if pair
         (set-cdr! pair (func val (cdr pair)))
         (set! alist (cons (cons k (func val default)) alist))))]
    [(_ alist key val func)
     (ainc! alist key val func 0)]
    [(_ alist key val)
     (ainc! alist key val +)]
    [(_ alist key)
     (ainc! alist key 1)]))
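
The ainc! macro above scans the alist with assoc on every update, so each
line costs time proportional to the number of distinct words seen so far.
With ~500 unique words that is probably tolerable, but if speed matters,
a hash table is the usual alternative. Below is a rough sketch of that
variant, not a tuned solution. The name process/hash is made up for this
example, and it assumes the same input format (a word followed by a
number, both readable with read).

```scheme
;; Sketch only: same task as `process`, but accumulating in a hash table.
;; `process/hash` is a hypothetical name, not part of the original post.
(define (process/hash file)
  (let ((totals (make-hash-table 'string=?)))
    (with-input-from-file file
      (lambda ()
        (let loop ((word (read)))
          (unless (eof-object? word)
            (let ((key (symbol->string word))
                  (n   (read)))
              ;; Add n to the running total; a new word starts from 0.
              (hash-table-put! totals key
                               (+ n (hash-table-get totals key 0))))
            (loop (read))))))
    ;; Convert to an alist and sort by word, as in `process`.
    (sort (hash-table->alist totals) string<? car)))
```

Called as (process/hash "output.dat") on the file shown earlier, this
should yield the same sorted alist as process, since only the
accumulator changes and the final sort is identical.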