From: "B. Pym" <No_spamming@noWhere_7073.org>
Newsgroups: comp.lang.lisp,comp.lang.scheme
Subject: Re: From JoyceUlysses.txt -- words occurring exactly once
Date: Fri, 31 May 2024 10:13:50 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <v3c7st$26biv$1@dont-email.me>
References: <v3ame4$1qf6m$5@dont-email.me>
User-Agent: XanaNews/1.18.1.6

On 5/30/2024, HenHanna wrote:

> i'd not use Gauche for this, but maybe someone can change my mind.
>
> _______________________
> From JoyceUlysses.txt -- words occurring exactly once
>
> Given a text file of a novel (JoyceUlysses.txt) ...
>
> could someone give me a pretty fast (and simple) program that'd give me a list of all words occurring exactly once?
>
> -- Also, a list of words occurring once, twice or 3 times
>
> re: hyphenated words (you can treat it anyway you like)
>
> ideally, i'd treat [editor-in-chief]
> [go-ahead] [pen-knife]
> [know-how] [far-fetched] ...
> as one unit.

Gauche Scheme

(use file.util) ;; file->string
(use srfi-13)   ;; string-tokenize
(use srfi-14)   ;; character sets

;; Word counts, keyed by upcased word.
(define h (make-hash-table 'string=?))

;; A token is a run of letters and hyphens, so hyphenated words stay whole.
(dolist (s (string-tokenize (file->string "Alice.txt")
                            (char-set-adjoin char-set:letter #\-)))
  (hash-table-update! h
                      ;; Strip leading and trailing hyphens before counting.
                      (regexp-replace* (string-upcase s) #/^-+/ "" #/-+$/ "")
                      (pa$ + 1)
                      0))

;; Keep the words that occur fewer than 3 times (once or twice).
(filter (lambda(kv) (< (cdr kv) 3)) (hash-table->alist h))

===>
(("LASTED" . 2) ("WAY--NEVER" . 1) ("VISIT" . 1) ("CHANCED" . 1)
 ("WILDLY" . 2) ("BEHEAD" . 1) ("PROMISE" . 1) ("MEANWHILE" . 1)
 ("ENGAGED" . 1) ("KNIFE" . 2) ("ROARED" . 1) ("RETIRE" . 1)
 ("BLACKING" . 1) ("HATED" . 1) ("BRIGHT-EYED" . 1) ("SHEEP-BELLS" . 1)
 ("PROTECTION" . 1) ("CRIES" . 1) ("ADA" . 1) ("ENJOY" . 1)
 ("WRITHING" . 1) ("RAW" . 1) ("APPEALED" . 1) ("RELIEVED" . 1)
 ("CHILDHOOD" . 1) ("WEPT" . 1) ("RACE-COURSE" . 1) ("THEIRS" . 1)
 ("MAD--AT" . 1) ("SPOKEN" . 1) ("PENCILS" . 1) ("CLEAR" . 2)
 ("TREADING" . 2) ("RETURNED" . 2) ("CHERRY-TART" . 1) ("UNEASY" . 1)
 ("LOW-SPIRITED" . 1) ("BONE" . 1) ("PROMISED" . 1) ("HAPPENING" . 1)
 ("OYSTER" . 1) ("PATIENTLY" . 2) ("NEEDS" . 1) ("LESSON-BOOK" . 1)
 ("PITIED" . 1) ("UNCOMFORTABLY" . 1) ("ANTIPATHIES" . 1) ("PICTURED" . 1)
 ("DESPERATE" . 1) ("ENGRAVED" . 1) ... )
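
The filter in the post uses (< (cdr kv) 3), which keeps words seen once or twice. The question asked for two lists: words occurring exactly once, and words occurring once, twice or 3 times. A minimal sketch of one way to get both from the same kind of table follows. The helper name words-with-count and the small demo table are assumptions for illustration and are not part of the original post.

```scheme
(use srfi-1)   ;; filter (recent Gauche also provides it without this)

;; Hypothetical helper: return the words whose count is between
;; lo and hi, inclusive.
(define (words-with-count table lo hi)
  (map car
       (filter (lambda (kv) (<= lo (cdr kv) hi))
               (hash-table->alist table))))

;; Small demo table built the same way as h in the post.
(define demo (make-hash-table 'string=?))
(for-each (lambda (w) (hash-table-update! demo w (pa$ + 1) 0))
          '("A" "B" "A" "C" "C" "C" "C"))

(words-with-count demo 1 1)   ;; exactly once => ("B")
(words-with-count demo 1 3)   ;; once, twice, or three times: "A" and "B"
```

With the table h from the post, (words-with-count h 1 1) and (words-with-count h 1 3) would give the two lists the question asks for. The order of the results follows hash-table->alist and is not specified.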