Path: ...!2.eu.feeder.erje.net!feeder.erje.net!fu-berlin.de!uni-berlin.de!not-for-mail
From: Thomas Passin <list1@tompassin.net>
Newsgroups: comp.lang.python
Subject: Re: From JoyceUlysses.txt -- words occurring exactly once
Date: Wed, 5 Jun 2024 07:10:19 -0400
Lines: 85
Message-ID: <mailman.93.1717699659.2909.python-list@python.org>
References: <v3am2l$1qf6m$3@dont-email.me>
 <aef0bc5c-b0b6-4d7d-af05-cc22c165f327@DancesWithMice.info>
 <mailman.74.1717103931.2909.python-list@python.org>
 <v3bcgu$229eq$1@dont-email.me>
 <3dedbc3b-7db0-4a39-863f-56324d434b12@DancesWithMice.info>
 <8409fd89-8b42-43c4-8511-704d57b3a4be@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de CKIbykCGcB194PY/GaRrHQ4B2fsnWfg8+/mljxV7F7rA==
Cancel-Lock: sha1:js8ErNhaGGJ+/1i1kOyg6Yqyl8o= sha256:kK4Br7WRlmk28E1oVXktg6z+wQqX6o0TN/5vuwZmo04=
Return-Path: <list1@tompassin.net>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
 reason="2048-bit key; unprotected key"
 header.d=tompassin.net header.i=@tompassin.net header.b=nUnuuDQ1;
 dkim-adsp=pass; dkim-atps=neutral
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <3dedbc3b-7db0-4a39-863f-56324d434b12@DancesWithMice.info>
X-Mailman-Approved-At: Thu, 06 Jun 2024 14:47:38 -0400
X-BeenThere: python-list@python.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: General discussion list for the Python programming language
 <python-list.python.org>
List-Unsubscribe: <https://mail.python.org/mailman/options/python-list>,
 <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive: <https://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-list@python.org>
List-Help: <mailto:python-list-request@python.org?subject=help>
List-Subscribe: <https://mail.python.org/mailman/listinfo/python-list>,
 <mailto:python-list-request@python.org?subject=subscribe>
Bytes: 10335

On 6/5/2024 12:33 AM, dn via Python-list wrote:
> On 31/05/24 14:26, HenHanna via Python-list wrote:
>> On 5/30/2024 2:18 PM, dn wrote:
>>> On 31/05/24 08:03, HenHanna via Python-list wrote:
>>>>
>>>> Given a text file of a novel (JoyceUlysses.txt) ...
>>>>
>>>> could someone give me a pretty fast (and simple) Python program 
>>>> that'd give me a list of all words occurring exactly once?
>>>>
>>>>                -- Also, a list of words occurring once, twice or 3 
>>>> times
>>>>
>>>>
>>>>
>>>> re: hyphenated words        (you can treat it any way you like)
>>>>
>>>>         but ideally, i'd treat  [editor-in-chief]
>>>>                                 [go-ahead]  [pen-knife]
>>>>                                 [know-how]  [far-fetched] ...
>>>>         as one unit.
>>
>>
>>>
>>> Split into words - defined as you will.
>>> Use Counter.
>>>
>>> Show some (of your) code and we'll be happy to critique...
>>
>>
>> hard to decide what to do with hyphens
>>                 and apostrophes
>>               (I'd,  he's,  can't, haven't,  A's  and  B's)
>>
>>
>> 2-step-Process
>>
>>            1. make a file listing all words (one word per line)
>>
>>            2.  then, doing the counting.  using
>>                                from collections import Counter
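
The two-step idea quoted above could be sketched roughly as follows. This is only an illustration, not code posted in the thread: the regular expression is one possible definition of a "word" (ASCII letters, with internal hyphens and apostrophes kept as part of the word), and the names `words`, `words_with_count` and `count_words_in_file` are made up for the example.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters that may contain internal
# hyphens or apostrophes, so "pen-knife" and "can't" each count as one unit.
# Accented letters and typographic apostrophes are NOT handled here.
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def words(text):
    """Step 1: list every word in the text, lower-cased."""
    return [w.lower() for w in WORD_RE.findall(text)]


def words_with_count(counts, lowest, highest=None):
    """Step 2: sorted words whose count lies in lowest..highest (inclusive)."""
    if highest is None:
        highest = lowest
    return sorted(w for w, n in counts.items() if lowest <= n <= highest)


def count_words_in_file(path, encoding="utf-8"):
    """Read a whole text file and count its words (the encoding is a guess)."""
    with open(path, encoding=encoding) as f:
        return Counter(words(f.read()))
```

With the novel saved as JoyceUlysses.txt, `counts = count_words_in_file("JoyceUlysses.txt")` followed by `words_with_count(counts, 1)` would list the words occurring exactly once, and `words_with_count(counts, 1, 3)` those occurring once, twice or three times, under this particular definition of a word.
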
> 
> 
> Apologies for lateness - only just able to come back to this.
> 
> This issue is not Python, and is not solved by code!
> 
> If you/your teacher can't define a "word", the code, any code, will 
> almost-certainly be wrong!
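
To make that point concrete, here is a small illustrative sketch. The sample sentence and both definitions of a "word" are assumptions chosen for the example; neither definition is claimed to be the right one. The same text gives two different "occurs exactly once" lists.

```python
import re
from collections import Counter

# Two hypothetical definitions of a "word".
LETTERS_ONLY = re.compile(r"[a-z]+")                  # splits at - and '
WITH_JOINERS = re.compile(r"[a-z]+(?:['-][a-z]+)*")   # keeps pen-knife, can't


def once_only(text, pattern):
    """Return the sorted words that occur exactly once under `pattern`."""
    counts = Counter(pattern.findall(text.lower()))
    return sorted(w for w, n in counts.items() if n == 1)


sample = "He can't find the pen-knife; the pen is lost, he can see."

print(once_only(sample, LETTERS_ONLY))
print(once_only(sample, WITH_JOINERS))
```

Under the first definition "knife" and the stray "t" appear as once-only words; under the second, "pen-knife" and "can't" do. Which list is "correct" depends entirely on the specification.
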
> 
> 
> One of the interesting aspects of our work is that we can write all 
> manner of tests to try to ensure that the code is correct: unit tests, 
> integration tests, system tests, acceptance tests, eye-tests, ...
> 
> However, there is no such thing as a test (or proof) that statements of 
> requirements are complete or correct!
> (nor for any other previous stages of the full project life-cycle)
> 
> As coders we need to learn to require clear specifications and not 
> attempt to read-between-the-lines, use our initiative, or otherwise 'not 
> bother the ...'. When there is ambiguity, we should go back to the 
> user/client/boss and seek clarification. They are the 
> domain/subject-matter experts...
> 
> I'm reminded of a cartoon, possibly from some IBM source, first seen in 
> black-and-white but here in living-color: 
> https://www.monolithic.org/blogs/presidents-sphere/what-the-customer-really-wants

That one's been kicking around for years ... good job in finding a link 
for it!

> That has been the sad history of programming and dev.projects - wherein 
> we are blamed for every short-coming, because no-one else understands 
> the nuances of development projects.
========== REMAINDER OF ARTICLE TRUNCATED ==========