From: dn <PythonList@DancesWithMice.info>
Newsgroups: comp.lang.python
Subject: Re: From JoyceUlysses.txt -- words occurring exactly once
Date: Fri, 31 May 2024 09:18:44 +1200
Organization: DWM
Message-ID: <mailman.74.1717103931.2909.python-list@python.org>
References: <v3am2l$1qf6m$3@dont-email.me> <aef0bc5c-b0b6-4d7d-af05-cc22c165f327@DancesWithMice.info>

On 31/05/24 08:03, HenHanna via Python-list wrote:
>
> Given a text file of a novel (JoyceUlysses.txt) ...
>
> could someone give me a pretty fast (and simple) Python program that'd
> give me a list of all words occurring exactly once?
>
>     -- Also, a list of words occurring once, twice or 3 times
>
>
> re: hyphenated words (you can treat it anyway you like)
>
>     but ideally, i'd treat [editor-in-chief]
>                            [go-ahead] [pen-knife]
>                            [know-how] [far-fetched] ...
>     as one unit.

Did you mention the pay-rate for this work?

Split into words - defined as you will.
Use Counter.

Show some (of your) code and we'll be happy to critique...

-- 
Regards,
=dn
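
For anyone following along, here is a minimal sketch of the "split into words, use Counter" suggestion. It is not code from the thread: the regex, the names count_words, words_occurring and load_counts, and the choice to lower-case everything are all assumptions made for illustration. In particular, a "word" is taken to be a run of ASCII letters that may contain internal hyphens or apostrophes, so [editor-in-chief] stays one unit; accented letters would need a broader pattern.

```python
import re
from collections import Counter

# Assumption: a word is a run of ASCII letters, optionally joined by
# internal hyphens or apostrophes ("editor-in-chief", "don't").
WORD_RE = re.compile(r"[a-z]+(?:[-'][a-z]+)*")


def count_words(text):
    """Return a Counter mapping each lower-cased word to its frequency."""
    return Counter(WORD_RE.findall(text.lower()))


def words_occurring(counts, low=1, high=1):
    """Return the sorted words whose count lies between low and high inclusive."""
    return sorted(word for word, n in counts.items() if low <= n <= high)


def load_counts(path):
    """Count the words in a text file (assumed here to be UTF-8 encoded)."""
    with open(path, encoding="utf-8") as f:
        return count_words(f.read())
```

With the novel saved locally, words_occurring(load_counts("JoyceUlysses.txt")) would give the words occurring exactly once, and words_occurring(counts, 1, 3) those occurring once, twice or 3 times. Counter does a single pass over the token list, which should be fast enough for a text of this size.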