From: "Peter J. Holzer" <hjp-python@hjp.at>
Newsgroups: comp.lang.python
Subject: Re: From JoyceUlysses.txt -- words occurring exactly once
Date: Sat, 1 Jun 2024 10:04:29 +0200
Message-ID: <mailman.78.1717229487.2909.python-list@python.org>
In-Reply-To: <v3bcgu$229eq$1@dont-email.me>

On 2024-05-30 19:26:37 -0700, HenHanna via Python-list wrote:
> hard to decide what to do with hyphens
> and apostrophes
> (I'd, he's, can't, haven't, A's and B's)

Especially since the same character is used as both an apostrophe and a
closing quotation mark. And while that's pretty unambiguous between two
characters, it isn't at the end of a word:

    This is Alex’ house.
    This type of building is called an ‘Alex’ house.
    The sentence ‘We are meeting at Alex’ house’ contains an apostrophe.

(Using proper Unicode quotation marks. It gets worse if you stick to
ASCII.)

Personally I like to use U+0027 APOSTROPHE as an apostrophe and U+2018
LEFT SINGLE QUOTATION MARK and U+2019 RIGHT SINGLE QUOTATION MARK as
single quotation marks[1], but despite the suggestive names, this is not
the common typographical convention, so your texts are unlikely to make
this distinction.

        hp

[1] Which I use rarely, anyway.

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
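
To make the ambiguity concrete, here is a minimal Python sketch. It is not from the thread: the `WORD_RE` pattern and the `words` helper are hypothetical names, and the rule "an apostrophe only counts when it sits between two letters" is an assumption chosen for illustration, not a recommended tokenizer.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by a single
# apostrophe-like character (U+0027 or U+2019) with a letter on both sides.
# [^\W\d_] matches any Unicode letter (word characters minus digits and "_").
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")


def words(text):
    """Return the lowercased words of text.

    A U+0027 or U+2019 at the end of a word is never kept, because at that
    position it cannot be told apart from a closing quotation mark.
    """
    return [m.group(0).lower() for m in WORD_RE.finditer(text)]


samples = [
    "This is Alex\u2019 house.",                       # possessive apostrophe
    "This type of building is called an \u2018Alex\u2019 house.",  # quotes
    "I\u2019d say he\u2019s right, but they can\u2019t.",          # word-internal
]
for sample in samples:
    print(words(sample))
```

With this rule the word-internal cases (I’d, he’s, can’t) survive as single words, but both "Alex" sentences produce the same token, so the possessive apostrophe in the first one is silently lost. A pattern that kept a trailing U+2019 would make the opposite mistake on the quoted "Alex".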