From: Thomas Passin <list1@tompassin.net>
Newsgroups: comp.lang.python
Subject: Re: Script stops running with no error
Date: Wed, 28 Aug 2024 18:32:16 -0400
Message-ID: <mailman.13.1724884345.2917.python-list@python.org>
References: <87r0a8xskb.fsf@rpi3> <bb82f035-45dc-4c6f-aaec-b1e59ce825f7@tompassin.net>

On 8/28/2024 5:09 PM, Daniel via Python-list wrote:
> As you all have seen on my intro post, I am in a project using Python
> (which I'm learning as I go) using the wikimedia API to pull data from
> wiktionary.org. I want to parse the json and output, for now, just the
> definition of the word.
>
> Wiktionary is wikimedia's dictionary.
>
> My requirements for v1
>
> Query the api for the definition for table (in the python script).
> Pull the proper json
> Parse the json
> output the definition only
>
> What's happening?
>
> I run the script and, maybe I don't know shit from shinola, but it
> appears I composed it properly. I wrote the script to do the above.
> The wiktionary json file denotes a list with this character # and
> sublists as ## but numbers them
>
> On Wiktionary, the definitions are denoted like:
>
> 1. blablabla
>    1. blablabla
>    2. blablablablabla
> 2. balbalbla
> 3. blablabla
>    1. blablabla
>
>
> I wrote my script to alter it so that the sublist are letters
>
> 1. blablabla
>    a. blablabla
>    b. blablabla
> 2. blablabla and so on
> /snip
>
> At this point, the script stops after it assesses the first line_counter
> and sub_counter. The code is below, please tell me which stupid mistake
> I made (I'm sure it's simple).
>
> Am I making a bad approach? Is there an easier method of parsing json
> than the way I'm doing it? I'm all ears.
>
> Be kind, i'm really new at python. Environment is emacs.
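
The renumbering described in the quoted text (top-level "#" items become 1., 2., 3. and "##" sub-items become a., b., c.) can be sketched on its own, apart from any network access. This is only an illustration under stated assumptions, not the poster's code: the function name renumber and its counters are hypothetical, and it assumes each list item sits on its own line and starts with "#" or "##" only. Deeper markers such as "###" or "#:" are not handled.

```python
import string


def renumber(lines):
    """Turn '#' / '##' list markers into '1.' / 'a.' style numbering.

    Hypothetical helper: assumes one list item per line and only
    two nesting levels ('#' and '##').
    """
    out = []
    top = 0  # counter for '#' items
    sub = 0  # counter for '##' items under the current '#' item
    for line in lines:
        if line.startswith('##'):
            # Sub-item: letter it, wrapping around after 'z'.
            letter = string.ascii_lowercase[sub % 26]
            sub += 1
            out.append(f'   {letter}. {line[2:].strip()}')
        elif line.startswith('#'):
            # Top-level item: number it and restart the sub-item letters.
            top += 1
            sub = 0
            out.append(f'{top}. {line[1:].strip()}')
        else:
            # Anything else passes through unchanged.
            out.append(line)
    return out


sample = ['# blablabla', '## blablabla', '## blablabla', '# blablabla']
print('\n'.join(renumber(sample)))
```

Keeping the two counters in one loop, and resetting the sub-item counter whenever a new top-level item starts, makes it easy to print the counters at each step and see exactly where processing stops.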
>
> import requests
> import re
>
> search_url = 'https://api.wikimedia.org/core/v1/wiktionary/en/search/page'
> search_query = 'table'
> parameters = {'q': search_query}
>
> response = requests.get(search_url, params=parameters)
> data = response.json()
>
> page_id = None
>
> if 'pages' in data:
>     for page in data['pages']:
>         title = page.get('title', '').lower()
>         if title == search_query.lower():
>             page_id = page.get('id')
>             break
>
> if page_id:
>     content_url = f'https://api.wikimedia.org/core/v1/wiktionary/en/page/{search_query}'
>     response = requests.get(content_url)
>     page_data = response.json()
>     if 'source' in page_data:
>         content = page_data['source']
>         cases = {'noun': r'\{en-noun\}(.*?)(?=\{|\Z)',
>                  'verb': r'\{en-verb\}(.*?)(?=\{|\Z)',
>                  'adjective': r'\{en-adj\}(.*?)(?=\{|\Z)',
>                  'adverb': r'\{en-adv\}(.*?)(?=\{|\Z)',
>                  'preposition': r'\{en-prep\}(.*?)(?=\{|\Z)',
>                  'conjunction': r'\{en-con\}(.*?)(?=\{|\Z)',
>                  'interjection': r'\{en-intj\}(.*?)(?=\{|\Z)',

========== REMAINDER OF ARTICLE TRUNCATED ==========
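
The patterns in the quoted cases dictionary all have the same shape, so it may help to see how one of them behaves on multi-line text. The sketch below is an isolated experiment, not a diagnosis: the sample string is made up, and because the article is truncated it is not known which flags the original script passes to re. It shows only that "." does not match a newline unless re.DOTALL is given, so the lazy group (.*?) cannot cross a line break to reach the next "{".

```python
import re

# Pattern copied from the quoted 'noun' entry.
pattern = r'\{en-noun\}(.*?)(?=\{|\Z)'

# Made-up sample text (an assumption, not real API output).
text = '{en-noun}\n# Furniture with a flat top.\n{en-verb}\n# To postpone.'

# Default flags: '.' stops at the newline right after '{en-noun}',
# and the lookahead finds neither '{' nor the end of the string there.
match_plain = re.search(pattern, text)
print(match_plain)

# With re.DOTALL, '.' also matches newlines, so the lazy group runs
# up to the next '{' (or to the end of the string via \Z).
match_dotall = re.search(pattern, text, re.DOTALL)
print(repr(match_dotall.group(1)))
```

When a search returns None and the surrounding code only acts on a match, nothing is printed and no exception is raised, which looks like a script that stops with no error. Printing the result of each search is a cheap way to check whether that is what happens.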