Article <mailman.13.1724884345.2917.python-list@python.org>

Path: ...!fu-berlin.de!uni-berlin.de!not-for-mail
From: Thomas Passin <list1@tompassin.net>
Newsgroups: comp.lang.python
Subject: Re: Script stops running with no error
Date: Wed, 28 Aug 2024 18:32:16 -0400
Lines: 139
Message-ID: <mailman.13.1724884345.2917.python-list@python.org>
References: <87r0a8xskb.fsf@rpi3>
 <bb82f035-45dc-4c6f-aaec-b1e59ce825f7@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Cancel-Lock: sha1:sC2Jd4DACqLIXh6frAUwd7PvCN8= sha256:qwL5WuwwaSWBnNBDD10HjryqzLvVxY5xHWv9CTO8ASM=
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <87r0a8xskb.fsf@rpi3>
Precedence: list
Bytes: 11885

On 8/28/2024 5:09 PM, Daniel via Python-list wrote:
> As you all have seen on my intro post, I am in a project using Python
> (which I'm learning as I go) using the wikimedia API to pull data from
> wiktionary.org. I want to parse the json and output, for now, just the
> definition of the word.
> 
> Wiktionary is wikimedia's dictionary.
> 
> My requirements for v1
> 
> Query the api for the definition for table (in the python script).
> Pull the proper json
> Parse the json
> output the definition only
> 
> What's happening?
> 
> I run the script and, maybe I don't know shit from shinola, but it
> appears I composed it properly. I wrote the script to do the above.
> The Wiktionary JSON marks list items with the character # and sublist
> items with ##, but the site numbers both levels.
> 
> On Wiktionary, the definitions are denoted like:
> 
> 1. blablabla
>      1. blablabla
>      2. blablablablabla
> 2. balbalbla
> 3. blablabla
>     1. blablabla
> 
> 
> I wrote my script to alter it so that the sublists use letters
> 
> 1. blablabla
>     a. blablabla
>     b. blablabla
> 2. blablabla and so on
> /snip
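
The renumbering described above can be sketched roughly as follows. This is only an illustration, not the original script: the function name renumber and the sample input are made up, and it assumes every list line starts with a plain # or ## (nothing fancier such as #: or ###). The counter names line_counter and sub_counter are borrowed from the description of the script.

```python
import string


def renumber(lines):
    """Turn '#' / '##' list markers into '1.' / 'a.' style labels.

    Hypothetical helper for illustration; assumes only two nesting levels.
    """
    out = []
    line_counter = 0  # number of the current top-level item
    sub_counter = 0   # position in a, b, c, ... within the current sublist
    for line in lines:
        if line.startswith('##'):
            # Sublist item: label with a letter, wrapping after 'z'.
            letter = string.ascii_lowercase[sub_counter % 26]
            sub_counter += 1
            out.append(f'    {letter}. {line[2:].strip()}')
        elif line.startswith('#'):
            # Top-level item: bump the number and restart the lettering.
            line_counter += 1
            sub_counter = 0
            out.append(f'{line_counter}. {line[1:].strip()}')
        else:
            # Not a list line; pass it through unchanged.
            out.append(line)
    return out


sample = ['# first', '## sub one', '## sub two', '# second', '## sub three']
print('\n'.join(renumber(sample)))
```

Checking for ## before # matters here, since every ## line also starts with #.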
> 
> At this point, the script stops after it assesses the first line_counter
> and sub_counter. The code is below, please tell me which stupid mistake
> I made (I'm sure it's simple).
> 
> Am I taking a bad approach? Is there an easier method of parsing JSON
> than the way I'm doing it? I'm all ears.
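
One general way a script like this can finish without printing anything is that a condition quietly evaluates false and there is no else branch to say so. A small sketch of the title lookup with an explicit "nothing found" path is below. It is an assumption about what might help, not a diagnosis of the script: find_page_id and the sample dictionary are hypothetical, and the sample only imitates the shape ('pages', 'title', 'id') that the script expects from the search response.

```python
def find_page_id(data, query):
    """Return the id of the page whose title equals query, ignoring case.

    Returns None when there is no 'pages' key or no title matches.
    """
    for page in data.get('pages', []):
        if page.get('title', '').lower() == query.lower():
            return page.get('id')
    return None


# Hypothetical data shaped like the search response the script expects.
sample = {'pages': [{'id': 7, 'title': 'Table'}, {'id': 9, 'title': 'tablet'}]}

page_id = find_page_id(sample, 'table')
if page_id is None:
    # Say so explicitly instead of falling off the end of the script.
    print('no page with a matching title; nothing to parse')
else:
    print('matched page id', page_id)
```

Comparing against None rather than testing truthiness avoids treating an id of 0 as "not found".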
> 
> Be kind, I'm really new at Python. Environment is Emacs.
> 
> import requests
> import re
> 
> search_url = 'https://api.wikimedia.org/core/v1/wiktionary/en/search/page'
> search_query = 'table'
> parameters = {'q': search_query}
> 
> response = requests.get(search_url, params=parameters)
> data = response.json()
> 
> page_id = None
> 
> if 'pages' in data:
>      for page in data['pages']:
>          title = page.get('title', '').lower()
>          if title == search_query.lower():
>              page_id = page.get('id')
>              break
> 
> if page_id:
>      content_url = f'https://api.wikimedia.org/core/v1/wiktionary/en/page/{search_query}'
>      response = requests.get(content_url)
>      page_data = response.json()
>      if 'source' in page_data:
>          content = page_data['source']
>          cases = {'noun': r'\{en-noun\}(.*?)(?=\{|\Z)',
>                   'verb': r'\{en-verb\}(.*?)(?=\{|\Z)',
>                   'adjective': r'\{en-adj\}(.*?)(?=\{|\Z)',
>                   'adverb': r'\{en-adv\}(.*?)(?=\{|\Z)',
>                   'preposition': r'\{en-prep\}(.*?)(?=\{|\Z)',
>                   'conjunction': r'\{en-con\}(.*?)(?=\{|\Z)',
>                   'interjection': r'\{en-intj\}(.*?)(?=\{|\Z)',
========== REMAINDER OF ARTICLE TRUNCATED ==========