Path: ...!npeer.as286.net!npeer-ng0.as286.net!fu-berlin.de!uni-berlin.de!not-for-mail
From: dn <PythonList@DancesWithMice.info>
Newsgroups: comp.lang.python
Subject: Re: Script stops running with no error
Date: Thu, 29 Aug 2024 12:07:07 +1200
Organization: DWM
Lines: 136
Message-ID: <mailman.14.1724890041.2917.python-list@python.org>
References: <87r0a8xskb.fsf@rpi3>
 <bb82f035-45dc-4c6f-aaec-b1e59ce825f7@tompassin.net>
 <0fec5175-e2a2-407a-9e09-c6901617b75c@DancesWithMice.info>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Cancel-Lock: sha1:85XVYQws5nLDRmkuXwNI6MAmGmo= sha256:Fq1f7XMbcD0QD2DxHt7yPNiua53olQINvUcWQOdH/sA=
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <bb82f035-45dc-4c6f-aaec-b1e59ce825f7@tompassin.net>
Precedence: list
Bytes: 10901

On 29/08/24 10:32, Thomas Passin via Python-list wrote:
> On 8/28/2024 5:09 PM, Daniel via Python-list wrote:
>> As you all have seen on my intro post, I am in a project using Python
>> (which I'm learning as I go) using the wikimedia API to pull data from
>> wiktionary.org. I want to parse the json and output, for now, just the
>> definition of the word.
>>
>> Wiktionary is wikimedia's dictionary.
>>
>> My requirements for v1
>>
>> Query the api for the definition for table (in the python script).
>> Pull the proper json
>> Parse the json
>> output the definition only


> You need to check at each part of the code to see if you are getting or 
> producing what you think you are.  You also should create a text 
> constant containing the JSON input you expect to get.  Make sure you can 
> process that.  Start simple - one main item.  Then two main items.  Then 
> two main items with one sub item.  And so on.
> 
> I'm not sure what you want to produce in the end but this seems awfully 
> complex to be starting with.  Also you aren't taking advantage of the 
> structure inherent in the JSON.  If the data response isn't too big, you 
> can probably take it as is and use the Python JSON reader to produce a 
> Python data structure.  It should be much easier (and faster) to process 
> the data structure than to repeatedly scan all those lines of data with 
> regexes.


Good effort so far!


Further to @Thomas: the code does seem to be taking the long way around! 
How can we illustrate that, and improve life?


The Wikimedia docs at https://developer.wikimedia.org/use-content/ 
discuss how to use their "Developer Portal". Worth reading!

As part of the above, we find the "API:Data formats" page 
(https://www.mediawiki.org/wiki/API:Data_formats) which offers a simple 
example (simpler than your objectives):

api.php?action=query&titles=Main%20page&format=json

which produces:

{
   "query": {
     "pages": {
       "217225": {
         "pageid": 217225,
         "ns": 0,
         "title": "Main page"
       }
     }
   }
}

Does this look like the output of a Python dict[ionary] to you?

It very nearly is (more discussion at the web.ref): the payload is JSON 
text, which decodes directly into nested Python dicts.
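
To see that for yourself, here is a minimal sketch using only the 
standard library's json module. The text constant is the sample response 
from the "API:Data formats" page quoted above; the names SAMPLE_JSON and 
page_data are mine, chosen for illustration, not taken from your script.

```python
import json

# Sample payload copied from the "API:Data formats" example above.
# SAMPLE_JSON is an illustrative name, not one from the original script.
SAMPLE_JSON = """
{
  "query": {
    "pages": {
      "217225": {
        "pageid": 217225,
        "ns": 0,
        "title": "Main page"
      }
    }
  }
}
"""

# json.loads() turns the JSON text into ordinary Python dicts/ints/strs.
page_data = json.loads(SAMPLE_JSON)

print(type(page_data))
print(page_data["query"]["pages"]["217225"]["title"])
```

If the response comes from requests, its response.json() method performs 
the same decoding, so there should be no need to scan the text yourself.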

There are various ways of dealing with JSON-formatted data. You're 
already using requests. Perhaps leave such research until later.


So, as soon as "page_data" is realised from "response", print() it (per 
above: make sure you're actually seeing what you're expecting to see). 
Computers have this literal habit of doing what we ask, not what we want!

PS the pprint/pretty printer library offers a neater way of outputting a 
"nested" data-structure (https://docs.python.org/3/library/pprint.html).


Thereafter, make as much use of the returned dict/list structure as you 
can. At each stage of the 'drilling-down' process, again, print() it (to 
make sure ...)
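
As a sketch of what that drilling-down might look like, one 'layer' at a 
time, with a print() at each step. The keys ("query", "pages", "title") 
come from the sample response above; a real Wiktionary reply for your 
query may well have different or deeper layers, so treat this as a 
pattern rather than the answer.

```python
# Hand-typed stand-in for the decoded API response (the sample above).
page_data = {
    "query": {
        "pages": {
            "217225": {"pageid": 217225, "ns": 0, "title": "Main page"},
        }
    }
}

# Layer 1: the top-level "query" dict.
query = page_data["query"]
print("query keys:", list(query))

# Layer 2: the "pages" dict, keyed by page-id (as a string).
pages = query["pages"]
print("page ids:", list(pages))

# Layer 3: each individual page record.
for page_id, page in pages.items():
    print(page_id, "->", page["title"])

# Collect just the part we care about.
titles = [page["title"] for page in pages.values()]
print(titles)
```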


In this way the code will step-through the various 'layers' of 
data-organisation. That observation and stepping-through of 'layers' is 
a hint that the code should (probably) also be organised by 'layer'! For 
example, the first for-loop finds a page which matches the search-key. 
This could be abstracted into a (well-named) function.

Thus, you can write a test-harness which provides the function with some 
sample input (which you know from earlier print-outs!) and can ensure 
(with yet another print()) that the returned-result is as-expected!

NB the test-data and check-print() should be outside the function. 
Please take these steps as-read or as 'rules'. Once your skills expand, 
you will likely become ready to learn about unit-testing, pytest, etc. 
At which time, such ideas will 'fall into place'.
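
By way of illustration only, a sketch of such a function plus its 
test-harness. The name find_page(), and the assumption that the 
search-key is matched against each page's "title", are mine; I haven't 
assumed anything else about your code. Note that the sample data and the 
check-print()s sit outside the function.

```python
def find_page(page_data, title):
    """Return the first page record whose title matches, else None.

    find_page is a hypothetical name; it assumes the response layout
    shown in the "API:Data formats" sample (query -> pages -> record).
    """
    pages = page_data.get("query", {}).get("pages", {})
    for page in pages.values():
        if page.get("title") == title:
            return page
    return None


# --- test-harness: sample input and checks live outside the function ---

# Sample input, known from the earlier print-outs.
SAMPLE = {
    "query": {
        "pages": {
            "217225": {"pageid": 217225, "ns": 0, "title": "Main page"},
        }
    }
}

# A search-key that should be found...
result = find_page(SAMPLE, "Main page")
print(result)

# ...and one that should not.
missing = find_page(SAMPLE, "No such page")
print(missing)
```

Once the function behaves with known input, it can be fed the live 
response with some confidence.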


BTW/whilst that 'unit' is in-focus: how many times will the current code 
========== REMAINDER OF ARTICLE TRUNCATED ==========