Path: ...!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.c++,comp.lang.c
Subject: Re: Threads across programming languages
Date: 30 Apr 2024 09:04:48 GMT
Organization: Stefan Ram
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Content-Language: en-US

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
>The GIL only prevents multiple Python statements from being
>interpreted simultaneously, but if you're waiting on inputs (like
>sockets), it's not active, so that could be distributed across
>multiple cores.

Disclaimer: This is not on-topic here, as it discusses Python,
not C or C++.

FWIW, here's some multithreaded Python code modeled after what I
use in an application. I am using Python to prepare a press review
for me: it gets article headlines from several news sites, removes
all headlines matching a list of regexps, and integrates everything
into a single HTML resource. (I do not like to read about Lindsay
Lohan, for example, so articles containing the text "Lindsay Lohan"
will not show up in my HTML review.)
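The filtering step could be sketched as follows. The pattern list
and the "keep" function are hypothetical names for illustration,
not taken from the real program:

```python
import re

# Hypothetical blocklist; the real list is whatever the reader
# does not want to see.
blocked = [ re.compile( p ) for p in ( r"Lindsay Lohan", r"\bcelebrity\b" )]

def keep( headline ):
    """True unless the headline matches any blocked pattern."""
    return not any( p.search( headline ) for p in blocked )

headlines = [ "New C++ standard library features",
              "Lindsay Lohan spotted downtown" ]

# Only headlines that match no blocked pattern survive.
review = [ h for h in headlines if keep( h )]
```

Here, "review" keeps only the first headline.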
I'm usually downloading all pages at once using Python threads,
which makes sure that one thread uses the CPU while another thread
is waiting for TCP/IP data. This is the code, taken from my Python
program and a bit simplified:

from multiprocessing.dummy import Pool

....

with Pool( 9 if fast_internet else 1 )as pool:
    for i in range( 9 ):
        content[ i ] = pool.apply_async( fetch,[ uris[ i ] ])
    pool.close()
    pool.join()

. I'm using my "fetch" function to fetch a single URI, and the
loop starts nine threads within a thread pool to fetch the content
of those nine URIs "in parallel". This is observably faster than
corresponding sequential code. (However, sometimes I have a slow
connection and have to download sequentially in order not to
overload the slow connection, which would result in stalled
downloads. To accomplish this, I just change the "9" to "1" in
the first line above.)

In case you wonder about the "dummy":

|The multiprocessing.dummy module provides a wrapper
|for the multiprocessing module, except implemented using
|thread-based concurrency.
|
|It provides a drop-in replacement for multiprocessing,
|allowing a program that uses the multiprocessing API to
|switch to threads with a single change to import statements.

. So, this is an area where multithreading the Python way is easy
to use and enhances performance even in the presence of the GIL!
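For readers who want to run the pattern end to end, here is a
self-contained sketch. The "fetch" stub, the example URIs, and
the sleep are my stand-ins, not parts of the original program.
Note that apply_async returns AsyncResult objects right away;
the actual return values are read back later with ".get()":

```python
from multiprocessing.dummy import Pool  # thread-based, despite the name
import time

def fetch( uri ):
    # Stand-in for a real download: the sleep imitates waiting on
    # TCP/IP data, during which the GIL is released, so other
    # threads can run.
    time.sleep( 0.05 )
    return "content of " + uri

uris = [ "https://example.invalid/page%d" % i for i in range( 9 )]

with Pool( 9 ) as pool:
    # apply_async returns immediately with an AsyncResult for
    # each URI; the nine fetches then run "in parallel".
    results = [ pool.apply_async( fetch, [ uri ] ) for uri in uris ]
    pool.close()
    pool.join()

# After join, every result is ready; .get() yields the value.
content = [ r.get() for r in results ]
```

Changing "Pool( 9 )" to "Pool( 1 )" serializes the downloads, as
described above for slow connections.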