Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!news.szaf.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail
From: Mild Shock <janburse@fastmail.fm>
Newsgroups: comp.lang.python
Subject: What does the Async Detour usually cost (Was: What does stats = await
 asyncio.to_thread(os.stat, url) do?)
Date: Tue, 24 Jun 2025 00:42:14 +0200
Message-ID: <103cl86$16hvn$1@solani.org>
References: <102isqb$3v5j0$2@dont-email.me> <102kp89$q8c6$1@solani.org>
 <103bdq4$15ut7$1@solani.org> <103cklq$16hos$1@solani.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 23 Jun 2025 22:42:14 -0000 (UTC)
Injection-Info: solani.org;
	logging-data="1263607"; mail-complaints-to="abuse@news.solani.org"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101
 Firefox/128.0 SeaMonkey/2.53.21
Cancel-Lock: sha1:VJP1K2nx9KLNsIY2Je6796ZA9uE=
In-Reply-To: <103cklq$16hos$1@solani.org>
X-User-ID: eJwFwYEBwCAIA7CXYNAq5wjW/09YgqBzVhJMPLyDUjnSxmSnFZysZffr0aoPOrsrd1x/JtHhweq5OZpKxQ9VTxXG

Hi,

I have some data on what the Async Detour usually
costs. I just compared with another Java Prolog
that doesn't do the thread detour.

Reported measurement with the async Java Prolog:

 > JDK 24: 50 ms (using Threads, not yet VirtualThreads)

An additional measurement with an alternative Java Prolog:

JDK 24: 30 ms (no Threads)

But the version using Threads is already quite optimized:
it basically reuses its own thread and uses a mutex
somewhere, so it doesn't really create a new secondary
thread unless a new task is spawned. Creating a 2nd thread
is silly if tasks have their own thread. This is the
main potential of virtual threads in upcoming Java:
just run tasks inside virtual threads.
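
On the Python side, asyncio.to_thread() similarly hands the call
to the event loop's default ThreadPoolExecutor, which reuses its
worker threads. A minimal sketch of the two variants, assuming
paths is any list of roughly 50 existing file paths:

import asyncio
import os
from concurrent.futures import ThreadPoolExecutor

async def stat_to_thread(paths):
    # default executor: worker threads are created lazily and reused
    return [await asyncio.to_thread(os.stat, p) for p in paths]

async def stat_single_worker(paths):
    # one long-lived worker thread, roughly the "reuse its own
    # thread plus a mutex" scheme described above
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        return [await loop.run_in_executor(pool, os.stat, p)
                for p in paths]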

Bye

P.S.: But I should measure with more files, since
the 50 ms and 30 ms are quite small. Also I am using a
warm run, so the files and their meta information are already
cached in operating system memory. I am trying to measure
only the async overhead, but maybe Python doesn't trust
the operating system cache and calls some disk
sync somewhere. I don't know. I don't open and close the
files, and I don't call any disk syncing. I only read the
stats to get mtime and do some comparisons.
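
To quantify just that detour, here is a minimal sketch of the
comparison I have in mind: the same warm run over the same files,
once with plain os.stat() and once through asyncio.to_thread().
The file list below is only a placeholder; any ~50 files will do:

import asyncio
import os
import time

# placeholder: any ~50 existing files, e.g. the current directory
paths = [e.path for e in os.scandir(".") if e.is_file()][:50]

def stat_plain():
    for p in paths:
        os.stat(p)                           # no async detour

async def stat_detour():
    for p in paths:
        await asyncio.to_thread(os.stat, p)  # via a secondary thread

t0 = time.perf_counter()
stat_plain()
t1 = time.perf_counter()
asyncio.run(stat_detour())
t2 = time.perf_counter()
print("plain os.stat:", round((t1 - t0) * 1000, 1), "ms")
print("async detour :", round((t2 - t1) * 1000, 1), "ms")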

Mild Shock wrote:
> So what does:
> 
> stats = await asyncio.to_thread(os.stat, url)
> 
> Well, it calls in a separate new secondary thread:
> 
> os.stat(url)
> 
> It happens that url is only a file path, and
> the file path points to an existing file. So the
> secondary thread computes the stats and terminates,
> 
> and the async framework hands the stats back to
> the main thread that did the await, and the main
> thread stops its waiting and continues to run
> 
> cooperatively with the other tasks in the current
> event loop. The test case measures the wall time.
> The results are:
> 
>  > node.js: 10 ms (usual Promises and stuff)
>  > JDK 24: 50 ms (using Threads, not yet VirtualThreads)
>  > pypy: 2000 ms
> 
> I am only using one main task, sequentially on
> such await calls, with a couple of files, not
> more than 50 files.
> 
> I could compare with removing the async detour,
> to quantify the async I/O detour overhead.
> 
> Mild Shock wrote:
>> Hi,
>>
>> async I/O in Python is extremely disappointing
>> and an annoying bottleneck.
>>
>> The problem is that async I/O via threads is currently
>> extremely slow. I use a custom async I/O file property
>> predicate. It doesn't need to be async for file
>>
>> system access. But by some historical circumstances
>> I made it async since the same file property routine
>> might also do an HTTP HEAD request. But what I was
>>
>> testing and comparing was a simple file system access
>> inside a wrapped thread that is async awaited.
>> Such a thread is called for a couple of directory
>>
>> entries to check a directory tree for whether updates
>> are needed. Here are some measurements of this simple
>> task involving a little async I/O:
>>
>> node.js: 10 ms (usual Promises and stuff)
>> JDK 24: 50 ms (using Threads, not yet VirtualThreads)
>> pypy: 2000 ms
>>
>> So currently PyPy is 200 times slower than node.js
>> when it comes to async I/O. No files were read or
>> written in the test case, only "mtime" was read,
>>
>> via this Python line:
>>
>> stats = await asyncio.to_thread(os.stat, url)
>>
>> Bye
>>
>> Mild Shock wrote:
>>>
>>> Concerning virtual threads the only problem
>>> with Java I have is, that JDK 17 doesn't have them.
>>> And some linux distributions are stuck with JDK 17.
>>>
>>> Otherwise it's not an idea that belongs solely
>>> to Java; I think golang pioneered them with their
>>> goroutines. I am planning to use them more heavily
>>>
>>> when they become more widely available, and I don't
>>> see any principle objection that Python wouldn't
>>> have them as well. It would make async I/O based
>>>
>>> on async waiting for a thread maybe more lightweight.
>>> But this would only be important if you have a high
>>> number of tasks.
>>>
>>> Lawrence D'Oliveiro wrote:
>>>> Short answer: no.
>>>>
>>>> <https://discuss.python.org/t/add-virtual-threads-to-python/91403>
>>>>
>>>> Firstly, anybody appealing to Java as an example of how to design a
>>>> programming language should immediately be sending your bullshit
>>>> detector into the yellow zone.
>>>>
>>>> Secondly, the link to a critique of JavaScript that dates from 2015,
>>>> from before the language acquired its async/await constructs, should
>>>> be another warning sign.
>>>>
>>>> Looking at that Java spec, a “virtual thread” is just another name for
>>>> “stackful coroutine”. Because that’s what you get when you take away
>>>> implicit thread preemption and substitute explicit preemption instead.
>>>>
>>>> The continuation concept is useful in its own right. Why not
>>>> concentrate on implementing that as a new primitive instead?
>>>>
>>>
>>
>