From: Mild Shock <janburse@fastmail.fm>
Newsgroups: comp.lang.python
Subject: What does the Async Detour usually cost (Was: What does stats = await asyncio.to_thread(os.stat, url) do?)
Date: Tue, 24 Jun 2025 00:42:14 +0200
Message-ID: <103cl86$16hvn$1@solani.org>
References: <102isqb$3v5j0$2@dont-email.me> <102kp89$q8c6$1@solani.org> <103bdq4$15ut7$1@solani.org> <103cklq$16hos$1@solani.org>

Hi,

I have some data on what the async detour usually
costs. I just compared with another Java Prolog
that doesn't do the thread thingy.

Previously reported measurement with the async Java Prolog:

> JDK 24: 50 ms (using Threads, not yet VirtualThreads)

New additional measurement with an alternative Java Prolog:

JDK 24: 30 ms (no Threads)

But the version using Threads is already quite optimized:
it basically reuses its own thread and uses a mutex
somewhere, so it doesn't really create a new secondary
thread unless a new task is spawned. Creating a 2nd thread
is silly if tasks have their own thread. This is the
main potential of virtual threads in upcoming Java:
just run tasks inside virtual threads.

Bye

P.S.: But I should measure with more files, since
the 50 ms and 30 ms are quite small.
Also, I am using a warm run, so the files and their
meta information are already cached in operating system
memory. I am trying to measure only the async overhead,
but maybe Python doesn't trust the operating system
memory and calls some disk sync somewhere. I don't know.
I don't open and close the files, and I don't call any
disk syncing. I only read stats to get mtime and do
some comparisons.

Mild Shock wrote:
> So what does this do:
>
> stats = await asyncio.to_thread(os.stat, url)
>
> Well, it calls, in a separate secondary thread:
>
> os.stat(url)
>
> It happens that url is only a file path, and
> the file path points to an existing file. So the
> secondary thread computes the stats and returns,
>
> and the async framework hands the stats back to
> the main thread that did the await, and the main
> thread stops waiting and continues to run
>
> cooperatively with the other tasks in the current
> event loop. The test case measures the wall time.
> The results are:
>
> > node.js: 10 ms (usual Promises and stuff)
> > JDK 24: 50 ms (using Threads, not yet VirtualThreads)
> > pypy: 2000 ms
>
> I am using only one main task, which makes such
> await calls sequentially for a couple of files,
> not more than 50.
>
> I could compare with removing the async detour,
> to quantify the async I/O detour overhead.
>
> Mild Shock wrote:
>> Hi,
>>
>> async I/O in Python is extremely disappointing
>> and an annoying bottleneck.
>>
>> The problem is that async I/O via threads is currently
>> extremely slow. I use a custom async I/O file property
>> predicate. It doesn't need to be async for file
>>
>> system access. But for historical reasons
>> I made it async, since the same file property routine
>> might also do an HTTP HEAD request. But what I was
>>
>> testing and comparing was a simple file system access
>> inside a wrapped thread that is async awaited.
>> Such a thread is called for a couple of directory
>>
>> entries to check whether a directory tree needs
>> updates. Here are some measurements of this simple
>> check, which involves a little async I/O:
>>
>> node.js: 10 ms (usual Promises and stuff)
>> JDK 24: 50 ms (using Threads, not yet VirtualThreads)
>> pypy: 2000 ms
>>
>> So currently PyPy is 200 times slower than node.js
>> when it comes to async I/O. No files were read or
>> written in the test case, only "mtime" was read,
>>
>> via this Python line:
>>
>> stats = await asyncio.to_thread(os.stat, url)
>>
>> Bye
>>
>> Mild Shock wrote:
>>>
>>> Concerning virtual threads, the only problem
>>> I have with Java is that JDK 17 doesn't have them.
>>> And some Linux distributions are stuck with JDK 17.
>>>
>>> Otherwise it's not an idea that belongs solely
>>> to Java; I think golang pioneered them with its
>>> goroutines. I am planning to use them more heavily
>>>
>>> when they become more widely available, and I don't
>>> see any objection in principle to Python
>>> having them as well. It would maybe make async I/O
>>>
>>> based on async waiting for a thread more lightweight.
>>> But this would only be important if you have a high
>>> number of tasks.
>>>
>>> Lawrence D'Oliveiro wrote:
>>>> Short answer: no.
>>>>
>>>> <https://discuss.python.org/t/add-virtual-threads-to-python/91403>
>>>>
>>>> Firstly, anybody appealing to Java as an example of how to design a
>>>> programming language should immediately be sending your bullshit
>>>> detector into the yellow zone.
>>>>
>>>> Secondly, the link to a critique of JavaScript that dates from 2015,
>>>> from before the language acquired its async/await constructs, should
>>>> be another warning sign.
>>>>
>>>> Looking at that Java spec, a “virtual thread” is just another name for
>>>> “stackful coroutine”. Because that’s what you get when you take away
>>>> implicit thread preemption and substitute explicit preemption instead.
>>>>
>>>> The continuation concept is useful in its own right. Why not
>>>> concentrate on implementing that as a new primitive instead?
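
The comparison suggested in the quoted post (the same stat calls
with and without the async detour) could be sketched roughly as
below. This is only an illustration under assumptions: the helper
names stat_direct, stat_detour, timed_detour and measure, and the
temporary files, are made up for the example and are not the
original test case. The detour timing also includes the first start
of asyncio's default thread pool.

```python
import asyncio
import os
import tempfile
import time


def stat_direct(paths):
    # Baseline: blocking os.stat calls on the calling thread, no detour.
    return [os.stat(p).st_mtime for p in paths]


async def stat_detour(paths):
    # Same work, but every call is handed to a worker thread and awaited,
    # one after the other, as in the line quoted above.
    mtimes = []
    for p in paths:
        stats = await asyncio.to_thread(os.stat, p)
        mtimes.append(stats.st_mtime)
    return mtimes


async def timed_detour(paths):
    # Time inside the running loop so event loop startup is not counted.
    start = time.perf_counter()
    mtimes = await stat_detour(paths)
    return time.perf_counter() - start, mtimes


def measure(n_files=50):
    # Hypothetical setup: n_files small temp files instead of a real tree.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(n_files):
            path = os.path.join(tmp, f"file{i}.txt")
            with open(path, "w") as f:
                f.write("x")
            paths.append(path)

        stat_direct(paths)  # warm run, so metadata is cached by the OS

        start = time.perf_counter()
        direct = stat_direct(paths)
        direct_s = time.perf_counter() - start

        detour_s, detour = asyncio.run(timed_detour(paths))

    return direct_s, detour_s, direct == detour


if __name__ == "__main__":
    direct_s, detour_s, same = measure()
    print(f"direct: {direct_s * 1000:.2f} ms")
    print(f"detour: {detour_s * 1000:.2f} ms (same mtimes: {same})")
```

Running it under both CPython and PyPy, and with a larger n_files,
would show how much of the wall time is the detour itself. The
absolute numbers will depend on the machine and the interpreter.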