Path: ...!weretis.net!feeder9.news.weretis.net!panix!.POSTED.spitfire.i.gajendra.net!not-for-mail
From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.arch
Subject: Re: Threads (was Re: MSI interrupts)
Date: Wed, 2 Apr 2025 13:04:51 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <vsjclj$mtp$1@reader1.panix.com>
References: <vqto79$335c6$1@dont-email.me> <vsgu3r$3blp9$1@dont-email.me> <0FTGP.567700$f81.368330@fx48.iad> <2025Apr2.082556@mips.complang.tuwien.ac.at>
Injection-Date: Wed, 2 Apr 2025 13:04:51 -0000 (UTC)
Injection-Info: reader1.panix.com; posting-host="spitfire.i.gajendra.net:166.84.136.80"; logging-data="23481"; mail-complaints-to="abuse@panix.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: cross@spitfire.i.gajendra.net (Dan Cross)
Bytes: 4854
Lines: 77

In article <2025Apr2.082556@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
>scott@slp53.sl.home (Scott Lurndal) writes:
>>Unix SVR4ES/MP actually had
>>both kernel threads (called lightweight processes (LWP)) and user-level
>>threads in an M-N setup (M user threads multiplexed on N kernel threads).
>>
>>Didn't turn out to be particularly useful.
>
>Maybe the SVR4ES/MP stuff was not particularly useful.  Combining
>user-level threads and kernel threads in an M-N setup has turned out
>to be very useful in, e.g., Erlang applications.

It's useful when threads are carefully managed, so that the program
can ensure that the set of OS threads is always available for
runnable LWPs to be scheduled onto.  This implies that the OS-managed
threads should not block: if one does, you lose 1/N'th of your
parallelism.

But the M:N model really breaks down when LWPs become unrunnable in a
way that does not involve the user-level thread scheduler, something
that can happen in surprising places.

For instance, consider Unix/POSIX `open`: from an API perspective,
this simply maps a symbolic file path name to a file descriptor that
can subsequently be used to perform IO on the named file.  It is well
known that the interface is defined so that it may block when opening
some kinds of devices (for example, some terminal devices block until
the line is asserted), but that is not the usual case, and notably
`open` does no IO on the file itself.  So most programs would expect
that it has no reason to block.

But that's just the interface; the implementation is different.
Since the resolution of a name to an FD is always synchronous with
respect to the calling program, there are a number of places where
`open` can block in the kernel because it may actually require IO:
the path name argument may point into a non-resident part of the
address space that must be paged in, or the per-component lookups
while walking the path may require IO to read the corresponding
directory entries from directory files, and so on.

POSIX does not give us an asynchronous alternative to `open`, so if
an LWP opens a file, we may lose the OS-managed thread it was running
on for a while if it blocks in the kernel waiting on IO.  About the
best you can do to detect this kind of blocking is to schedule an
event that sends yourself a signal after some reasonable timeout, and
cancel that event if `open` completes before the timeout expires.
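To make that concrete, here is a rough sketch of that trick in C.  To
be clear, this isn't lifted from any real runtime: `open_with_deadline`
and TIMEOUT_MS are names I made up, error checking is elided, and a
real M:N runtime would direct the signal at the particular blocked
thread (e.g., with Linux's SIGEV_THREAD_ID) rather than at the whole
process.  Note also that a thread waiting on disk IO may be sleeping
uninterruptibly, so the signal may serve only to detect the stall, not
to abort the `open`.

/*
 * Sketch: arm a POSIX timer before calling open(), cancel it after.
 * If the timer fires first, open() blocked past the deadline.
 * Error checking elided; link with -lrt on older glibc.
 */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define TIMEOUT_MS	50	/* "some reasonable timeout" */

static volatile sig_atomic_t timed_out;

static void
on_timeout(int signo)
{
	(void)signo;
	/*
	 * A real runtime would notify its user-level scheduler here,
	 * so it could recruit another OS thread to run pending LWPs.
	 */
	timed_out = 1;
}

static int
open_with_deadline(const char *path, int flags)
{
	struct sigaction sa;
	struct sigevent sev;
	struct itimerspec its;
	timer_t timer;
	int fd;

	/*
	 * No SA_RESTART: if the signal does interrupt open(), we
	 * want to see EINTR rather than a transparent restart.
	 */
	memset(&sa, 0, sizeof sa);
	sa.sa_handler = on_timeout;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);

	memset(&sev, 0, sizeof sev);
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = SIGALRM;
	timer_create(CLOCK_MONOTONIC, &sev, &timer);

	memset(&its, 0, sizeof its);
	its.it_value.tv_nsec = TIMEOUT_MS * 1000000L;
	timed_out = 0;
	timer_settime(timer, 0, &its, NULL);	/* arm */

	fd = open(path, flags);

	its.it_value.tv_nsec = 0;
	timer_settime(timer, 0, &its, NULL);	/* cancel */
	timer_delete(timer);

	if (timed_out)
		fprintf(stderr, "open(%s) blocked past deadline\n", path);
	return fd;
}

int
main(int argc, char *argv[])
{
	if (argc != 2)
		return 1;
	return open_with_deadline(argv[1], O_RDONLY) < 0;
}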
Furthermore, because LWPs and OS-managed threads are usually
independent, their respective schedulers may make pathological
decisions with respect to one another.  Consider two LWPs that pass a
token back and forth between themselves for some reason: for good
throughput, we'd want these gang-scheduled onto OS threads.  Instead,
they may bounce around among the available set of OS threads, and if
the OS preempts the thread one LWP is scheduled onto while the other
is running, throughput suffers.  The problem is that, in general,
scheduling decisions made by the LWP scheduler are not knowable to
the OS, and vice versa, even though in this case you really want the
two to cooperate.

1:1 threading models solve all of these problems, at the expense of
being heavier weight.  Languages like Erlang and Go do a decent job
in an M:N context by keeping tight control of the language runtime,
which handles all of the squirrelly details on the program's behalf.

A lot of work has been done on M:N: Psyche, Scheduler Activations,
`rfork`, upcalls to DECthreads in VMS, etc.  More recently, systems
like Akaros have proposed a different model that punts nearly all
scheduling to userspace, adopting something more like M:C, where "C"
is "cores", by building a virtual-multiprocessor abstraction for
many-core processes.  http://akaros.org/news.html

	- Dan C.