From: "Chris M. Thomasson"
Newsgroups: comp.lang.c++
Subject: Re: signalling a condvar from inside vs. signalling a condvar von outside
Date: Thu, 24 Apr 2025 13:05:08 -0700

On 4/22/2025 10:46 PM, Bonita Montero wrote:
> Now I wrote a little program to test if there's a thundering herd problem
> with glibc's mutex / condvar.
> Here it is:
>
> #include <iostream>
> #include <thread>
> #include <mutex>
> #include <condition_variable>
> #include <atomic>
> #include <semaphore>
> #include <vector>
> #include <sys/resource.h>
>
> using namespace std;
>
> int main()
> {
>     constexpr size_t N = 10'000;
>     int nClients = thread::hardware_concurrency() - 1;
>     mutex mtx;
>     int signalled = 0;
>     condition_variable cv;
>     atomic_int ai( 0 );
>     binary_semaphore bs( false );
>     vector<jthread> clients;
>     atomic_int64_t nVoluntary( 0 );
>     for( int c = nClients; c; --c )
>         clients.emplace_back( [&]
>             {
>                 for( size_t r = N; r; --r )
>                 {
>                     unique_lock lock( mtx );
>                     cv.wait( lock, [&] { return (bool)signalled; } );
>                     --signalled;
>                     lock.unlock();
>                     if( ai.fetch_sub( 1, memory_order_relaxed ) == 1 )
>                         bs.release( 1 );
>                 }
>                 rusage ru;
>                 getrusage( RUSAGE_THREAD, &ru );
>                 nVoluntary.fetch_add( ru.ru_nvcsw, memory_order_relaxed );
>             } );
>     for( size_t r = N; r; --r )
>     {
>         unique_lock lock( mtx );
>         signalled = nClients;
>         cv.notify_all();
>         ai.store( nClients, memory_order_relaxed );
>         lock.unlock();
>         bs.acquire();
>     }
>     clients.resize( 0 );
>     cout << N << " rounds," << endl;
>     cout << (double)nVoluntary.load( memory_order_relaxed ) / nClients
>          << " context switches pe thread" << endl;
> }
>
> It spawns one fewer thread than there are hardware threads. These
> all wait for a condvar and a counter which is initially the number
> of threads and that must be > 0 for the wait to succeed. This counter
> is decremented by each thread. Then the threads decrement an atomic
> and if it becomes zero the last thread raises a semaphore, thereby
> waking up the main thread.
> These are the results for 10'000 rounds on a 32-thread machine:
>
>     10000 rounds,
>     2777.29 context switches pe thread
>
> So there are fewer context switches than rounds and there's no
> thundering herd with glibc.
>

Sigh... Here is a challenge for you. Get it working in a race detector,
say, Relacy?