Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Newsgroups: comp.lang.c++
Subject: Re: Bignum multiplication in Python vs GMP (was: constexpr is really very smart!)
Date: Thu, 26 Dec 2024 18:42:35 -0800
Organization: A noiseless patient Spider
Lines: 61
Message-ID: <86ikr5dexw.fsf@linuxsc.com>
References: <vjndub$2glcu$1@paganini.bofh.team> <20241216112808.00003f74@yahoo.com> <86jzbyghdw.fsf@linuxsc.com> <20241218013342.0000518a@yahoo.com> <20241218135131.00000006@yahoo.com> <86zfksf4lb.fsf@linuxsc.com> <20241220021703.00003e9e@yahoo.com> <86msgqezb8.fsf@linuxsc.com> <20241222000708.000007ef@yahoo.com> <86a5cnfubs.fsf@linuxsc.com> <20241223174708.000062b3@yahoo.com> <86ttatcj2x.fsf@linuxsc.com> <20241225110505.00001733@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Fri, 27 Dec 2024 03:42:36 +0100 (CET)
Injection-Info: dont-email.me; posting-host="ce9c96fed16de83e8c7d4f3eabb78d96";
logging-data="3431812"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/NMFjk8HcmxaOqn5lokGhHHyRI8I8uapU="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:Zand8r5Ay8uc6E7JuMxViltwhsw=
sha1:r7962lrcsv/Lq1SYz3rbMMP8Q4I=
Michael S <already5chosen@yahoo.com> writes:
> On Tue, 24 Dec 2024 05:21:42 -0800
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>
>>> [python runs much slower than GMP]
>>
>> Yes, I have no explanation either. Your reasoning looks sound to
>> me.
>
> I did a few more time measurements, both with GMP and with Python.
> They clearly showed that the two use not just different
> implementations of bignum multiplication, but different algorithms.
> GMP multiplication for operands above ~200,000 bits is approximately
> O(N*log(N)) in the number of bits, or just a little slower than that.
> Python multiplication scales much worse, approximately as N**1.55.
That's consistent with using Karatsuba: log2( 3 ) =~ 1.585.
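For the record, here is a minimal sketch of Karatsuba on Python ints (my own illustration, not CPython's actual implementation; the cutoff of 64 bits is arbitrary) showing where that exponent comes from: each call replaces one full-size multiply with three half-size multiplies plus shifts and additions.

```python
def karatsuba(x: int, y: int, cutoff: int = 64) -> int:
    """Karatsuba multiplication of non-negative ints: three half-size
    multiplies instead of four, giving T(N) = 3*T(N/2) + O(N)."""
    if x < (1 << cutoff) or y < (1 << cutoff):
        return x * y  # small operands: fall back to the builtin
    n = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)   # split x = xh*2**n + xl
    yh, yl = y >> n, y & ((1 << n) - 1)   # split y the same way
    a = karatsuba(xh, yh)                 # high * high
    b = karatsuba(xl, yl)                 # low * low
    c = karatsuba(xh + xl, yh + yl) - a - b   # cross terms, one multiply
    return (a << (2 * n)) + (c << n) + b
```

Solving the recurrence T(N) = 3*T(N/2) + O(N) gives O(N**log2(3)) =~ O(N**1.585), right in line with the ~N**1.55 scaling measured above.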
> And indeed after binging more I had seen the following discussion on
> reddit: [link]
>
> A relevant quote:
> "Python seems to use two algorithms for bignum multiplication: "the
> O(N**2) school algorithm" and Karatsuba if both operands contain more
> than 70 decimal digits. (Your example reaches 70 decimal digits already
> after the 5th(?) squaring, and after the 22nd there are ~17million
> decimal digits(?). The vast majority of the time is spent on the last
> one or two multiplications i.e. Karatsuba.)
>
> Meanwhile GMP implements seven multiplication algorithms: The same(?)
> "school" and Karatsuba algorithms, plus various Toom and an FFT method
> (Strassen), but I'm unclear on what the exact thresholds are that
> select these. It seems these thresholds are in "limbs" ("the part of a
> multi-precision number that fits in a single machine word") instead of
> decimal digits, with e.g. FFT method being used after "somewhere
> between 3000 and 10000 limbs", i.e. >~30k - 100k(?) decimal digits, so
> it would probably be used for the relevant steps in your example."
That ratio suggests the limbs are 32 bits (i.e., roughly 10 digits each).
> It seems that the threshold for the FFT method mentioned in the quote
> applies to the number of limbs (i.e. 64-bit groups of bits in our
> specific case) in the product rather than in the operands. Also it is
> possible that in newer versions of GMP the threshold is lower than it
> was 7 years ago.
I have no intuition regarding whether a threshold would be chosen
based on the sizes of the operands or the sizes of the product. For
this particular problem it's only one iteration difference anyway.
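For what it's worth, the scaling exponent is cheap to re-measure; this sketch (my own, with arbitrary operand sizes and repetition counts) estimates e in t(N) ~ N**e by timing squarings at two sizes:

```python
import math
import random
import time

def mul_time(nbits: int, reps: int = 5) -> float:
    """Best-of-reps wall time to square a random nbits-bit integer."""
    x = random.getrandbits(nbits)
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        _ = x * x
        best = min(best, time.perf_counter() - t0)
    return best

def scaling_exponent(n1: int = 200_000, n2: int = 800_000) -> float:
    """Estimate e in t(N) ~ N**e from timings at two operand sizes."""
    return math.log(mul_time(n2) / mul_time(n1)) / math.log(n2 / n1)
```

On a CPython that uses Karatsuba above its small-operand cutoff, this should land near log2(3) =~ 1.585, consistent with the ~1.55 measured above (timing noise and the base case blur the estimate somewhat).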
> I wonder why Python uses its own thing instead of GMP.
> Incompatible licenses, or plain NIH?
Perhaps (1) python doesn't want to be bound by the rather severe
terms of a GNU license, or (2) historical inertia.
Let me add that I appreciate your excellent investigation results.
They reinforce my view that it was wise for me to cease the deep
dives into performance issues. :)