Path: ...!feeds.phibee-telecom.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Malcolm McLean Newsgroups: comp.lang.c Subject: Re: realloc() - frequency, conditions, or experiences about relocation? Date: Mon, 17 Jun 2024 10:31:59 +0100 Organization: A noiseless patient Spider Lines: 35 Message-ID: References: <875xu8vsen.fsf@bsb.me.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Injection-Date: Mon, 17 Jun 2024 11:31:59 +0200 (CEST) Injection-Info: dont-email.me; posting-host="1d441ad42b482865d1a884aec9340332"; logging-data="628282"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+A6CVh/+JuwpPEkqndZUxwE26JA/zJbJI=" User-Agent: Mozilla Thunderbird Cancel-Lock: sha1:AUcTheegqU4BMNKip5ZjWBIQ+58= Content-Language: en-GB In-Reply-To: <875xu8vsen.fsf@bsb.me.uk> Bytes: 2885 On 17/06/2024 10:18, Ben Bacarisse wrote: > Janis Papanagnou writes: > >> In a recent thread realloc() was a substantial part of the discussion. >> "Occasionally" the increased data storage will be relocated along >> with the previously stored data. On huge data sets that might be a >> performance factor. Is there any experience or are there any concrete >> factors about the conditions when this relocation happens? - I could >> imagine that it's no issue as long as you're in some kB buffer range, >> but if, say, we're using realloc() to substantially increase buffers >> often it might be an issue to consider. It would be good to get some >> feeling about that internal. > > There is obviously a cost, but there is (usually) no alternative if > contiguous storage is required. In practice, the cost is usually > moderate and can be very effectively managed by using an exponential > allocation scheme: at every reallocation multiply the storage space by > some factor greater than 1 (I often use 3/2, but doubling is often used > as well). This results in O(log(N)) rather than O(N) allocations as in > your code that added a constant to the size. Of course, some storage is > wasted (that /might/ be retrieved by a final realloc down to the final > size) but that's rarely significant. > So can we work it out? Let's assume for the moment that the allocations have a semi-normal distribution, with negative values disallowed. Now ignoring the first few values, if we have allocated, say, 1K, we ought to be able to predict the value by integrating the distribution from 1k to infinity and taking the mean. -- Check out my hobby project. http://malcolmmclean.github.io/babyxrc