Path: ...!feeds.phibee-telecom.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Anton Shepelev Newsgroups: comp.lang.c,sci.stat.math Subject: Re: realloc() - frequency, conditions, or experiences about relocation? Date: Mon, 17 Jun 2024 18:02:49 +0300 Organization: A noiseless patient Spider Lines: 49 Message-ID: <20240617180249.96dfaafa89392827aa162434@g{oogle}mail.com> References: <875xu8vsen.fsf@bsb.me.uk> <87zfrjvqp6.fsf@bsb.me.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Injection-Date: Mon, 17 Jun 2024 17:02:50 +0200 (CEST) Injection-Info: dont-email.me; posting-host="e4efe3e0e0736d482d811e40a75979fd"; logging-data="765209"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+qs70P/WDPq5rQMWzxksVuDUCmf03wz6Y=" Cancel-Lock: sha1:YDo8xQBFeGxeLQcXyFO0Ml++fwE= X-Newsreader: Sylpheed 3.7.0 (GTK+ 2.24.30; i686-pc-mingw32) Bytes: 3207 [cross-posted to: ci.stat.math] Malcolm McLean: > We have a continuously growing buffer, and we want the > best strategy for reallocations as the stream of > characters comes at us. So, given we now how many > characters have arrived, can we predict how many will > arrive, Do you mean in the next bunch, or in total (till the end of the buffer's lifetime)? > and therefore ask for the best amount when we reallocate, > so that we neither make too many reallocation (reallocate > on every byte received) or ask for too much (demand > SIZE_MAX memory when the first byte is received).? > > Your strategy for avoiding these extremes is exponential > growth. You allocate a small amount for the first few > bytes. Then you use exponential growth, with a factor of > ether 2 or 1.5. This strategy ensures a constant ratio between the amount of reallocated data to the length of the buffer by making reallocations less frequent as the buffer grows. > And so we integrate the distribution between the point we > are at and infinity. Then we tkae the mean. And that gives > us a best estimate of how many bytes are to come, and > therefore how much to grow the buffer by. You have an apriori distribution of the buffer size (can be tracked on-the-fly, if unknown beforehand) and a partially filled buffer. The task is to calculate the a-posteriori distribution of /that/ buffer's final size, and then to allocate the predicted value based on a good percentile. How about using a percentile instead of the mean, e.g. if the current size corresponds to percentile p, you allocate a capacity corresponding to percentile 1-(1-p)/k , where k>1 denotes the balance between space and time efficency. For example, if the 60th percentile of the buffer is required and k=2, you allocate a capacity sufficient to hold 100-(100-60)/2=80% of buffers. -- () ascii ribbon campaign -- against html e-mail /\ www.asciiribbon.org -- against proprietary attachments