Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.arch
Subject: Re: Segments
Date: Fri, 17 Jan 2025 16:15:36 +0100
Organization: A noiseless patient Spider
Lines: 82
Message-ID: <vmds6o$32ji$2@dont-email.me>
References: <vdlgl9$3kq50$2@dont-email.me> <vdtmv9$16lu8$1@dont-email.me>
 <2024Oct6.150415@mips.complang.tuwien.ac.at>
 <vl7m2b$6iat$1@paganini.bofh.team>
 <2025Jan3.093849@mips.complang.tuwien.ac.at>
 <vlcddh$j2gr$1@paganini.bofh.team>
 <2025Jan5.121028@mips.complang.tuwien.ac.at>
 <vleuou$rv85$1@paganini.bofh.team>
 <ndamnjpnt8pkllatkdgq9qn2turaao1f0a@4ax.com>
 <2025Jan6.092443@mips.complang.tuwien.ac.at> <vlgreu$1lsr9$1@dont-email.me>
 <vlhjtm$1qrs5$1@dont-email.me> <bdZeP.23664$Hfb1.16566@fx46.iad>
 <vlj1pg$25p0e$1@dont-email.me> <87cygo97dl.fsf@nosuchdomain.example.com>
 <vm7mvi$2rr87$1@dont-email.me> <vmaig9$3ehn7$1@dont-email.me>
 <vmbesc$3d6n7$1@dont-email.me> <vmblm4$3kno9$1@dont-email.me>
 <87zfjq73gh.fsf@nosuchdomain.example.com>
 <84d10252cc7b0435cd5f4d6232397371@www.novabbs.org>
 <87v7ue6ykc.fsf@nosuchdomain.example.com>
 <17e99d7483b1c62954212fe599a8cb95@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 17 Jan 2025 16:15:37 +0100 (CET)
Injection-Info: dont-email.me; posting-host="eac6ea3806850550cb7a14bb1d188128";
	logging-data="100978"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18X6EnzzhWrt8yPnkCt5gWsBhOtMJwQQ6U="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.11.0
Cancel-Lock: sha1:YC2J1tHJ83+jzx7RL0G4U1C3dFQ=
Content-Language: en-GB
In-Reply-To: <17e99d7483b1c62954212fe599a8cb95@www.novabbs.org>
Bytes: 4508

On 17/01/2025 03:10, MitchAlsup1 wrote:
> On Fri, 17 Jan 2025 1:04:03 +0000, Keith Thompson wrote:
> 
>> mitchalsup@aol.com (MitchAlsup1) writes:
>>> On Thu, 16 Jan 2025 23:18:22 +0000, Keith Thompson wrote:
>>>> Terje Mathisen <terje.mathisen@tmsw.no> writes:
>>>> [...]
>>>>> I do know that several people have created fast string libraries,
>>>>> where any string that is short enough ends up entirely inside the dope
>>>>> vector, so no heap allocation.
>>>>
>>>> Some implementations of C++ std::string do this.  For example, the GNU
>>>> implementation appears to store up to 16 characters (including the
>>>> trailing null character) in the std::string object.
>>>
>>> Why use an 8-byte pointer to store a string 16 or fewer bytes long ? !!
>>
>> I don't understand.  What pointer are you referring to?
> 
> The pointer which would have had to point elsewhere had the string
> not been contained within.
> 

There are a couple of ways you can do "small string optimisation".  One 
would be to have a structure something like this :

struct String1 {
	size_t capacity;
	char * data;
	char small_string[16];
};

Then "data" would initially point to "small_string", giving a capacity 
of 16.  If that's not enough, malloc is used to allocate more space and 
"data" points there instead.
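
As a rough sketch of how the String1 style could be managed in C: the 
helper names (string1_init, string1_assign, string1_free) are made up 
for illustration and are not taken from any real library, and the 
"grow to exactly the size needed" policy is just the simplest one to 
show.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as String1 above; all helper names are hypothetical. */
struct String1 {
	size_t capacity;	/* bytes available at data, including the '\0' */
	char *data;		/* points at small_string or at a malloc'd block */
	char small_string[16];
};

/* Start out empty, using the inline buffer. */
void string1_init(struct String1 *s) {
	assert(s != NULL);
	s->capacity = sizeof s->small_string;
	s->data = s->small_string;
	s->small_string[0] = '\0';
}

/* Copy text in, moving to the heap only if it does not fit.
   Returns 0 on success, -1 if malloc fails (old contents are kept). */
int string1_assign(struct String1 *s, const char *text) {
	assert(s != NULL && text != NULL);
	size_t needed = strlen(text) + 1;
	if (needed > s->capacity) {
		char *p = malloc(needed);
		if (p == NULL) return -1;
		memcpy(p, text, needed);	/* copy before freeing the old block */
		if (s->data != s->small_string) free(s->data);
		s->data = p;
		s->capacity = needed;
	} else {
		memmove(s->data, text, needed);	/* text may alias s->data */
	}
	return 0;
}

/* Release any heap block and return to the empty inline state. */
void string1_free(struct String1 *s) {
	if (s->data != s->small_string) free(s->data);
	string1_init(s);
}
```

One thing to watch with this layout is that a plain struct copy leaves 
"data" in the copy pointing at the original's "small_string", so a real 
implementation would need its own copy function.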


An alternative would be to have something like this (I'm being /really/ 
sloppy with alignments, rules for unions, and so on - this is 
illustrative only, not real code!) :

struct String2 {
	bool is_small;
	union {
		char small_string[31];
		struct {
			size_t capacity;
			char * data;
		};
	};
};

This second version lets you put more characters in the local 
small_string area, reusing space that would otherwise be used for the 
pointer and capacity.  But it has more runtime overhead when using the 
string :

	void print1(struct String1 s) {
		/* Never pass string data as the format argument. */
		printf("%s", s.data);
	}

	void print2(struct String2 s) {
		if (s.is_small) {
			printf("%s", s.small_string);
		} else {
			printf("%s", s.data);
		}
	}
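
To make the String2 trade-off a little more concrete, here is a 
minimal C11 sketch (it relies on anonymous structs and unions).  The 
function names string2_init, string2_cstr and string2_free are 
hypothetical, invented for this example.  The accessor string2_cstr is 
where the extra branch ends up: every read of the characters has to 
test is_small first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as String2 above; all helper names are hypothetical. */
struct String2 {
	bool is_small;
	union {
		char small_string[31];
		struct {
			size_t capacity;
			char *data;
		};
	};
};

/* Initialise *s with a copy of text.
   Returns 0 on success, -1 if malloc fails. */
int string2_init(struct String2 *s, const char *text) {
	assert(s != NULL && text != NULL);
	size_t needed = strlen(text) + 1;
	if (needed <= sizeof s->small_string) {
		s->is_small = true;
		memcpy(s->small_string, text, needed);
		return 0;
	}
	char *p = malloc(needed);
	if (p == NULL) return -1;
	memcpy(p, text, needed);
	s->is_small = false;
	s->capacity = needed;
	s->data = p;
	return 0;
}

/* The one place that has to branch on is_small for every read. */
const char *string2_cstr(const struct String2 *s) {
	return s->is_small ? s->small_string : s->data;
}

/* Release any heap block and leave an empty inline string behind. */
void string2_free(struct String2 *s) {
	if (!s->is_small) free(s->data);
	s->is_small = true;
	s->small_string[0] = '\0';
}
```

With an accessor like this, print2 reduces to 
printf("%s", string2_cstr(&s)), though the test is still there.  On a 
typical 64-bit ABI the union is aligned like size_t, so the bool is 
followed by padding; a real implementation would pack the flag more 
carefully.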

There are, of course, many other ways to make string types (such as 
supporting copy-on-write), but I suspect that Mitch is thinking of style 
String2 while Keith is thinking of style String1.



>> In the implementation I'm referring to, std::string happens to be 32
>> bytes in size.  If the string has a length of 15 or less, the string
>> data is stored directly in the std::string object (in the last 16 bytes
>> as it happens).  If the string is longer than that it's stored
>> elsewhere, and that 16 bytes is presumably used to manage the
>> heap-allocated data.