Deutsch   English   Français   Italiano  
<v2ok97$1vp4n$1@dont-email.me>

View for Bookmarking (what is this?)
Look up another Usenet article

Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: Good hash for pointers
Date: Fri, 24 May 2024 00:42:29 +0100
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <v2ok97$1vp4n$1@dont-email.me>
References: <v2n88p$1nlcc$1@dont-email.me> <v2nkbq$1pt1p$1@dont-email.me>
 <875xv43zwu.fsf@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 24 May 2024 01:42:31 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="438aee3945dcdba991233edefd290899";
	logging-data="2090135"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/5MH1MBg1T+C+rm5cTkmLCEt1XfwBlXIs="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:R+S9xhMceD+w1Co6HuNpdTWH+8k=
Content-Language: en-GB
In-Reply-To: <875xv43zwu.fsf@nosuchdomain.example.com>
Bytes: 2815

On 23/05/2024 23:51, Keith Thompson wrote:
> Richard Harnden <richard.nospam@gmail.invalid> writes:
>> On 23/05/2024 12:11, Malcolm McLean wrote:
>>> What is a good hash function for pointers to use in portable ANSI C?
>>
>> All your pointers from malloc are going to be unique.
>> All of their low bits are going to be zero, because they need to align
>> on some n-bit boundary.
>> All of their high bit are going to be the same, because that's just
>> how it works.
>>
>> So just take the middle:  hash = ((intptr_t) ptr) >> 4 & 0xffff;
> 
> I'd use uintptr_t, not intptr_t.  I prefer not to think about how
> bitwise operations work on signed values unless I have to.
> 
> You're assuming a 16-bit hash.  I have no idea whether that suits
> Malcolm's requirements.
> 
> Another approach might be to divide the pointer representation into
> N-bit chunks and xor them together.
> 
I have a tree structure which represents an XML document. So potentaiily 
very large, though the code is for the Baby X resource compiler, and 
people are not want to embed absolutely massive XML data files into C 
source.

I'm implemententing XPath querys for it, and my alorithm requires some 
workspace data, actually just a boolean flag, associated with each node. 
So I chose a hash table as the most efficent way of providing the flag, 
without having to touch the XMLNODE structure itself.

The current parser allocates the nodes with repeated small requests to 
malloc(). However of course this is the sort of thing that could be 
optimised to use a fixed block allocator. The XPath module shouldn't 
break as result.

-- 
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm