Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Malcolm McLean Newsgroups: comp.lang.c Subject: Re: Good hash for pointers Date: Fri, 24 May 2024 00:42:29 +0100 Organization: A noiseless patient Spider Lines: 41 Message-ID: References: <875xv43zwu.fsf@nosuchdomain.example.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Injection-Date: Fri, 24 May 2024 01:42:31 +0200 (CEST) Injection-Info: dont-email.me; posting-host="438aee3945dcdba991233edefd290899"; logging-data="2090135"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/5MH1MBg1T+C+rm5cTkmLCEt1XfwBlXIs=" User-Agent: Mozilla Thunderbird Cancel-Lock: sha1:R+S9xhMceD+w1Co6HuNpdTWH+8k= Content-Language: en-GB In-Reply-To: <875xv43zwu.fsf@nosuchdomain.example.com> Bytes: 2815 On 23/05/2024 23:51, Keith Thompson wrote: > Richard Harnden writes: >> On 23/05/2024 12:11, Malcolm McLean wrote: >>> What is a good hash function for pointers to use in portable ANSI C? >> >> All your pointers from malloc are going to be unique. >> All of their low bits are going to be zero, because they need to align >> on some n-bit boundary. >> All of their high bit are going to be the same, because that's just >> how it works. >> >> So just take the middle: hash = ((intptr_t) ptr) >> 4 & 0xffff; > > I'd use uintptr_t, not intptr_t. I prefer not to think about how > bitwise operations work on signed values unless I have to. > > You're assuming a 16-bit hash. I have no idea whether that suits > Malcolm's requirements. > > Another approach might be to divide the pointer representation into > N-bit chunks and xor them together. > I have a tree structure which represents an XML document. So potentaiily very large, though the code is for the Baby X resource compiler, and people are not want to embed absolutely massive XML data files into C source. I'm implemententing XPath querys for it, and my alorithm requires some workspace data, actually just a boolean flag, associated with each node. So I chose a hash table as the most efficent way of providing the flag, without having to touch the XMLNODE structure itself. The current parser allocates the nodes with repeated small requests to malloc(). However of course this is the sort of thing that could be optimised to use a fixed block allocator. The XPath module shouldn't break as result. -- Check out Basic Algorithms and my other books: https://www.lulu.com/spotlight/bgy1mm