Deutsch   English   Français   Italiano  
<v2r9br$2hva2$1@dont-email.me>

View for Bookmarking (what is this?)
Look up another Usenet article

Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: Good hash for pointers
Date: Sat, 25 May 2024 00:54:34 +0100
Organization: A noiseless patient Spider
Lines: 141
Message-ID: <v2r9br$2hva2$1@dont-email.me>
References: <v2n88p$1nlcc$1@dont-email.me>
 <v2qm8m$2el55$1@raubtier-asyl.eternal-september.org>
 <v2qnue$2evlu$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 25 May 2024 01:54:35 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a9f17f030c74c0000dda6058d5c2eb70";
	logging-data="2686274"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/E7wjSD5+ElEh7Li8qmOer"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:3Wek4szSBlB7kC0TdK7nIFLLhw0=
Content-Language: en-GB
In-Reply-To: <v2qnue$2evlu$1@dont-email.me>
Bytes: 4740

On 24/05/2024 19:57, Malcolm McLean wrote:
> On 24/05/2024 19:28, Bonita Montero wrote:
>> Am 23.05.2024 um 13:11 schrieb Malcolm McLean:
>>>
>>> What is a good hash function for pointers to use in portable ANSI C?
>>>
>>> The pointers are nodes of a tree, which are read only, and I want to 
>>> associate read/write data with them. So potentially a lage number of 
>>> pointers,and they might be consecutively ordered if they are taken 
>>> from an array, or they might be returned from repeared calls to 
>>> malloc() with small allocations. Obviously I have no control over 
>>> pointer size or internal representation.
>>>
>>
>> Use FNV.
>>
> 
> Here's an attempt.
> 
> /* FNV hash of a pointer */
> static unsigned int hash(void *address)
> {
>      int i;
>      unsigned long answer = 2166136261;
>      unsigned char *byte = (unsigned char *) &address;
> 
>      for (i = 0; i < sizeof(void *); i++)
>      {
>          answer *= 16777619;
>          answer ^= byte[i];
>      }
>      return (unsigned int) (answer & 0xFFFFFFFF);
> }
> 
> Now what will compilers make of that?

Compiler, or performance?

I tried this function with the test program shown below. I used it to 
populate a hash table of 64K entries with pointers from successive calls 
to malloc.

Results, in terms of clashes, for different numbers N of entries 
populated out of 64K were:

  10000     1100
  30000    12000
  50000    67000
  60000   216000
  65535  5500000    (largest N possible)

Result were rather variable as malloc produces different patterns of 
pointers on different runs. These were simply the result from the first run.

Was this good? I'd no idea. But as a comparison, I used my own hash 
function, normally used to hash identifiers, shown below the main 
program as the function 'bchash'.

If this is subsituted instead, the results were:

  10000      230
  30000     3800
  50000    10300
  60000    50300
  65535  2700000

Hash tables need a certain amount of free capacity to stay efficient, so 
3/4 full (about 50K/64K) is about right.

Again I don't know if these figures are good, they are just better than 
from your hash() function, for the inputs I used, within this test, and 
for this size of table.

No doubt there are much better ones.



------------------------------------------
#include <stdio.h>
#include <stdlib.h>

static unsigned int hash(void *address)
{
     int i;
     unsigned long answer = 2166136261;
     unsigned char *byte = (unsigned char *) &address;

     for (i = 0; i < sizeof(void *); i++)
     {
         answer *= 16777619;
         answer ^= byte[i];
     }
     return (unsigned int) (answer & 0xFFFFFFFF);
}

void* table[65536];

int main(void) {
     void* p;

     int clashes=0, wrapped;
     int j;

     for (int i=0; i<30000; ++i) {
         p = malloc(1);
         j = hash(p) & 65535;

         wrapped=0;
         while (table[j]) {
             ++clashes;
             ++j;
             if (j>65535) {
                 if (wrapped) { puts("Table full"); exit(1);}
                 wrapped=1;
                 j=0;
             }
         }
         table[j] = p;

     }
     printf("Clashes %d\n", clashes);
}


------------------------------------------

static unsigned int bchash(void *address)
{
     int i;
     unsigned long hsum = 0;
     unsigned char *byte = (unsigned char *) &address;

     for (i = 0; i < sizeof(void *); i++)   {
         hsum = (hsum<<4) - hsum + byte[i];
     }
     return (hsum<<5) - hsum;
}