parrot-dev Digest, Vol 24, Issue 7

luben karavelov luben at unixsol.org
Mon Aug 9 22:01:42 UTC 2010


On  7.08.2010 00:21, chromatic wrote:
> On Friday 06 August 2010 at 13:23, Paul C wrote:
>
>> As an aside, is this doubling in size each time considered a good thing?
>> Seems okay up to about 64 or so, but then the table gets big fast. I'm sure
>> everyone knows the trick of using a vector of sizes and representing the
>> size in the hash as an index into the vector. The cool thing is that the
>> vector can be customized.
>
> Doubling isn't ideal, that's true.  I'd like to see an exponential backoff, but
> I have too many suspicions that our hashes only work right now if the bucket
> size is a power of two.
>
> -- c

Yes, your doubts are right.

Currently, if we have a hash if M buckets, we construct a bit mask

hash->mask = M-1

When we look for the bucket to put/get a key we use something like this:

bucket_index = (hash_fn(key)) & hash->mask

Alternative approach that supports arbitrary values for M will be:

bicket_index = (hash_fn(key)) % M

The claim why it is like that (masking and power of 2 buckets) is that 
this makes hash table expanding cheaper.  A am not so sure that this is 
true because I have not solid understanding of the current expand/resize 
algorithm.

luben




More information about the parrot-dev mailing list