parrot-dev Digest, Vol 24, Issue 11

luben karavelov luben at unixsol.org
Wed Aug 11 19:45:15 UTC 2010


On 11.08.2010 00:32, chromatic wrote:
> On Tuesday 10 August 2010 at 14:26, luben karavelov wrote:
>
>> It helps with hash expansion because when you double the size of bucket
>> store you have to move arround only a half of the keys, and their new
>> locations is guarantted to be free
>
> I'm not sure I believe that's the case with our implementation.  Besides that,
> I was under the impression (see The Practice of Programming) that picking a
> prime number for the array size produces a better distribution of keys in
> buckets.  The argument is that the lack of a common factor between the array
> size, the hash seed, and likely hash values produces a better distribution.
>
> Maybe we need some sort of intrusive benchmark with likely hash values that
> shows the statistical goodness or badness of our implementation.
>
> -- c

Yes, prime number as a hash table size will be better for hash 
distribution but it will make hash resizing cost more. On other hand, it 
will permit to get non-constant grow factor that will make resizing less 
often and the hash table will be more space efficient. I nave taken a 
look on some hash table implementations and it is common to use a vector 
of prime numbers as first hash table sizes.

As a part of the hash-cleanup effort, I have attached a patch that moves 
the checks for constant keys/values out of the src/hash.c in 
corresponding PMCs. All test pass. The only functional difference is 
that if hash table is constant and is indexed by PMC I also check that 
key PMC is constant.

luben
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hash-constant-cleanup.diff
URL: <http://lists.parrot.org/pipermail/parrot-dev/attachments/20100811/524db3ca/attachment.diff>


More information about the parrot-dev mailing list