Linked List versus Hashed Linked iIst

Greg Freemyer greg.freemyer at gmail.com
Mon Aug 18 19:01:24 EDT 2014


On Mon, Aug 18, 2014 at 4:20 PM, Nick Krause <xerofoify at gmail.com> wrote:
> What are the advantages of the hashed linked list version over the
> standard one and does it
> increase the memory usage and overhead of the linked list more if I
> use a hashed version?

Seriously?  Do you know what a hash is?

A hash is a well-defined many to one algorithm.

If I have a universe of a million items that hash down to 100 unique
hashes, then I can group those million items by hash and have 100
groups of roughly 10,000 items each.

The better the hashing algorithm versus my original universe of 1
million items, the more even the distribution.

Now that I have 100 segregated groups I can build an array of 100
linked lists all maintained separately.

Thus:

hash_index = my_hash(item)

add_item(linked_list[hash_item], item) is how I add my item to the
hashed linked list.

is_in_list(linked_list[hash_item], item) is how I check to see if my
item is already in the list.

So in my example I have to have 100 linked lists, but each list is on
average 100x smaller than a simple linked list would be.

Is adding an item to the hashed linked list faster?

Absolutely not, I have to hash the item first then do a normal linked
list insertion.  That will always be slower.

Is finding the item faster?

That is the whole point of the exercise.  The theory is you ONLY use a
hashed linked list if the overhead of hashing the item is less than
the amount of time saved by traversing shorter lists when you search.

It is the job of the programmer to make the determination if a hashed
list is a better choice or not on a case by case basis.  It depends on
the length of the list without breaking it into pieces and how well
the hash algorithm can do at generating roughly similar segregated
groups.

For the size question, write yourself a userspace app and test it.
Obviously that is more work than asking here, but it is ASSUMED you
are doing research on your OWN before you post questions here.

fyi: this question has little to do with the linux kernel.  It is part
of what people mean when they say you need to go learn c before you
start on the kernel.  Using linked lists and hashed linked lists is
stuff you can fully explore in userspace.

Greg



More information about the Kernelnewbies mailing list