Why count the hash value in this way?

Jonathan Neuschäfer j.neuschaefer at gmx.net
Sun Aug 4 06:35:27 EDT 2013


On Sun, Aug 04, 2013 at 05:38:55PM +0800, lx wrote:
> hi all:
>       In the function of link_path_walk() , it counts the hash value of the
> compoent of the pathname.
> Why "(prevhash + (c <<4) + (c >> 4))*11;"?


In the code you quoted it says:

  /* Hash courtesy of the R5 hash in reiserfs modulo sign bits */

A bit of googling led me to this[1] page, where it says:

     r5 - This hash is a modified version of rupasov hash. It is used by
     default and it is better to stick here until you have to support
     huge directories and unusual file-name patterns.

and:

     rupasov - This hash is invented by Yury Yu. Rupasov. It is fast and
     preserves locality, mapping lexicographically close file names to
     the close hash values. Never use it, as it has a high probability
     of hash collisions.

[1] https://reiser4.wiki.kernel.org/index.php/Mount


Reading the ReiserFS code and/or mailing lists might give you a clue
about how the R5 hash was designed.


HTH,
Jonathan Neuschäfer



More information about the Kernelnewbies mailing list