ketama是memcached客户端使用的一种一致性哈希算法。是为了解决余数实现分布不均匀的问题,在last.fm的一篇博客中首次提到。这里有一些介绍链接,其中第一篇是原博客。
- http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
- http://blog.csdnnet/kangojian/article/details/6708460
- http://tech.idv2.com/2008/07/24/memcached-004/
原博客里面提到算法步骤如下:
- Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
- Hash each server string to several (100-200) unsigned ints
- Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
- Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
- To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
- If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
单独看这些可能不容易理解,我看了第一个链接的Java代码后大致了解了,按照其中逻辑用Scala的函数式风格写了ketama的分布和查找算法:
import java.security.MessageDigest import scala.collection.immutable.{SortedMap, TreeMap} object Ketama { def md5(str: String): Array[Byte] = MessageDigest.getInstance("MD5").digest(str.getBytes) def digestToLong(bytes: Array[Byte]): Long = bytes.zipWithIndex.map{ case (n, i) => (n & 0xFF).toLong << (i * 8) }.reduceLeft(_ | _) def setUpServers( servers: List[String]): SortedMap[Long, String] = TreeMap{ servers.flatMap(s => md5(s).grouped(4).toList.map(subHash => (digestToLong(subHash), s) ) ): _* } def lookupCache(key: String, serverMap: SortedMap[Long, String]): String = { val keyHash = digestToLong(md5(key).slice(0, 4)) serverMap.find(_._1 > keyHash) match { case Some((_, server)) => server case _ => serverMap.head._2 // first server of map } } def main(args: Array[String]): Unit = { val serverMap = setUpServers(List("1.1.1.1", "2.2.2.2", "3.3.3.3")) (0 to 20).foreach{n => println(lookupCache("key" + n, serverMap)) } } }
解释一下各个函数:
md5不用多说,根据字符串的byte生成摘要,16个byte。
digestToLong,ketama对缓存key使用的一种哈希算法。4个byte,和0xFF做与操作(应该是去除负数,求补?),第一个byte不做偏移,第二个偏移8位,第三个偏移16位,第四个偏移24位,偏移后做或操作。用Java全写的话就是这样:
(long) bytes[0] & 0xFF) | ((long) bytes[1] & 0xFF) << 8) | ((long) bytes[2] & 0xFF) << 16) | ((long) bytes[3] & 0xFF) << 24)
形象点讲就是把4个byte拼成一个32位数再和0xFFFFFFFF做与操作得到一个非负数。
setUpServer,ketama核心部分算法,对每个server做md5得到16个byte,每4个byte执行digestToLong得到一个2^32以内的非负数作为server的key。所以每个server有4个key,换句话说有3个虚拟节点。这里额外还做了按照key对映射排序。
lookupCache,对key做md5得到16个byte,取4个byte执行digestToLong得到缓存hash,在服务器映射中找到第一个大于缓存hash的server,如果没找到取第一个server。
以下就是个人认为ketama算法的理解与实现。