Simulating the ketama algorithm in Scala


Ketama is a consistent hashing algorithm used by memcached clients. It was created to address the shortcomings of the naive modulo (remainder) approach, where changing the number of servers remaps most keys, and was first described in a blog post by last.fm. There are a few introductory links here; the first is the original post.

The original post describes the algorithm steps as follows:

  1. Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
  2. Hash each server string to several (100-200) unsigned ints
  3. Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
  4. Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
  5. To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
  6. If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.

The steps alone may be hard to follow; after reading the Java code from the first link I got the general idea, and then wrote ketama's placement and lookup in Scala's functional style:

import java.security.MessageDigest
import scala.collection.immutable.{SortedMap, TreeMap}

object Ketama {

  // MD5 digest of a string: 16 bytes
  def md5(str: String): Array[Byte] = 
    MessageDigest.getInstance("MD5").digest(str.getBytes)

  // Assemble a group of bytes into a non-negative Long, little-endian:
  // byte 0 unshifted, byte 1 shifted left 8 bits, and so on
  def digestToLong(bytes: Array[Byte]): Long = bytes.zipWithIndex.map{
    case (n, i) => (n & 0xFF).toLong << (i * 8)
  }.reduceLeft(_ | _)

  // Place each server on the continuum: its 16-byte MD5 digest splits
  // into four 4-byte groups, i.e. four points per server, sorted by key
  def setUpServers(
      servers: List[String]): SortedMap[Long, String] = TreeMap{
    servers.flatMap(s =>
      md5(s).grouped(4).toList.map(subHash =>
        (digestToLong(subHash), s)
      )
    ): _*
  }

  // Hash the cache key to a point and take the first server past it on
  // the continuum, wrapping around to the first server if none is greater
  def lookupCache(key: String, 
      serverMap: SortedMap[Long, String]): String = {
    val keyHash = digestToLong(md5(key).slice(0, 4))
    serverMap.find(_._1 > keyHash) match {
      case Some((_, server)) => server
      case _ => serverMap.head._2 // wrap around: first server of map
    }
  }

  def main(args: Array[String]): Unit = {
    val serverMap = setUpServers(List("1.1.1.1", "2.2.2.2", "3.3.3.3"))
    (0 to 20).foreach{n =>
      println(lookupCache("key" + n, serverMap))
    }
  }

}

A quick walkthrough of each function:

md5 needs little explanation: it digests the string's bytes into a 16-byte MD5 hash.
digestToLong is the hash ketama applies to cache keys. It takes 4 bytes; each is ANDed with 0xFF (this converts Java's signed byte into its unsigned 0-255 value), then the first byte is left unshifted, the second shifted left 8 bits, the third 16 bits, and the fourth 24 bits, and the shifted values are ORed together. Written out fully in Java it looks like this:

 ((long) bytes[0] & 0xFF)        | 
(((long) bytes[1] & 0xFF) << 8)  | 
(((long) bytes[2] & 0xFF) << 16) | 
(((long) bytes[3] & 0xFF) << 24)

Put more vividly: the 4 bytes are packed into a single 32-bit number, equivalent to ANDing with 0xFFFFFFFF, which always yields a non-negative value.
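To make the packing concrete, here is a small self-contained sketch of digestToLong applied to a single 4-byte group (the byte values are made up for illustration):

```scala
object DigestToLongDemo {
  // Same little-endian assembly as the article's digestToLong
  def digestToLong(bytes: Array[Byte]): Long = bytes.zipWithIndex.map {
    case (n, i) => (n & 0xFF).toLong << (i * 8)
  }.reduceLeft(_ | _)

  def main(args: Array[String]): Unit = {
    // 0x84 is negative as a signed Byte; the & 0xFF mask recovers 132
    val n = digestToLong(Array[Byte](0x01, 0x02, 0x03, 0x84.toByte))
    println(n.toHexString) // 84030201: the last byte is most significant
    println(n)             // 2214789633, comfortably inside [0, 2^32)
  }
}
```

Without the mask, the negative byte 0x84 would sign-extend and poison the upper bits of the Long; the mask is what keeps the result in [0, 2^32).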

setUpServers is the core of ketama: each server string is hashed with MD5 into 16 bytes, and each 4-byte group is run through digestToLong to produce a non-negative number below 2^32 that serves as a key for that server. Each server therefore has 4 keys, i.e. it appears at 4 points (virtual nodes) on the continuum; note this is far fewer than the 100-200 points the original post recommends. The mapping is also kept sorted by key.
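To get closer to the 100-200 points per server the original post calls for, one could hash several suffixed copies of each server string. A minimal sketch under that assumption (the "-0", "-1" suffix scheme here is my invention, not libketama's exact format):

```scala
import java.security.MessageDigest
import scala.collection.immutable.{SortedMap, TreeMap}

object KetamaReplicas {
  def md5(str: String): Array[Byte] =
    MessageDigest.getInstance("MD5").digest(str.getBytes)

  def digestToLong(bytes: Array[Byte]): Long = bytes.zipWithIndex.map {
    case (n, i) => (n & 0xFF).toLong << (i * 8)
  }.reduceLeft(_ | _)

  // Hash `replicas` suffixed copies of each server string; each MD5
  // digest still contributes 4 points, so a server gets replicas * 4
  // points on the continuum (duplicate keys, if any, are collapsed)
  def setUpServers(servers: List[String],
      replicas: Int = 40): SortedMap[Long, String] = TreeMap {
    servers.flatMap(s =>
      (0 until replicas).toList.flatMap(i =>
        md5(s + "-" + i).grouped(4).toList.map(subHash =>
          (digestToLong(subHash), s)))): _*
  }

  def main(args: Array[String]): Unit = {
    val m = setUpServers(List("1.1.1.1", "2.2.2.2", "3.3.3.3"))
    println(m.size) // up to 3 * 40 * 4 = 480 points, minus any collisions
  }
}
```

More points smooth out the distribution, so each server receives a closer-to-equal share of keys, at the cost of a larger map to build and search.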

lookupCache MD5-hashes the key into 16 bytes, takes the first 4 bytes through digestToLong to get the key's hash, then finds the first server in the sorted map whose key is greater than that hash; if none exists, it wraps around to the first server.
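The linear find scans the map on every lookup; since the map is sorted, a range query can do the same positioning in logarithmic time. A sketch of an alternative lookup using iteratorFrom (note it selects the first point >= the hash, matching the spec's "next biggest" wrap-around wording, where the version above uses a strict >):

```scala
import java.security.MessageDigest
import scala.collection.immutable.{SortedMap, TreeMap}

object KetamaLookup {
  def md5(str: String): Array[Byte] =
    MessageDigest.getInstance("MD5").digest(str.getBytes)

  def digestToLong(bytes: Array[Byte]): Long = bytes.zipWithIndex.map {
    case (n, i) => (n & 0xFF).toLong << (i * 8)
  }.reduceLeft(_ | _)

  def setUpServers(servers: List[String]): SortedMap[Long, String] = TreeMap {
    servers.flatMap(s =>
      md5(s).grouped(4).toList.map(g => (digestToLong(g), s))): _*
  }

  // iteratorFrom positions at the first entry with key >= keyHash in
  // O(log n) on the underlying tree, instead of scanning with find
  def lookupCache(key: String, serverMap: SortedMap[Long, String]): String = {
    val keyHash = digestToLong(md5(key).take(4))
    val it = serverMap.iteratorFrom(keyHash)
    if (it.hasNext) it.next()._2
    else serverMap.head._2 // wrap around the continuum
  }

  def main(args: Array[String]): Unit = {
    val serverMap = setUpServers(List("1.1.1.1", "2.2.2.2", "3.3.3.3"))
    (0 to 5).foreach(n => println(lookupCache("key" + n, serverMap)))
  }
}
```

With only a dozen points the difference is negligible, but with hundreds of points per server the logarithmic query is the version worth keeping.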

That, then, is my understanding and implementation of the ketama algorithm.