`
yaojingguo
  • 浏览: 208935 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

Consistent Hashing

阅读更多

Consistent hashing is hot since the popularity of Dynamo and its open source
implementation Cassandra. 

For a quick learning without involving too much theory, I suggest

- http://www.java.net/blog/2007/11/27/consistent-hashing
This post is written by Tom White and has a simple implementation of consistent
hashing in Java. I have add the missing part of hash function missing in the
post. Here is the source code.

import java.util.*;
import java.security.*;
import java.math.*;

class HashFunction {

  private static final long mask32 = (1l<<8) - 1;
  private static final BigInteger divider = BigInteger.valueOf(mask32);

  public int hash(Object o) {
    try {
      String str = o.toString();
      MessageDigest m = MessageDigest.getInstance("MD5");
      m.reset();
      m.update(str.getBytes("UTF-8"));
      byte[] digest = m.digest();
      BigInteger bigInt = new BigInteger(1, digest);
      return bigInt.mod(divider).intValue();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void debug() {
  }
}

public class ConsistentHash<T> {
  private final HashFunction hashFunction;
  private final int numberOfReplicas;
  private final SortedMap<Integer, T> circle = new TreeMap<Integer, T>();

  public ConsistentHash(HashFunction hashFunction, 
      int numberOfReplicas, Collection<T> nodes) {
    this.hashFunction = hashFunction;
    this.numberOfReplicas = numberOfReplicas;

    for (T node : nodes) 
      add(node);
  }

  public void add(T node) {
    for (int i = 0; i < numberOfReplicas; i++) 
      circle.put(hashFunction.hash(node.toString() + i), node);
  }

  public void remove(T node) {
    for (int i = 0; i < numberOfReplicas; i++)
      circle.remove(hashFunction.hash(node.toString() + i));
  }

  /*
   * Returns the node for the given key.
   */
  public T get(Object key) {
    if (circle.isEmpty())
      return null;

    int hash = hashFunction.hash(key);
    if (!circle.containsKey(hash)) {
      SortedMap<Integer, T> tailMap = circle.tailMap(hash);
      hash = tailMap.isEmpty() ? circle.firstKey() : tailMap.firstKey();
    }

    return circle.get(hash);
  }

  public String toString() {
    return circle.toString();
  }

  public static void main (String [] args)  {
    HashFunction hashFunc = new HashFunction();

    List<String> ls = new ArrayList<String>();
    ls.add("A");
    ls.add("B");
    ls.add("C");

    int num = Integer.parseInt(args[0]);
    ConsistentHash<String> ch = new ConsistentHash<String>(hashFunc, 
        num, 
        ls);
    System.out.println(ch);  

    ch.remove("A");
    System.out.println(ch);  
  }

}

 

- http://michaelnielsen.org/blog/consistent-hashing
This post has a good explanation why a lot of key-value will need to move to
other nodes. Before reading this post, I spent a long time figuring out the
explanation by myself. Here I give a concrete example to help me to understnad
it.

Imagine there are 3 machines used as web cache for key-value pair (k, v). The
function for allocating (k, v) is is hash(k) mode 3.

+-----------+---+---+---+----+----+----+----+
| machine-0 | 0 | 3 | 6 | 9  | 12 | 15 | 18 |
+-----------+---+---+---+----+----+----+----+
| machine-1 | 1 | 4 | 7 | 10 | 13 | 16 | 19 |
+-----------+---+---+---+----+----+----+----+
| machine-2 | 2 | 5 | 8 | 11 | 14 | 17 | 20 |
+---------------+---+---+----+----+----+----+

 

Now a new machine is added. The function is hash(k) mode 4. This new function
indicates the following key-value pair allocation among the machines.

+-----------+---+---+----+----+----+
| machine-0 | 0 | 4 | 8  | 12 | 16 |
+-----------+---+---+----+----+----+
| machine-1 | 1 | 5 | 9  | 13 | 17 |
+-----------+---+---+----+----+----+
| machine-2 | 2 | 6 | 10 | 14 | 18 |
+-----------+---+---+----+----+----+
| machine-3 | 3 | 7 | 11 | 15 | 19 |
+-----------+---+---+----+----+----+

 

 

0
0
分享到:
评论

相关推荐

    Python库 | ConsistentHashing-0.1.9.tar.gz

    而`ConsistentHashing-0.1.9.tar.gz`这个压缩包文件则包含了一个名为`ConsistentHashing`的Python库,版本为0.1.9。这个库主要涉及到一致性哈希(Consistent Hashing)算法,它在分布式系统和负载均衡中扮演着重要...

    Consistent Hashing and Random Trees

    Consistent Hashing and Random Trees

    (10)karger-Consistent Hashing.pdf

    “ConsistentHashingandRandomTrees: Distributed Caching Protocols for Relieving Hot Spots on the Worldwide Web”指的是由David Karger等人撰写的关于一致性哈希算法(Consistent Hashing)以及如何运用该算法...

    一致性Hash(Consistent Hashing)原理剖析1

    一致性哈希(Consistent Hashing)是一种用于分布式系统的哈希算法,主要应用于分布式缓存、分布式数据库等场景,目的是在节点动态增减时保持哈希表的稳定性,从而最小化数据迁移的影响。它解决了传统哈希取模方法在...

    Jump-Consistent-Hashing-Evaluation:对一致性哈希的评估,尤其是 Google 的 Jump Consistent Hashing

    跳跃一致哈希计算 甚至服务器之间的数据分布也非常重要:另一个重要方面是能够... 关于一致性哈希,使用的算法是谷歌的论文“A Fast, Minimal Memory, Consistent Hash Algorithm”中提出的Jump Consistent Hashing。

    Ketama一致性Hash算法(含Java代码) 1

    一致性哈希算法(Consistent Hashing)是一种在分布式系统中平衡数据分布的策略,尤其适用于缓存服务如Memcached或Redis。它的核心思想是通过哈希函数将对象映射到一个固定大小的环形空间中,然后将服务器也映射到这个...

    一致性哈希算法 consistent hashing

    在分布式系统中,常常需要使用缓存,而且通常是集群,访问缓存和添加缓存都需要一个 hash 算法来寻找到合适的 Cache 节点。但,通常不是用取余hash,而是使用我们今天的主角—— 一致性 hash 算法。

    consistent-hash:一致性哈希工具类 Implementing Consistent Hashing in Kotlin

    Implementing Consistent Hashing in Kotlin Java Kotlin实现的一致性哈希工具 简单示例 val a = HostPortPhysicalNode("A", "192.169.1.1", 8080) val b = HostPortPhysicalNode("B", "192.169.1.2", 8080) val c =...

    Consistent-Hashing:Consistent Hashing 一致哈希

    一致性哈希(Consistent Hashing)是一种分布式哈希算法,主要应用于分布式缓存、负载均衡等领域,例如在Redis、Memcached等系统中广泛使用。它解决了传统哈希算法在节点动态增减时导致的大量数据迁移问题。在Java中...

    Go-Consistent-hashing:Go中的散列环实现

    一致性哈希(Consistent Hashing)是一种在分布式系统中解决数据分片问题的算法,它在Go语言中的实现对于构建可扩展且容错的服务至关重要。在Go开发中,尤其是在涉及分布式缓存、负载均衡等场景下,一致性哈希能够...

    开源项目-lafikl-consistent.zip

    开源项目-lafikl-consistent.zip,lafikl/consistent: a package for Consistent Hashing and Consistent Hashing With Bounded Loads.

    consistent-hashing:一致哈希的简单JavaScript实现

    安装使用以下命令安装依赖项: $ npm install 用法var ConsistentHashing = require ( './consistent_hashing' ) ;var nodeNames = [ 'node1' , 'node2' , 'node3' , 'node4' , 'node5' , 'node6' ] ;var replica...

    09-散列3. Hashing - Hard Version (30).zip

    8. 散列在分布式系统中的应用:例如一致性哈希(Consistent Hashing),用于在分布式缓存和分布式数据库中分配节点,保持在节点增减时尽量小的影响。 9. 散列与安全性:在密码学中,散列函数常用于密码存储,通过...

    Ketama Hashing Algorithm

    Ketama Hashing算法,全称为“Consistent Hashing with Ketama”,是一种在分布式系统中实现负载均衡的哈希算法。它的主要目的是在增加或减少服务器节点时,尽可能少地改变已分配的键(keys)到服务器的映射关系。在...

    一致性哈希算法(ketama hashing)

    一致性哈希算法(Consistent Hashing)是一种在分布式系统中实现负载均衡的算法,尤其在分布式缓存如Memcached和Redis等场景下广泛使用。它解决了传统哈希算法在节点增减时导致的大量数据迁移问题,提高了系统的可用...

    分布式的Key-Value存储系统Ringo.zip

    Ringo is an experimental, distributed, replicating key-value store based on consistent hashing and immutable data. Unlike many general-purpose databases, Ringo is designed for a specific use case: For...

Global site tag (gtag.js) - Google Analytics