An Introduction to Ketama
Ketama is an implementation of a consistent hashing algorithm, meaning you can add or remove servers from the memcached pool without causing a complete remap of all keys.
Here’s how it works:
* Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
* Hash each server string to several (100-200) unsigned ints
* Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
* Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
* To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
* If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
If you then add or remove a server from the list, only a small proportion of keys end up mapping to different servers.
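The lookup described in the steps above can be sketched with a sorted map standing in for the continuum. The ring positions and class name below are made-up illustrative values, not real Ketama hashes; the full implementation later in this post builds the ring from MD5 digests instead:

```java
import java.util.TreeMap;

public class ContinuumSketch {

    // The continuum: ring position -> server. Positions here are tiny made-up
    // numbers for illustration; real Ketama uses 32-bit hashes of server strings.
    static final TreeMap<Long, String> ring = new TreeMap<>();

    static {
        ring.put(1000L, "1.2.3.4:11211");
        ring.put(5000L, "5.6.7.8:11211");
        ring.put(9000L, "9.8.7.6:11211");
    }

    // Find the next biggest point on the continuum, wrapping around to the
    // first entry when the hash falls past the last point on the ring.
    static String locate(long keyHash) {
        Long point = ring.ceilingKey(keyHash);
        if (point == null) {
            point = ring.firstKey();
        }
        return ring.get(point);
    }

    public static void main(String[] args) {
        System.out.println(locate(1200L)); // between 1000 and 5000 -> 5.6.7.8:11211
        System.out.println(locate(9500L)); // past the last point, wraps -> 1.2.3.4:11211
    }
}
```

Adding a fourth server only inserts new points on the ring; keys whose hashes fall between unchanged points keep mapping to the same server.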
Having covered Ketama's concepts and properties above, let's skip further talk and go straight to the code.
A concrete Java implementation follows:
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public final class KetamaNodeLocator {

    /** The continuum: virtual-node hash -> physical node. */
    private TreeMap<Long, Node> ketamaNodes;
    private HashAlgorithm hashAlg;
    private int numReps = 100;

    /**
     * @param nodes      the physical server nodes
     * @param alg        the hash algorithm to use
     * @param nodeCopies the number of virtual nodes per physical node
     */
    public KetamaNodeLocator(List<Node> nodes, HashAlgorithm alg, int nodeCopies) {
        hashAlg = alg;
        ketamaNodes = new TreeMap<Long, Node>();
        numReps = nodeCopies;
        // Generate numReps virtual nodes for every physical node.
        for (Node node : nodes) {
            // Virtual nodes are created in groups of four.
            for (int i = 0; i < numReps / 4; i++) {
                // getKeyForNode produces a unique name for this group of virtual nodes.
                byte[] digest = hashAlg.computeMd5(getKeyForNode(node, i));
                /* An MD5 digest is 16 bytes long; each group of 4 bytes becomes one
                   virtual node, which is why virtual nodes come in groups of four. */
                for (int h = 0; h < 4; h++) {
                    // Each virtual node takes 4 of the 16 MD5 bytes, combined into a
                    // long that serves as its unique key on the ring.
                    long m = hashAlg.hash(digest, h);
                    ketamaNodes.put(m, node);
                }
            }
        }
    }

    /**
     * Get the primary location for the given key.
     *
     * @param k the key whose value should be fetched from a node
     * @return the node that stores the key
     */
    public Node getPrimary(final String k) {
        byte[] digest = hashAlg.computeMd5(k);
        Node rv = getNodeForKey(hashAlg.hash(digest, 0));
        return rv;
    }

    private Node getNodeForKey(long hash) {
        final Node rv;
        Long key = hash;
        if (!ketamaNodes.containsKey(key)) {
            // Take the sub-map of keys greater than the hash; its first key is the
            // nearest virtual node clockwise on the ring.
            SortedMap<Long, Node> tailMap = ketamaNodes.tailMap(key);
            if (tailMap.isEmpty()) {
                key = ketamaNodes.firstKey();
            } else {
                key = tailMap.firstKey();
            }
            // For JDK 1.6 and later:
            // key = ketamaNodes.ceilingKey(key);
            // if (key == null) {
            //     key = ketamaNodes.firstKey();
            // }
        }
        // Look up the node mapped to the chosen key and return it.
        rv = ketamaNodes.get(key);
        return rv;
    }

    /**
     * Returns a uniquely identifying key, suitable for hashing by the
     * KetamaNodeLocator algorithm.
     *
     * @param node       the Node to use to form the unique identifier
     * @param repetition the repetition number for the particular node in
     *                   question (0 is the first repetition)
     * @return the key that represents the specific repetition of the node
     */
    private String getKeyForNode(Node node, int repetition) {
        return node.getName() + repetition;
    }
}
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public enum HashAlgorithm {

    /**
     * MD5-based hash algorithm used by ketama.
     */
    KETAMA_HASH;

    public long hash(byte[] digest, int nTime) {
        long rv = ((long) (digest[3 + nTime * 4] & 0xFF) << 24)
                | ((long) (digest[2 + nTime * 4] & 0xFF) << 16)
                | ((long) (digest[1 + nTime * 4] & 0xFF) << 8)
                | (digest[0 + nTime * 4] & 0xFF);
        return rv & 0xffffffffL; /* Truncate to 32 bits */
    }

    /**
     * Get the MD5 of the given key.
     */
    public byte[] computeMd5(String k) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException("MD5 not supported", e);
        }
        md5.reset();
        byte[] keyBytes = null;
        try {
            keyBytes = k.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException("Unknown string :" + k, e);
        }
        md5.update(keyBytes);
        return md5.digest();
    }
}
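To see why virtual nodes come in groups of four, the standalone snippet below (the class and method names are mine, chosen for illustration) slices one 16-byte MD5 digest into four little-endian 32-bit values, the same byte arithmetic that `HashAlgorithm.hash` performs:

```java
import java.security.MessageDigest;

public class Md5SliceDemo {

    // Same byte arithmetic as HashAlgorithm.hash: group h of 4 digest bytes,
    // assembled little-endian into an unsigned 32-bit value.
    static long slice(byte[] digest, int h) {
        return (((long) (digest[3 + h * 4] & 0xFF) << 24)
              | ((long) (digest[2 + h * 4] & 0xFF) << 16)
              | ((long) (digest[1 + h * 4] & 0xFF) << 8)
              | (digest[h * 4] & 0xFF)) & 0xffffffffL;
    }

    // Hash a virtual-node group name and return its four ring positions.
    static long[] slices(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key.getBytes("UTF-8"));
            long[] out = new long[4];
            for (int h = 0; h < 4; h++) {
                out[h] = slice(digest, h);
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // "node10" stands for a hypothetical group name (node name + repetition)
        for (long position : slices("node10")) {
            System.out.println(position);
        }
    }
}
```

One MD5 computation therefore yields four ring positions, so 160 virtual nodes per server cost only 40 MD5 digests.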
public class Node {

    private String name;

    public Node(String name) {
        this.name = name;
    }

    /**
     * Typically the machine's ip + port.
     */
    public String getName() {
        return name;
    }

    @Override
    public String toString() {
        return "name : " + name;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) return false;
        if (!(obj instanceof Node)) return false;
        Node node = (Node) obj;
        return this.name.equals(node.getName());
    }

    // equals is overridden, so hashCode must be overridden as well to keep the
    // equals/hashCode contract: Node is used as a HashMap key in the tests below.
    @Override
    public int hashCode() {
        return name.hashCode();
    }
}
The test code follows.
Distribution-uniformity test: checks whether a large number of randomly generated keys spread evenly across the nodes.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class HashAlgorithmTest {

    static Random ran = new Random();

    /** key's count */
    private static final Integer EXE_TIMES = 100000;
    private static final Integer NODE_COUNT = 5;
    private static final Integer VIRTUAL_NODE_COUNT = 160;

    public static void main(String[] args) {
        HashAlgorithmTest test = new HashAlgorithmTest();

        /** Records how many times each node is chosen. */
        Map<Node, Integer> nodeRecord = new HashMap<Node, Integer>();

        List<Node> allNodes = test.getNodes(NODE_COUNT);
        KetamaNodeLocator locator =
                new KetamaNodeLocator(allNodes, HashAlgorithm.KETAMA_HASH, VIRTUAL_NODE_COUNT);

        // Simulate initialising all the keys.
        List<String> allKeys = test.getAllStrings();
        for (String key : allKeys) {
            Node node = locator.getPrimary(key);
            Integer times = nodeRecord.get(node);
            if (times == null) {
                nodeRecord.put(node, 1);
            } else {
                nodeRecord.put(node, times + 1);
            }
        }

        System.out.println("Nodes count : " + NODE_COUNT + ", Keys count : " + EXE_TIMES
                + ", Normal percent : " + (float) 100 / NODE_COUNT + "%");
        System.out.println("-------------------- boundary ----------------------");
        for (Map.Entry<Node, Integer> entry : nodeRecord.entrySet()) {
            System.out.println("Node name :" + entry.getKey() + " - Times : " + entry.getValue()
                    + " - Percent : " + (float) entry.getValue() / EXE_TIMES * 100 + "%");
        }
    }

    /**
     * Gets mock nodes.
     *
     * @param nodeCount the number of nodes wanted
     * @return the node list
     */
    private List<Node> getNodes(int nodeCount) {
        List<Node> nodes = new ArrayList<Node>();
        for (int k = 1; k <= nodeCount; k++) {
            nodes.add(new Node("node" + k));
        }
        return nodes;
    }

    /**
     * All the keys.
     */
    private List<String> getAllStrings() {
        List<String> allStrings = new ArrayList<String>(EXE_TIMES);
        for (int i = 0; i < EXE_TIMES; i++) {
            allStrings.add(generateRandomString(ran.nextInt(50)));
        }
        return allStrings;
    }

    /**
     * Generates a random string; chars between 32 and 126 are printable ASCII.
     */
    private String generateRandomString(int length) {
        StringBuffer sb = new StringBuffer(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (ran.nextInt(95) + 32));
        }
        return sb.toString();
    }
}
Node add/remove test: build a ring of N nodes, each with nCopies virtual nodes, generate many random keys, and measure how often the same key still maps to the same node after nodes are added or removed.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class HashAlgorithmPercentTest {

    static Random ran = new Random();

    /** key's count */
    private static final Integer EXE_TIMES = 100000;
    private static final Integer NODE_COUNT = 50;
    private static final Integer VIRTUAL_NODE_COUNT = 160;

    static List<String> allKeys = null;

    static {
        allKeys = getAllStrings();
    }

    public static void main(String[] args) {
        Map<String, List<Node>> map = generateRecord();

        List<Node> allNodes = getNodes(NODE_COUNT);
        System.out.println("Normal case : nodes count : " + allNodes.size());
        call(allNodes, map);

        allNodes = getNodes(NODE_COUNT + 8);
        System.out.println("Added case : nodes count : " + allNodes.size());
        call(allNodes, map);

        allNodes = getNodes(NODE_COUNT - 10);
        System.out.println("Reduced case : nodes count : " + allNodes.size());
        call(allNodes, map);

        int addCount = 0;
        int reduceCount = 0;
        for (Map.Entry<String, List<Node>> entry : map.entrySet()) {
            List<Node> list = entry.getValue();
            if (list.size() == 3) {
                if (list.get(0).equals(list.get(1))) {
                    addCount++;
                }
                if (list.get(0).equals(list.get(2))) {
                    reduceCount++;
                }
            } else {
                System.out.println("It's wrong size of list, key is " + entry.getKey()
                        + ", size is " + list.size());
            }
        }

        System.out.println(addCount + " --- " + reduceCount);
        System.out.println("Same percent in added case : "
                + (float) addCount * 100 / EXE_TIMES + "%");
        System.out.println("Same percent in reduced case : "
                + (float) reduceCount * 100 / EXE_TIMES + "%");
    }

    private static void call(List<Node> nodes, Map<String, List<Node>> map) {
        KetamaNodeLocator locator =
                new KetamaNodeLocator(nodes, HashAlgorithm.KETAMA_HASH, VIRTUAL_NODE_COUNT);
        for (Map.Entry<String, List<Node>> entry : map.entrySet()) {
            Node node = locator.getPrimary(entry.getKey());
            if (node != null) {
                entry.getValue().add(node);
            }
        }
    }

    private static Map<String, List<Node>> generateRecord() {
        Map<String, List<Node>> record = new HashMap<String, List<Node>>(EXE_TIMES);
        for (String key : allKeys) {
            List<Node> list = record.get(key);
            if (list == null) {
                list = new ArrayList<Node>();
                record.put(key, list);
            }
        }
        return record;
    }

    /**
     * Gets mock nodes.
     *
     * @param nodeCount the number of nodes wanted
     * @return the node list
     */
    private static List<Node> getNodes(int nodeCount) {
        List<Node> nodes = new ArrayList<Node>();
        for (int k = 1; k <= nodeCount; k++) {
            nodes.add(new Node("node" + k));
        }
        return nodes;
    }

    /**
     * All the keys.
     */
    private static List<String> getAllStrings() {
        List<String> allStrings = new ArrayList<String>(EXE_TIMES);
        for (int i = 0; i < EXE_TIMES; i++) {
            allStrings.add(generateRandomString(ran.nextInt(50)));
        }
        return allStrings;
    }

    /**
     * Generates a random string; chars between 32 and 126 are printable ASCII.
     */
    private static String generateRandomString(int length) {
        StringBuffer sb = new StringBuffer(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (ran.nextInt(95) + 32));
        }
        return sb.toString();
    }
}
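As a rough sanity check on the percentages this test prints: under ideal consistent hashing, growing the pool from N to N+k nodes should remap only about k/(N+k) of the keys, and shrinking from N to N-k should remap about k/N. A small calculation (the class and method names are mine) for the test's node counts, 50 → 58 and 50 → 40:

```java
public class RemapEstimate {

    // Expected percentage of keys that keep their node after adding nodes:
    // each of the n original nodes keeps roughly its old share of the ring.
    static double samePercentAfterAdd(int n, int added) {
        return 100.0 * n / (n + added);
    }

    // Expected percentage of keys that keep their node after removing nodes:
    // only keys on the removed nodes move.
    static double samePercentAfterRemove(int n, int removed) {
        return 100.0 * (n - removed) / n;
    }

    public static void main(String[] args) {
        // about 86.2% of keys should stay put when 8 nodes join 50
        System.out.printf("add 8 to 50 nodes: ~%.1f%% unchanged%n",
                samePercentAfterAdd(50, 8));
        // about 80.0% should stay put when 10 of 50 nodes leave
        System.out.printf("remove 10 of 50 nodes: ~%.1f%% unchanged%n",
                samePercentAfterRemove(50, 10));
    }
}
```

The "Same percent" figures the test reports should land near these estimates; large deviations would suggest the virtual nodes are not spreading the key space evenly.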