sunwinner

Availability and Reliability with HBase

 

Availability

Availability in the context of HBase can be defined as the ability of the system to handle failures. The most common failures cause one or more nodes to drop out of the HBase cluster and stop serving requests. This can happen because hardware on the node fails or because the software misbehaves. Any such failure can be treated as a network partition between that node and the rest of the cluster.

 

When a RegionServer becomes unreachable for some reason, the data it was serving must be served by another RegionServer instead. HBase can do that and keep its availability high. But if a network partition separates the HBase masters or the ZooKeepers from the rest of the cluster, the slaves can’t do much on their own. This goes back to what we said earlier: availability is best defined by the kinds of failures a system can handle and the kinds it can’t. It isn’t a binary property, but one with various degrees.
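The idea that availability comes in degrees can be made concrete with a small toy model. The sketch below is not HBase code; the class and function names are hypothetical. It assumes only what the paragraph above states, plus the fact that a ZooKeeper ensemble needs a majority of its members to be reachable in order to operate.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Simplified, hypothetical view of what is still reachable."""
    live_region_servers: int
    master_reachable: bool
    zookeepers_reachable: int
    zookeeper_ensemble_size: int


def can_recover_regions(state: ClusterState) -> bool:
    """Return True if, in this toy model, the regions of a failed
    RegionServer could be handed to another RegionServer."""
    # A ZooKeeper ensemble works only while a strict majority is reachable.
    has_quorum = state.zookeepers_reachable > state.zookeeper_ensemble_size // 2
    # Someone must coordinate the reassignment, and someone must take the regions.
    return state.master_reachable and has_quorum and state.live_region_servers >= 1


# One RegionServer out of five is lost: the failure can be handled.
handled = can_recover_regions(ClusterState(4, True, 3, 3))
# The masters are partitioned away: the remaining nodes cannot recover on their own.
not_handled = can_recover_regions(ClusterState(4, False, 3, 3))
```

The same cluster is therefore "available" with respect to one kind of failure and not with respect to another, which is why the property is better described by a list of tolerated failures than by a yes/no answer.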

 

Higher availability can be achieved through defensive deployment schemes. For instance, if you have multiple masters, keep them in different racks. 

 

Reliability and durability

In the context of a database system, reliability is a general term that in most cases can be thought of as a combination of data durability and performance guarantees. Data durability, as you can imagine, is important when you’re building applications atop a database. HBase offers certain data durability guarantees by virtue of its system architecture.

 

HBase assumes two properties of the underlying storage that help it achieve the availability and reliability it offers to its clients.

 

Single namespace

HBase stores its data on a single file system. It assumes all the RegionServers have access to that file system across the entire cluster. The file system exposes a single namespace to all the RegionServers in the cluster. The data visible to and written by one RegionServer is available to all other RegionServers. This allows HBase to make availability guarantees. If a RegionServer goes down, any other RegionServer can read the data from the underlying file system and start serving the regions that the first RegionServer was serving.
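A minimal sketch of why a single namespace matters for failover follows. It is a toy model, not the HBase or HDFS API: `SharedFileSystem`, `RegionServer`, and the `/hbase-sketch/` path prefix are all hypothetical names chosen for illustration.

```python
class SharedFileSystem:
    """Stand-in for a file system whose single namespace every server sees."""

    def __init__(self):
        self.files = {}  # path -> list of (row, value) records

    def append(self, path, record):
        self.files.setdefault(path, []).append(record)

    def read(self, path):
        return list(self.files.get(path, []))


class RegionServer:
    """Toy server that keeps no region data of its own, only in the shared store."""

    def __init__(self, name, fs):
        self.name = name
        self.fs = fs
        self.regions = set()

    def open_region(self, region):
        self.regions.add(region)

    def _path(self, region):
        if region not in self.regions:
            raise KeyError(f"{self.name} does not serve {region}")
        return f"/hbase-sketch/{region}"

    def put(self, region, row, value):
        self.fs.append(self._path(region), (row, value))

    def get(self, region, row):
        # The most recent record for the row wins.
        for stored_row, value in reversed(self.fs.read(self._path(region))):
            if stored_row == row:
                return value
        return None


fs = SharedFileSystem()
rs1 = RegionServer("rs1", fs)
rs2 = RegionServer("rs2", fs)

rs1.open_region("region-a")
rs1.put("region-a", "row1", "v1")

# rs1 goes down; rs2 can take over because it sees the same namespace.
rs2.open_region("region-a")
recovered = rs2.get("region-a", "row1")
```

Because the data written by `rs1` lives in the shared store rather than on `rs1` itself, `rs2` needs nothing from the failed server to start serving the region.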

 

At this point, you may be thinking that you could mount network-attached storage (NAS) on all the servers and store the data there. That’s theoretically doable, but every design and implementation choice has implications. Having a NAS that all the servers read from and write to means your disk I/O will be bottlenecked by the interlink between the cluster and the NAS. You can have fat interlinks, but they will still limit how far you can scale. HBase made a design choice to use distributed file systems instead and was built tightly coupled with HDFS. HDFS provides HBase with a single namespace, and the DataNodes and RegionServers are collocated in most clusters. Collocating these two processes helps because RegionServers can read from and write to the local DataNode, thereby saving network I/O whenever possible. There is still network I/O, but this optimization reduces the costs.
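The scaling argument can be illustrated with back-of-the-envelope arithmetic. All numbers below are assumptions picked for illustration (a 10 Gbit/s interlink, roughly 1250 MB/s, and four local disks per server at 100 MB/s each); they are not measurements, and the function names are hypothetical.

```python
def nas_per_server_mb_s(interlink_mb_s, servers):
    """Throughput each server gets when all I/O shares one interlink."""
    return interlink_mb_s / servers


def local_aggregate_mb_s(disk_mb_s, disks_per_server, servers):
    """Aggregate throughput when every server reads its own local disks."""
    return disk_mb_s * disks_per_server * servers


INTERLINK_MB_S = 1250   # assumed: 10 Gbit/s link to the NAS
DISK_MB_S = 100         # assumed: sequential throughput of one disk
DISKS_PER_SERVER = 4    # assumed

for servers in (10, 50, 200):
    shared = nas_per_server_mb_s(INTERLINK_MB_S, servers)
    local = local_aggregate_mb_s(DISK_MB_S, DISKS_PER_SERVER, servers)
    print(f"{servers:>4} servers: NAS total {INTERLINK_MB_S} MB/s "
          f"({shared:.2f} MB/s each), local disks total {local} MB/s")
```

Under these assumptions the NAS total stays fixed as servers are added, so each server's share shrinks, while the aggregate throughput of local disks grows with the cluster. Real clusters also pay for replication traffic, which this sketch ignores.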

 

Reliability and failure resistance

HBase assumes that the data it persists on the underlying storage system will be accessible even in the face of failures. If a server running the RegionServer goes down, other RegionServers should be able to take up the regions that were assigned to that server and begin serving requests. The assumption is that the server going down won’t cause data loss on the underlying storage. A distributed file system like HDFS achieves this property by replicating the data and keeping multiple copies of it. At the same time, the performance of the underlying storage should not be impacted greatly by the loss of a small percentage of its member servers.
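The effect of replication can be sketched with a toy placement model. This is not how HDFS actually chooses DataNodes (its placement policy takes racks into account); the sketch simply puts each block on distinct, randomly chosen nodes, and the function names are hypothetical. The default of three copies matches the HDFS default replication factor.

```python
import random


def place_blocks(num_blocks, nodes, replication=3, seed=0):
    """Assign every block to `replication` distinct nodes (toy placement)."""
    rng = random.Random(seed)
    return {block: set(rng.sample(nodes, replication)) for block in range(num_blocks)}


def readable_blocks(placement, failed_nodes):
    """Blocks that still have at least one replica on a live node."""
    return {block for block, replicas in placement.items() if replicas - failed_nodes}


nodes = [f"dn{i}" for i in range(10)]
placement = place_blocks(100, nodes, replication=3)

# With three copies on distinct nodes, losing any one or two nodes loses no block.
survives_one_failure = all(
    len(readable_blocks(placement, {node})) == len(placement) for node in nodes
)
```

A block becomes unreadable only when every node holding one of its replicas is down at the same time, which is why losing a small fraction of the servers should not cause data loss.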

 

Theoretically, HBase could run on top of any file system that provides these properties. But HBase is tightly coupled with HDFS and has been throughout its development. Apart from being able to withstand failures, HDFS provides certain write semantics that HBase uses to provide durability guarantees for every byte you write to it.
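The passage does not spell out which write semantics are meant. As one illustration of the general idea, the sketch below shows a write-ahead pattern on a local file: a write is acknowledged only after it has been appended to a log and synced, so it can be replayed after a crash. This is a toy under stated assumptions, not HBase's implementation; `TinyLoggedStore` is a hypothetical name, a local file with `os.fsync` stands in for the distributed file system, and keys and values must not contain tabs or newlines.

```python
import os
import tempfile


class TinyLoggedStore:
    """Toy key-value store that logs every write before acknowledging it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memory = {}
        self._replay()
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state from whatever reached the log.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                key, _, value = line.rstrip("\n").partition("\t")
                self.memory[key] = value

    def put(self, key, value):
        self.log.write(f"{key}\t{value}\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # force the record to stable storage first
        self.memory[key] = value     # only then apply it in memory

    def close(self):
        self.log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "wal.log")
        store = TinyLoggedStore(path)
        store.put("row1", "v1")
        # Simulate a crash: open a fresh store without a clean shutdown.
        restarted = TinyLoggedStore(path)
        print(restarted.memory)
        store.close()
        restarted.close()
```

The design choice worth noting is the ordering inside `put`: the log record is made durable before the in-memory state changes, so anything the caller was told succeeded can be recovered from the log.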

 

From HBase in Action
