Availability
Availability in the context of HBase can be defined as the ability of the system to handle failures. The most common failures cause one or more nodes in the HBase cluster to drop out of the cluster and stop serving requests, whether because of failing hardware on the node or misbehaving software. Any such failure can be considered a network partition between that node and the rest of the cluster.
When a RegionServer becomes unreachable for some reason, the data it was serving needs to be served instead by some other RegionServer. HBase can do that and keep its availability high. But if a network partition separates the HBase masters from the cluster, or separates the ZooKeeper quorum from the cluster, the RegionServers can't do much on their own. This goes back to what we said earlier: availability is best defined by the kinds of failures a system can handle and the kinds it can't. It isn't a binary property, but one with various degrees.
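To make the client's dependence on these components concrete, here is a minimal sketch using the Java client API. The quorum host names, table name, and row key are placeholders; the point is that the client bootstraps through ZooKeeper and hbase:meta, so it can ride out the loss of a single RegionServer but cannot make progress if the ZooKeeper quorum itself is unreachable.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class QuorumDependentClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // The client bootstraps through ZooKeeper; if the quorum is
        // partitioned away, this connection can't be established even
        // though individual RegionServers may be perfectly healthy.
        conf.set("hbase.zookeeper.quorum", "zk1,zk2,zk3"); // hypothetical hosts

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("mytable"))) {
            // Region lookups go through hbase:meta; if the RegionServer
            // holding this row dies, the read succeeds again once the
            // region has been reassigned to another RegionServer.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println("Cells returned: " + result.size());
        }
    }
}
```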
Higher availability can be achieved through defensive deployment schemes. For instance, if you have multiple masters, keep them in different racks.
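One way to verify such a deployment is to ask the cluster which masters it currently knows about. The sketch below assumes the HBase 2.x-style Admin API; in a rack-aware deployment, the active and backup masters it prints shouldn't all resolve to hosts in the same rack.

```java
import org.apache.hadoop.hbase.ClusterMetrics;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MasterCheck {
    public static void main(String[] args) throws Exception {
        try (Connection connection =
                     ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            ClusterMetrics metrics = admin.getClusterMetrics();
            // The active master and its standbys; ideally these live in
            // different racks so a single rack failure can't take them all out.
            System.out.println("Active master: " + metrics.getMasterName());
            for (ServerName backup : metrics.getBackupMasterNames()) {
                System.out.println("Backup master: " + backup);
            }
        }
    }
}
```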
Reliability and durability
Reliability is a general term used in the context of database systems and can, in most cases, be thought of as a combination of data durability and performance guarantees. Data durability, as you can imagine, is important when you're building applications atop a database. HBase makes certain guarantees in terms of data durability by virtue of its system architecture.
HBase assumes two properties of the underlying storage that help it achieve the availability and reliability it offers to its clients.
Single namespace
HBase stores its data on a single file system. It assumes all the RegionServers have access to that file system across the entire cluster. The file system exposes a single namespace to all the RegionServers in the cluster. The data visible to and written by one RegionServer is available to all other RegionServers. This allows HBase to make availability guarantees. If a RegionServer goes down, any other RegionServer can read the data from the underlying file system and start serving the regions that the first RegionServer was serving.
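As a rough illustration, the sketch below reads the hbase.rootdir setting (assuming it is configured and points at HDFS) and lists the shared root directory through the Hadoop FileSystem API; this is the same single namespace every RegionServer reads from and writes to.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RootDirListing {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // hbase.rootdir names the shared namespace all RegionServers write
        // under, e.g. hdfs://namenode:8020/hbase (host and port are placeholders).
        Path rootDir = new Path(conf.get("hbase.rootdir"));
        FileSystem fs = rootDir.getFileSystem(conf);
        // Anything one RegionServer has written here is visible to the rest.
        for (FileStatus status : fs.listStatus(rootDir)) {
            System.out.println(status.getPath());
        }
    }
}
```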
At this point, you may be thinking that you could have network-attached storage (NAS) mounted on all the servers and store the data there. That's theoretically doable, but there are implications to every design and implementation choice. Having a NAS that all the servers read from and write to means your disk I/O is bottlenecked by the interlink between the cluster and the NAS. You can have fat interlinks, but they will still limit how far you can scale. HBase instead made a design choice to use a distributed file system and was built tightly coupled with HDFS. HDFS provides HBase with a single namespace, and the DataNodes and RegionServers are collocated in most clusters. Collocating these two processes helps in that RegionServers can read from and write to the local DataNode, saving network I/O whenever possible. There is still network I/O, but this optimization reduces its cost.
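A related optimization that builds on this collocation is HDFS short-circuit local reads, where a RegionServer reads blocks directly from local disk rather than going through the local DataNode process. The sketch below shows the standard HDFS client properties involved; they are normally set in hdfs-site.xml or hbase-site.xml rather than in code, and the socket path is a placeholder.

```java
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Normally set in hdfs-site.xml / hbase-site.xml on each node that
        // runs a collocated DataNode and RegionServer.
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        // Placeholder path for the UNIX domain socket the DataNode and the
        // HDFS client share for local reads.
        conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
        System.out.println("Short-circuit reads enabled: "
                + conf.getBoolean("dfs.client.read.shortcircuit", false));
    }
}
```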
Reliability and failure resistance
HBase assumes that the data it persists on the underlying storage system will be accessible even in the face of failures. If a server running the RegionServer goes down, other RegionServers should be able to take over the regions that were assigned to that server and begin serving requests. The assumption is that the server going down won't cause data loss on the underlying storage. A distributed file system like HDFS achieves this property by replicating the data and keeping multiple copies of it. At the same time, the performance of the underlying storage should not be impacted greatly by the loss of a small percentage of its member servers.
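As a quick illustration, you can ask HDFS how many replicas it keeps of any file HBase has written. The path below is hypothetical; with the default dfs.replication of 3, losing a single DataNode doesn't lose the data.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical HFile path under the HBase root directory.
        Path file = new Path(
                "hdfs://namenode:8020/hbase/data/default/mytable/region/cf/hfile");
        FileSystem fs = file.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(file);
        // HDFS keeps this many copies of each block of the file, so the loss
        // of one DataNode leaves the other replicas readable.
        System.out.println("Replication factor: " + status.getReplication());
    }
}
```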
Theoretically, HBase could run on top of any file system that provides these properties. But HBase is tightly coupled with HDFS and has been throughout its development. Apart from being able to withstand failures, HDFS provides certain write semantics that HBase uses to provide durability guarantees for every byte you write to it.
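For example, the client API lets you control how a write is pushed to the write-ahead log on HDFS before the call returns. Here's a minimal sketch, assuming a table named mytable with a column family cf:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DurablePut {
    public static void main(String[] args) throws Exception {
        try (Connection connection =
                     ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("mytable"))) {
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("qual"),
                    Bytes.toBytes("value"));
            // SYNC_WAL asks HBase to flush the edit to the write-ahead log on
            // HDFS before acknowledging the write, trading a little latency
            // for the durability guarantee described above.
            put.setDurability(Durability.SYNC_WAL);
            table.put(put);
        }
    }
}
```

The Durability setting is per-mutation, so an application can make this trade-off write by write rather than cluster wide.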