Some time ago I worked on a project which was in need of a way to cluster databases. For those of you who don’t exactly know what database clustering is: database clustering is a way to have multiple databases work together to act like a single database. A cluster of databases typically has the following benefits:
- A cluster has a query throughput that is much higher than just a single database.
- A database in the cluster can fail without losing access to the data.
The result is that you now have access to a ‘database’ that is never down and can handle a lot more queries than a single database could. We didn’t care much about the higher query throughput but were very interested in a database that would always be available.
After some research I found out that there were a lot of products available to accomplish this.
We decided to use HA-JDBC (= High Availability Java DataBase Connectivity), an open source Java framework that offers database clustering. In this blog entry, I will tell you about my experiences with this relatively unknown framework, and hopefully share experiences with people who also use it. But first, I will try to explain how this framework does its job.
HOW HA-JDBC WORKS
Normally when you connect to a database, you use a JDBC connection. HA-JDBC wraps one or more of these connections and acts as a proxy in your Java code. Your Java code interacts only with the proxy, which transparently communicates with multiple databases.
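To make the proxy idea concrete, here is a minimal sketch that uses a JDK dynamic proxy to forward every call to several targets. It is only a conceptual illustration and not how HA-JDBC is actually implemented: `KeyValueStore`, `InMemoryStore` and `FanOutProxy` are hypothetical names, and the in-memory stores stand in for real JDBC connections.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a JDBC connection: a tiny key-value API.
interface KeyValueStore {
    void put(String key, String value);
    String get(String key);
}

// Hypothetical in-memory "database".
class InMemoryStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

// Builds a proxy that forwards each call to all targets
// and returns the result produced by the first target.
final class FanOutProxy {
    private FanOutProxy() {}

    static KeyValueStore wrap(List<KeyValueStore> targets) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object first = null;
            boolean haveResult = false;
            for (KeyValueStore target : targets) {
                Object result;
                try {
                    result = method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow the target's own exception
                }
                if (!haveResult) {
                    first = result;
                    haveResult = true;
                }
            }
            return first;
        };
        return (KeyValueStore) Proxy.newProxyInstance(
                KeyValueStore.class.getClassLoader(),
                new Class<?>[] {KeyValueStore.class},
                handler);
    }
}
```

Calling `FanOutProxy.wrap(...)` with two stores and then `put("k", "v")` on the result writes to both stores, while the caller only ever sees a single `KeyValueStore`. A real cluster proxy would presumably send reads to just one database instead of all of them; the sketch forwards everything for brevity.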
In order for HA-JDBC to know which connections to proxy, it needs an XML configuration file. This file defines how the cluster is configured. It defines the information needed to connect to the databases, but also the behaviour of the cluster itself. Below you can read about the experiences I’ve had with a few different aspects of HA-JDBC.
SYNCHRONIZATION
One aspect of HA-JDBC is synchronization. Synchronization in HA-JDBC is the process of making an out-of-date database up-to-date again by comparing its data with other databases in the cluster. A database can get out-of-date when it gets turned off for some reason. It could have died on its own or someone could have turned it off on purpose, but in any case: it needs synchronization to have its data up-to-date again. When a database goes down, users of the application will not notice. They can continue using the application like nothing is going on because other databases of the cluster are still available.
When a database is down, it will rapidly get out-of-date because it no longer processes any updates. When the database is started again, HA-JDBC will pick this up and execute the synchronization strategy that is configured in the XML configuration file. The downside is that the built-in strategies of HA-JDBC are not very efficient. These are the ones packaged with HA-JDBC:
- FullSynchronizationStrategy: Deletes the content of all tables of the database that is being updated and fills them again with data from a different (up-to-date) database in the cluster.
- DifferentialSynchronizationStrategy: Compares all rows of the out-of-date database with a different (up-to-date) database in the cluster to find out which rows need to be updated, inserted or deleted.
- PassiveSynchronizationStrategy: Assumes no updates have taken place during the downtime and therefore does nothing.
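The difference between the full and the differential approach can be sketched on in-memory tables. This is a simplified illustration under my own assumptions, not the code of the built-in strategies: `TableSync` is a hypothetical class, and each table is modelled as a map from primary key to a non-null row value.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: a table is a map from primary key to a non-null row value.
final class TableSync {
    private TableSync() {}

    // Full approach: wipe the stale table and copy every row again.
    static void full(Map<Integer, String> fresh, Map<Integer, String> stale) {
        stale.clear();
        stale.putAll(fresh);
    }

    // Differential approach: touch only rows that differ.
    // Returns the number of rows deleted, inserted or updated.
    static int differential(Map<Integer, String> fresh, Map<Integer, String> stale) {
        // Delete rows that no longer exist in the up-to-date table.
        int before = stale.size();
        stale.keySet().retainAll(fresh.keySet());
        int changes = before - stale.size();

        // Insert missing rows and update rows whose value differs.
        for (Map.Entry<Integer, String> row : fresh.entrySet()) {
            if (!Objects.equals(stale.get(row.getKey()), row.getValue())) {
                stale.put(row.getKey(), row.getValue());
                changes++;
            }
        }
        return changes;
    }
}
```

Both methods end with identical tables. The full variant rewrites every row, while the differential variant writes less but still has to read and compare every row, which hints at why both get slow on large tables.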
A nice thing about HA-JDBC is that you can implement your own synchronization strategy. Since the built-in strategies can take hours to complete on large tables, we decided to write one ourselves. Our strategy requires tables to have a timestamp column for versioning and does not support deletes, but it turned out to be a lot faster than the built-in strategies of HA-JDBC.
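The idea behind a timestamp-based strategy can be sketched as follows. This is not the strategy we actually wrote, and it does not use the HA-JDBC strategy interface. It is a hypothetical in-memory model (`VersionedRow` and `TimestampSync` are made-up names) which assumes that every row carries a last-modified timestamp and that the stale database received every change up to its own newest timestamp.

```java
import java.util.Map;

// Hypothetical row: a value plus a last-modified timestamp (epoch millis).
final class VersionedRow {
    final String value;
    final long modifiedAt;

    VersionedRow(String value, long modifiedAt) {
        this.value = value;
        this.modifiedAt = modifiedAt;
    }
}

final class TimestampSync {
    private TimestampSync() {}

    // Copies only rows modified after the newest timestamp found in the stale table.
    // Rows deleted from `fresh` are NOT removed from `stale`.
    static int synchronize(Map<Integer, VersionedRow> fresh,
                           Map<Integer, VersionedRow> stale) {
        // High-water mark: the newest change the stale table knows about.
        long highWaterMark = Long.MIN_VALUE;
        for (VersionedRow row : stale.values()) {
            highWaterMark = Math.max(highWaterMark, row.modifiedAt);
        }

        int copied = 0;
        for (Map.Entry<Integer, VersionedRow> entry : fresh.entrySet()) {
            if (entry.getValue().modifiedAt > highWaterMark) {
                stale.put(entry.getKey(), entry.getValue());
                copied++;
            }
        }
        return copied;
    }
}
```

Against a real database the loop would typically become a query that selects only rows newer than the high-water mark, so the work grows with the number of changed rows rather than with the table size. The sketch also makes the missing delete support visible: a row that was removed from the up-to-date table simply stays in the stale one.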
ID GENERATION
When using a single database you can let the database itself generate IDs for records you insert. When using multiple databases with HA-JDBC you can still do that, but there is no guarantee that all databases in the cluster will generate the same ID. When a different ID is generated in each database, this will leave your cluster in an invalid state because now all the databases in this cluster contain different data.
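One conceivable way for generated IDs to diverge is sketched below, assuming that two concurrent inserts happen to reach the databases in a different order. `AutoIncrementDb` is a hypothetical in-memory stand-in for a table with an auto-increment key, not anything provided by HA-JDBC.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical database table with an auto-increment primary key.
class AutoIncrementDb {
    private final Map<String, Long> idByName = new HashMap<>();
    private long nextId = 1;

    // Assigns the next free ID to the inserted row, as a database would.
    long insert(String name) {
        long id = nextId++;
        idByName.put(name, id);
        return id;
    }

    // Looks up the ID of a previously inserted row.
    long idOf(String name) {
        return idByName.get(name);
    }
}
```

If one instance receives "alice" and then "bob" while a second instance receives "bob" and then "alice", the row for "alice" gets ID 1 in the first database and ID 2 in the second, so the two copies no longer hold the same data.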
Of course there is a solution to this problem, but it isn’t pretty. When using an ORM tool like Hibernate, you can specify a generator for the ID field. By default Hibernate makes the database responsible for generating IDs, which is not what we want. When using HA-JDBC you should use one of the following generators:
- UUID-generator
- HiLo-generator.
Neither of these generators depends on an individual database, which is just what we need, but they do not produce conventional IDs. For example, the UUID generator generates IDs like ‘4028828d-0dc7f2a2-010d-c7f2a4d3-0013’. This value is based on the current timestamp and the IP address of the machine the application runs on. The HiLo generator generates ordinary numbers, but they do not increase sequentially the way you might be used to. The first number it generates could be 432 and the next one 33200, which is not what you would expect from IDs.
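To show why such IDs look unusual, here is a minimal sketch of both ideas in plain Java. It does not use Hibernate's generators: `IdExamples` and `HiLoGenerator` are hypothetical classes, the UUID comes from `java.util.UUID.randomUUID()` (a random UUID, unlike the timestamp and IP based value described above), and an `AtomicLong` stands in for the shared place where a hi/lo generator reserves its blocks.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

final class IdExamples {
    private IdExamples() {}

    // Application-side UUID: unique without asking any database.
    static String newUuid() {
        return UUID.randomUUID().toString();
    }
}

// Hypothetical hi/lo generator: reserves a block of IDs from a shared counter,
// then hands out IDs from that block without further coordination.
final class HiLoGenerator {
    private final AtomicLong sharedHi; // stand-in for the shared "next block" value
    private final int blockSize;
    private long hi;
    private int lo;

    HiLoGenerator(AtomicLong sharedHi, int blockSize) {
        this.sharedHi = sharedHi;
        this.blockSize = blockSize;
        this.lo = blockSize; // forces a block reservation on the first call
    }

    synchronized long next() {
        if (lo >= blockSize) {
            hi = sharedHi.getAndIncrement(); // reserve the next free block
            lo = 0;
        }
        return hi * blockSize + lo++;
    }
}
```

With a block size of 100 and two generators sharing one counter, the first generator returns 0, the second returns 100, and the first then continues with 1. The IDs stay unique, but they jump around depending on which generator produced them, which matches the non-sequential numbers described above.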
NON-INTRUSIVE?
When I started using HA-JDBC, I expected it to be non-intrusive to our project, because the only thing we needed to do was change our JDBC driver and write a simple XML configuration file for it. But as you have read in this blog entry, you first need to write your own synchronization strategy, and second, you probably have to switch to UUIDs. This requires a lot of refactoring all over your code because you are switching from Long-typed IDs to String-typed IDs. So in fact it influences your project more than you would expect.
CONCLUSION
In conclusion, HA-JDBC is very easy to set up and has well-written documentation on its website. It performs quite well, especially with a customized synchronization strategy. Since it delegates calls directly to the underlying JDBC drivers, it is fast and has full JDBC support. You also don’t need anything other than your database servers and your application servers. But there are a few issues with HA-JDBC that are a bit annoying: you will probably end up with UUIDs for the records in your database, and you will probably have to write your own synchronization strategy.
I was wondering if there were any other people that have some experience with this database clustering approach and would like to share their experiences. So if you have any experience with HA-JDBC, don’t hesitate to leave a comment!