Berkeley DB (8) -- DB Replication (HA), Part 2



Network partitions

The Berkeley DB replication implementation can be affected by network partitioning problems.


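For example, consider a replication group with n members. The network partitions with the master on one side and more than half of the sites (more than n/2) on the other side. The sites on the master's side keep moving forward, and the master keeps accepting write requests for the databases. Unfortunately, the sites on the other side of the partition, realizing they no longer have a master, will hold an election. The election will succeed because more than n/2 of the total sites take part in it, and the replication group will then have two masters. Since both masters may accept write requests, the databases can diverge and the data can become inconsistent.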

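If multiple masters are ever found in a replication group, a master that detects the problem will return DB_REP_DUPMASTER. An application that sees this return should reconfigure itself as a client (by calling DB_ENV->rep_start) and then call for an election (by calling DB_ENV->rep_elect). The winner of the election may be one of the two previous masters, or it may be another site entirely. Either way, the winning system will bring the other systems into conformance with itself.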
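The recovery sequence described above (step down to client first, then call an election) can be sketched as follows. This is only an illustration that compiles without the Berkeley DB library: `struct site`, `enum rep_event`, `handle_rep_event` and the stub callbacks are hypothetical names, and in a real application the two callbacks would wrap DB_ENV->rep_start and DB_ENV->rep_elect on the application's environment handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a real application would hold a DB_ENV handle and
 * call DB_ENV->rep_start and DB_ENV->rep_elect instead of these callbacks. */
enum site_role { ROLE_MASTER, ROLE_CLIENT };

/* Hypothetical event code; EVENT_DUPMASTER stands in for a
 * DB_REP_DUPMASTER return seen by the application. */
enum rep_event { EVENT_NONE, EVENT_DUPMASTER };

struct site {
    enum site_role role;
    int (*become_client)(struct site *);  /* would wrap rep_start */
    int (*call_election)(struct site *);  /* would wrap rep_elect */
    int elections_called;
};

/* React to a replication event. Returns 0 on success, nonzero on failure. */
int handle_rep_event(struct site *s, enum rep_event ev)
{
    int ret;

    assert(s != NULL);
    if (ev != EVENT_DUPMASTER)
        return 0;                         /* nothing to do */

    /* Step 1: stop acting as a master before anything else. */
    if ((ret = s->become_client(s)) != 0)
        return ret;

    /* Step 2: ask the group to settle on a single master. */
    return s->call_election(s);
}

/* Stub callbacks so the sketch runs without the library. */
int stub_become_client(struct site *s) { s->role = ROLE_CLIENT; return 0; }
int stub_call_election(struct site *s) { s->elections_called++; return 0; }
```

The order matters in this sketch: the site gives up the master role before the election starts, so it cannot keep accepting writes while the group is deciding.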

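As another example, consider a replication group with a master environment and two clients, A and B, where client A may be upgraded to master status and client B may not. Suppose client A is partitioned from the other two database environments and its data becomes out of date. Then suppose the master crashes and does not come back online. Later the network partition is repaired, and clients A and B hold an election. Because client B cannot win the election, client A wins by default. To get back in sync with A, transactions that may already have been committed on B will be rolled back until the two sites can once again move forward together.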

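In both examples there is a phase in which the newly elected master brings the members of the group into conformance with itself so that it can start sending new information to them. This can lose information, because previously committed transactions are rolled back.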

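In architectures where network partitions are a concern, applications may want to implement a heartbeat protocol to minimize the consequences of a bad partition. As long as a master can contact at least half of the sites in the group, two masters cannot exist. If a master can no longer contact enough sites, it should reconfigure itself as a client and hold an election.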
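A minimal sketch of the heartbeat check described above, under stated assumptions: the application records, per peer, the time in seconds at which the last heartbeat arrived; a peer counts as reachable if that heartbeat is no older than a timeout; and the master counts itself toward the "at least half" threshold. The function names and the timestamp array are hypothetical and not part of Berkeley DB, which leaves the heartbeat protocol to the application.

```c
#include <assert.h>
#include <stddef.h>

/* Count the sites considered reachable, this site included. A peer counts
 * if its last heartbeat arrived within `timeout` seconds of `now`. */
size_t reachable_sites(const long *last_seen, size_t npeers,
                       long now, long timeout)
{
    size_t count = 1;                 /* the master always reaches itself */

    for (size_t i = 0; i < npeers; i++)
        if (now - last_seen[i] <= timeout)
            count++;
    return count;
}

/* Returns 1 if the master can reach fewer than half of the group and
 * should therefore step down and hold an election, 0 otherwise. */
int master_should_step_down(const long *last_seen, size_t npeers,
                            long now, long timeout)
{
    size_t group_size = npeers + 1;   /* peers plus the master */
    size_t reachable;

    assert(last_seen != NULL || npeers == 0);
    reachable = reachable_sites(last_seen, npeers, now, timeout);
    return 2 * reachable < group_size;
}
```

With five sites, a master that still hears from two peers stays master (3 of 5 reachable), while one that hears from only a single peer steps down (2 of 5).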

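Applications have another tool for limiting the damage from a network partition. By passing DB_ENV->rep_elect an nsites argument that is larger than the actual number of members in the group, an application can keep systems from declaring themselves master unless they can talk to a large share of the sites in the group. For example, if the group contains 20 database environments and 30 is passed as nsites to DB_ENV->rep_elect, a system must be able to talk to at least 16 sites before it can declare itself master.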

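Passing DB_ENV->rep_elect an nsites argument that is smaller than the actual number of members in the group has its uses as well. For example, consider a group with only two database environments. If they are partitioned from each other, neither can collect enough votes to become master. A reasonable alternative is to give one system an nsites value of 2 and the other an nsites value of 1. That way, one of the systems can win an election while partitioned and the other cannot, which lets one system keep accepting write requests during the partition.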
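The nsites arithmetic in the two paragraphs above can be checked with a small sketch. It assumes that winning requires a strict majority of nsites (nsites / 2 + 1 votes, using integer division), which agrees with the 30 to 16 example in the text; `votes_needed` and `can_win_election` are hypothetical helper names, not Berkeley DB functions.

```c
#include <assert.h>

/* Votes needed to win, assuming a strict majority of nsites.
 * This matches the example in the text: nsites = 30 requires 16. */
int votes_needed(int nsites)
{
    assert(nsites > 0);
    return nsites / 2 + 1;
}

/* Can a site that reaches `reachable` sites (itself included) win an
 * election that was configured with `nsites`? Returns 1 or 0. */
int can_win_election(int reachable, int nsites)
{
    return reachable >= votes_needed(nsites);
}
```

For the two-site group, a partitioned site reaches only itself: `can_win_election(1, 2)` is 0 for the site configured with nsites 2, and `can_win_election(1, 1)` is 1 for the site configured with nsites 1, so exactly one of them can become master.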

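These scenarios underline how important good network infrastructure is in a Berkeley DB replicated environment. When database environments are replicated over a sufficiently lossy network, the best solution may be to pick a single master and hold an election only when human intervention has determined that the chosen master cannot be brought back online.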

Replication FAQ


Does Berkeley DB provide support for forwarding write queries from clients to masters?
No, it does not. The Berkeley DB RPC server code could be modified to support this functionality, but in general this protocol is left entirely to the application. Note, there is no reason not to use the communications channels the application establishes for replication support to forward database update messages to the master, since Berkeley DB does not require those channels to be used exclusively for replication messages.


Can I use replication to partition my environment across multiple sites?
No, this is not possible. All replicated databases must be equally shared by all environments in the replication group.


I'm running with replication but I don't see my databases on the client.
This problem may be the result of the application using absolute path names for its databases when those path names are not valid on the client system.


How can I distinguish Berkeley DB messages from application messages?
There is no way to distinguish Berkeley DB messages from application-specific messages, nor does Berkeley DB offer any way to wrap application messages inside of Berkeley DB messages. Distributed applications exchanging their own messages should either enclose Berkeley DB messages in their own wrappers, or use separate network connections to send and receive Berkeley DB messages. The one exception to this rule is connection information for new sites; Berkeley DB offers a simple method for sites joining replication groups to send connection information to the other database environments in the group (see Connecting to a new site for more information).


How should I build my send function?
This depends on the specifics of the application. One common way is to write the rec and control arguments' sizes and data to a socket connected to each remote site. On a fast, local area net, the simplest method is likely to be to construct broadcast messages. Each Berkeley DB message would be encapsulated inside an application specific message, with header information specifying the intended recipient(s) for the message. This will likely require a global numbering scheme, however, as the Berkeley DB library has to be able to send specific log records to clients apart from the general broadcast of new log records intended for all members of a replication group.
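One way to picture "write the rec and control arguments' sizes and data to a socket" is a length-prefixed frame. The sketch below builds and parses such a frame in memory; `struct blob` is a hypothetical stand-in for the data and size pair that each argument carries, and the wire layout (a 4-byte big-endian size before each part) is an assumption chosen for illustration, not a format Berkeley DB defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data/size pair of control and rec. */
struct blob {
    const void *data;
    uint32_t size;
};

/* Store and load a 32-bit value in big-endian byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Frame layout: [control size][control bytes][rec size][rec bytes].
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t frame_message(unsigned char *out, size_t cap,
                     const struct blob *control, const struct blob *rec)
{
    size_t need = 4 + (size_t)control->size + 4 + (size_t)rec->size;
    size_t off = 0;

    if (need > cap)
        return 0;
    put_u32(out + off, control->size);
    off += 4;
    if (control->size != 0)
        memcpy(out + off, control->data, control->size);
    off += control->size;
    put_u32(out + off, rec->size);
    off += 4;
    if (rec->size != 0)
        memcpy(out + off, rec->data, rec->size);
    return need;
}

/* Parse a frame. The output blobs point into `buf`.
 * Returns 0 on success, -1 if the frame is truncated. */
int parse_message(const unsigned char *buf, size_t len,
                  struct blob *control, struct blob *rec)
{
    struct blob *parts[2];
    size_t off = 0;

    parts[0] = control;
    parts[1] = rec;
    for (int i = 0; i < 2; i++) {
        uint32_t n;

        if (len - off < 4)
            return -1;
        n = get_u32(buf + off);
        off += 4;
        if (len - off < n)
            return -1;
        parts[i]->data = buf + off;
        parts[i]->size = n;
        off += n;
    }
    return 0;
}
```

A send function built this way would write the framed buffer to the socket of each intended recipient, and the receiving side would parse the frame before handing the two parts to the replication message processing call.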


Does every one of my threads of control on the master have to set up its own connection to every client? And, does every one of my threads of control on the client have to set up its own connection to every master?
This is not always necessary. In the Berkeley DB replication model, any thread of control which modifies a database in the master environment must be prepared to send a message to the client environments, and any thread of control which delivers a message to a client environment must be prepared to send a message to the master. There are many ways in which these requirements can be satisfied.

The simplest case is probably a single, multithreaded process running on the master and clients. The process running on the master would require a single write connection to each client and a single read connection from each client. A process running on each client would require a single read connection from the master and a single write connection to the master. Threads running in these processes on the master and clients would use the same network connections to pass messages back and forth.

A common complication is when there are multiple processes running on the master and clients. A straightforward solution is to increase the number of connections on the master -- each process running on the master has its own write connection to each client. However, this requires only one additional connection for each possible client in the master process. The master environment still requires only a single read connection from each client (this can be done by allocating a separate thread of control which does nothing other than receive client messages and forward them into the database). Similarly, each client still requires only a single thread of control that receives master messages and forwards them into the database, and which also takes database messages and forwards them back to the master. This model requires that the networking infrastructure support many-to-one writers-to-readers, of course.

If the number of network connections is a problem in the multiprocess model, and inter-process communication on the system is inexpensive enough, an alternative is to have a single process which communicates between the master and each client. Whenever a process's send function is called, the process passes the message to the communications process, which is responsible for forwarding the message to the appropriate client. Alternatively, a broadcast mechanism will simplify the entire networking infrastructure, as processes will likely no longer have to maintain their own specific network connections.