One node of my ZooKeeper ensemble logged the following when it started up:
2011-02-10 15:16:24,128 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2188:FastLeaderElection@683] - Notification time out: 60000
2011-02-10 15:16:24,128 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2188:FastLeaderElection@689] - Notification: 2, 0, 1, 2, LOOKING, LOOKING, 2
2011-02-10 15:16:24,128 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (3, 2)
2011-02-10 15:16:24,129 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2188:FastLeaderElection@799] - Notification: 9, 8589934592, 3, 2, LOOKING, FOLLOWING, 0
2011-02-10 15:16:24,129 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2188:FastLeaderElection@799] - Notification: 9, 8589934592, 3, 2, LOOKING, FOLLOWING, 1
2011-02-10 15:16:24,129 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (4, 2)
2011-02-10 15:16:24,130 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (5, 2)
2011-02-10 15:16:24,131 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (6, 2)
2011-02-10 15:16:24,132 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (7, 2)
2011-02-10 15:16:24,132 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (8, 2)
2011-02-10 15:16:24,133 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (9, 2)
2011-02-10 15:16:24,134 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connec
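Two things in this log are easier to read with a little arithmetic. A ZooKeeper zxid is a 64-bit number whose high 32 bits are the epoch and whose low 32 bits are a counter. The repeated "Have smaller server identifier" message suggests that, of two servers, only the one with the larger id keeps the election connection it opens. Below is a minimal Python sketch of both ideas. It assumes that the second field of the Notification lines is a zxid and that the pair (3, 2) means (remote id, local id). The function names are my own and are not ZooKeeper APIs.

```python
EPOCH_SHIFT = 32
COUNTER_MASK = 0xFFFFFFFF


def split_zxid(zxid: int) -> tuple[int, int]:
    """Split a 64-bit zxid into (epoch, counter)."""
    return zxid >> EPOCH_SHIFT, zxid & COUNTER_MASK


def keeps_outgoing_connection(local_id: int, remote_id: int) -> bool:
    """Assumed tie-break rule: a server keeps a connection it initiated
    only if its own id is larger than the remote id. Otherwise it drops
    the connection and waits for the remote side to connect instead."""
    return local_id > remote_id


if __name__ == "__main__":
    # Value taken from the Notification lines in the log above.
    print(split_zxid(8589934592))
    # This node has id 2 and every other peer in the log has a larger id.
    print([keeps_outgoing_connection(2, peer) for peer in range(3, 10)])
```

Under these assumptions, 8589934592 is the first zxid of epoch 2 (counter 0), and node 2 dropping its connections to peers 3 through 9 is normal behaviour rather than an error in itself.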
This may be a bug in ZooKeeper.
After a lot of Googling, I found only one explanation that looked credible:
Thanks for detailed assessment, Vishal. In Step b, the fact that the process believes it is the leader is not a problem, and it happens because we queue notification messages during leader election.
The real issue is that leader code is setting the last processed zxid to the first of the new epoch even before connecting to a quorum of followers. Because the leader code sets this value before connecting to a quorum of followers (Leader.java:281) and the follower code throws an IOException (Follower.java:73) if the leader epoch is smaller, we have that when the false leader drops leadership and becomes a follower, it finds a smaller epoch and kills itself.
I noticed that this follower check was not there before (not present in 3.0 branch), and it might have been introduced when we did the observer reorganization. For now I propose that we move line Leader.java:281 to Leader.java:470. It simply changes the point in which we set the last processed zxid to one in which we know that a quorum of followers supports the leader. I reasoned a bit about it and verified that tests pass.
A patch for the change I'm proposing is trivial, but a unit test will require some work, so I'd rather hear opinions first. Also, please note that this problem is not related to the topic of this jira, so we might consider working on a different jira from this point on.
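To make the quoted explanation concrete for myself, here is a toy Python model of the failure. It is not ZooKeeper's code: the `Peer` class, its method names, and the exception are hypothetical, and the model only mimics the ordering described in the quote (bump the zxid to the new epoch before versus after a quorum is known).

```python
EPOCH_SHIFT = 32


class LeaderEpochTooLow(Exception):
    """Stands in for the IOException the follower code is said to throw."""


def epoch_of(zxid: int) -> int:
    return zxid >> EPOCH_SHIFT


def first_zxid_of_next_epoch(zxid: int) -> int:
    return (epoch_of(zxid) + 1) << EPOCH_SHIFT


class Peer:
    def __init__(self, last_zxid: int) -> None:
        self.last_zxid = last_zxid

    def lead_buggy(self, has_quorum: bool) -> bool:
        # Order described in the quote: the zxid is moved to the new
        # epoch before a quorum of followers has connected.
        self.last_zxid = first_zxid_of_next_epoch(self.last_zxid)
        return has_quorum

    def lead_fixed(self, has_quorum: bool) -> bool:
        # Proposed order: move the zxid only once a quorum is known.
        if not has_quorum:
            return False
        self.last_zxid = first_zxid_of_next_epoch(self.last_zxid)
        return True

    def follow(self, leader_zxid: int) -> None:
        # Follower-side check: refuse a leader with a smaller epoch.
        if epoch_of(leader_zxid) < epoch_of(self.last_zxid):
            raise LeaderEpochTooLow("epoch of leader is lower")


if __name__ == "__main__":
    real_leader_zxid = 2 << EPOCH_SHIFT

    fixed = Peer(real_leader_zxid)
    fixed.lead_fixed(has_quorum=False)
    fixed.follow(real_leader_zxid)  # accepted, epochs match

    buggy = Peer(real_leader_zxid)
    buggy.lead_buggy(has_quorum=False)
    try:
        buggy.follow(real_leader_zxid)
    except LeaderEpochTooLow as exc:
        print("false leader cannot rejoin:", exc)
```

In this model the false leader ends up one epoch ahead of the real leader, so the follower check rejects the real leader. With the reordered version, the peer's epoch is unchanged and it rejoins normally.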
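However, I am running Cloudera's ZooKeeper distribution, so I cannot simply move to the latest release (CDH3 does not ship the newest ZooKeeper version).
In the end, my fix was:
Restart the system...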