Apache ZooKeeper is a highly reliable and available service that provides coordination between distributed processes.
Upgrading ZooKeeper to the Latest CDH3 Release
Cloudera recommends that you use a rolling upgrade process to upgrade ZooKeeper: that is, upgrade one server in the ZooKeeper ensemble at a time. This means bringing down each server in turn, upgrading the software, then restarting the server. The server will automatically rejoin the quorum, update its internal state with the current ZooKeeper leader, and begin serving client sessions.
This method allows you to upgrade ZooKeeper without any interruption in the service. It also lets you monitor the ensemble as the upgrade progresses, and roll back if you run into problems.
The instructions that follow assume that you are upgrading ZooKeeper as part of an upgrade to the latest CDH3 release, and have already performed the steps under Upgrading CDH3.
Performing a ZooKeeper Rolling Upgrade
Follow these steps to perform a rolling upgrade.
Step 1: Stop the ZooKeeper Server on the First Node
To stop the ZooKeeper server:
$ sudo /sbin/service hadoop-zookeeper-server stop
or
$ sudo /sbin/service hadoop-zookeeper stop
depending on the platform and release.
Step 2: Install the ZooKeeper Base Package on the First Node
See Installing the ZooKeeper Base Package.
Step 3: Install the ZooKeeper Server Package on the First Node
See Installing the ZooKeeper Server Package.
Step 4: Re-enable the Server
Because of a packaging problem in earlier releases, you need to re-enable the server manually after upgrading ZooKeeper from CDH3 Update 1 or earlier to the latest CDH3 release:
$ sudo /sbin/chkconfig --add hadoop-zookeeper-server
Step 5: Restart the Server
See Installing the ZooKeeper Server Package for instructions on starting the server.
The upgrade is now complete on this server and you can proceed to the next.
Step 6: Upgrade the Remaining Nodes
Repeat Steps 1-5 above on each of the remaining nodes.
The ZooKeeper upgrade is now complete.
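The per-node sequence in Steps 1-5 can be written out as a dry-run plan before anything is executed. The following is only a sketch under stated assumptions: the function name `print_upgrade_plan` and the host names are hypothetical, SSH access with sudo rights is assumed, and the `yum` variants of the install commands are shown (substitute `apt-get` or `zypper` as appropriate). It prints commands and does not run them.

```shell
#!/bin/sh
# Print, for each host in turn, the commands from Steps 1-5 of the rolling upgrade.
# Dry run only: nothing is executed on the remote hosts.
print_upgrade_plan() {
    for host in "$@"; do
        echo "# --- $host ---"
        # Step 1: stop the server (some releases use the service name hadoop-zookeeper)
        echo "ssh $host sudo /sbin/service hadoop-zookeeper-server stop"
        # Steps 2 and 3: install the base and server packages
        echo "ssh $host sudo yum install hadoop-zookeeper"
        echo "ssh $host sudo yum install hadoop-zookeeper-server"
        # Step 4: only needed when upgrading from CDH3 Update 1 or earlier
        echo "ssh $host sudo /sbin/chkconfig --add hadoop-zookeeper-server"
        # Step 5: restart the server
        echo "ssh $host sudo /sbin/service hadoop-zookeeper-server start"
    done
}
```

For example, `print_upgrade_plan zk1 zk2 zk3` lists the commands for three hypothetical nodes. Because the point of a rolling upgrade is that only one server is down at a time, confirm that each server has rejoined the ensemble before running the commands for the next one.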
Installing the ZooKeeper Packages
There are two ZooKeeper server packages:
- The hadoop-zookeeper base package provides the basic libraries and scripts that are necessary to run ZooKeeper servers and clients. The documentation is also included in this package.
- The hadoop-zookeeper-server package contains the init.d scripts necessary to run ZooKeeper as a daemon process. Because hadoop-zookeeper-server depends on hadoop-zookeeper, installing the server package automatically installs the base package.
Installing the ZooKeeper Base Package
To install ZooKeeper on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-zookeeper
To install ZooKeeper on Red Hat-compatible systems:
$ sudo yum install hadoop-zookeeper
To install ZooKeeper on SUSE systems:
$ sudo zypper install hadoop-zookeeper
Installing the ZooKeeper Server Package and Starting ZooKeeper on a Single Server
The instructions provided here deploy a single ZooKeeper server in "standalone" mode. This is appropriate for evaluation, testing and development purposes, but may not provide sufficient reliability for a production application. See Installing ZooKeeper in a Production Environment for more information.
To install a ZooKeeper server on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-zookeeper-server
To install a ZooKeeper server on Red Hat-compatible systems:
$ sudo yum install hadoop-zookeeper-server
To install a ZooKeeper server on SUSE systems:
$ sudo zypper install hadoop-zookeeper-server
To start ZooKeeper
Use the following command to start ZooKeeper:
$ sudo /sbin/service hadoop-zookeeper-server start
Installing ZooKeeper in a Production Environment
For use in a production environment, you should deploy ZooKeeper as an ensemble with an odd number of nodes. As long as a majority of the servers in the ensemble are available, the ZooKeeper service will be available. The minimum recommended ensemble size is three ZooKeeper servers, and it is recommended that each server run on a separate machine.
Deploying ZooKeeper on multiple servers requires some additional configuration. The configuration file (zoo.cfg) on each server must include a list of all servers in the ensemble, and each server must also have a myid file in its data directory (by default /var/zookeeper) that identifies it as one of the servers in the ensemble.
For instructions describing how to set up a multi-server deployment, see Installing a Multi-Server Setup.
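As a rough illustration of the two pieces of per-server configuration described above, the sketch below prints the `server.N` lines that list the ensemble in zoo.cfg and writes a myid file. The function names are hypothetical, the host names are placeholders, and the ports 2888 and 3888 are the values conventionally shown in ZooKeeper examples, assumed here rather than taken from this guide.

```shell
#!/bin/sh
# Print one "server.N=host:quorumPort:electionPort" line per host,
# numbering the servers from 1 in the order given.
# Ports 2888 and 3888 are assumed conventional values.
print_server_lines() {
    id=1
    for host in "$@"; do
        echo "server.$id=$host:2888:3888"
        id=$((id + 1))
    done
}

# Write the myid file that tells a server which server.N entry it is.
# $1 = numeric server id, $2 = data directory (for example /var/zookeeper)
write_myid() {
    echo "$1" > "$2/myid"
}
```

The same list of `server.N` lines goes into zoo.cfg on every server, while the myid file differs per machine: the server listed as `server.2` would be set up with `write_myid 2 /var/zookeeper`.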
Setting Up a Supervisory Process for the ZooKeeper Server
The ZooKeeper server is designed to be both highly reliable and highly available. This means that:
- If a ZooKeeper server encounters an error it cannot recover from, it will "fail fast" (the process will exit immediately).
- When the server shuts down, the ensemble remains active, and continues serving requests.
- Once restarted, the server rejoins the ensemble without any further manual intervention.
Cloudera recommends that you fully automate this process by configuring a supervisory service to manage each server, and restart the ZooKeeper server process automatically if it fails. See the ZooKeeper Administrator's Guide for more information.
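A dedicated supervisory service is the better choice, but the idea can be sketched with a small periodic check. The example below is a minimal stand-in, not the mechanism this guide prescribes: it sends ZooKeeper's `ruok` four-letter command to the client port and starts the service if the reply is not `imok`. The function names are hypothetical, the client port 2181 is the common default and is an assumption here, and `nc` options differ between netcat variants.

```shell
#!/bin/sh
# Send the "ruok" four-letter command to a ZooKeeper server and print the reply.
# $1 = host, $2 = client port. -w 2 sets a two-second timeout.
zk_probe() {
    echo ruok | nc -w 2 "$1" "$2" 2>/dev/null
}

# Map a probe reply to an action: a healthy server answers "imok".
zk_action_for() {
    if [ "$1" = "imok" ]; then
        echo "none"
    else
        echo "restart"
    fi
}

# One supervision pass: probe the server and start it if it did not answer.
# Defaults (localhost, 2181) are assumptions.
zk_supervise_once() {
    response=$(zk_probe "${1:-localhost}" "${2:-2181}")
    if [ "$(zk_action_for "$response")" = "restart" ]; then
        sudo /sbin/service hadoop-zookeeper-server start
    fi
}
```

Calling `zk_supervise_once` from a scheduled job gives a basic restart-on-failure loop. It reacts only as often as it is scheduled, which is one reason a real process supervisor is preferable.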
Maintaining a ZooKeeper Server
The ZooKeeper server continually saves znode snapshot files and, optionally, transactional logs in a Data Directory to enable you to recover data. It's a good idea to back up the ZooKeeper Data Directory periodically. Although ZooKeeper is highly reliable because a persistent copy is replicated on each server, recovering from backups may be necessary if a catastrophic failure or user error occurs.
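One simple way to take such a backup is to archive the data directory into a timestamped tarball. The sketch below assumes a hypothetical function name and destination directory, and that archiving the directory while the server is running is acceptable for your recovery needs.

```shell
#!/bin/sh
# Archive a ZooKeeper data directory into a timestamped tarball
# and print the path of the archive that was created.
# $1 = data directory (for example /var/zookeeper)
# $2 = existing directory in which to store backups
backup_zk_data() {
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$2/zookeeper-data-$stamp.tar.gz"
    # -C makes paths inside the archive relative to the data directory
    tar -czf "$archive" -C "$1" . && echo "$archive"
}
```

For example, `backup_zk_data /var/zookeeper /backups/zookeeper` (the second path is a placeholder) writes one archive per invocation, so the function can be scheduled at whatever interval your backup policy requires.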
The ZooKeeper server does not remove the snapshots and log files, so they will accumulate over time. You will need to clean up this directory occasionally, based on your backup schedules and processes. To automate the cleanup, a zkCleanup.sh script is provided in the bin directory of the hadoop-zookeeper base package. Modify this script as necessary for your situation. In general, you want to run this as a cron task based on your backup schedule.
The data directory is specified by the dataDir parameter in the ZooKeeper configuration file, and the data log directory is specified by the dataLogDir parameter.
For more information, see Ongoing Data Directory Cleanup.
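Before backing up or cleaning anything, it helps to confirm which directories a given server actually uses. The helper below is a small sketch (the function name is hypothetical, and the location of zoo.cfg on your system is not assumed) that reads a `key=value` setting such as `dataDir` or `dataLogDir` from a configuration file.

```shell
#!/bin/sh
# Print the value of a setting from a zoo.cfg-style file.
# $1 = key (for example dataDir or dataLogDir), $2 = path to the configuration file
# Prints nothing if the key is not set in the file.
zoo_cfg_value() {
    sed -n "s/^$1=//p" "$2"
}
```

If `dataLogDir` is not set, the function prints nothing for it; in that case ZooKeeper writes its transaction logs to the data directory as well.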
Viewing the ZooKeeper Documentation
For additional ZooKeeper documentation, see http://archive.cloudera.com/cdh/3/zookeeper/.