[Reposted from:]
https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster
This page outlines the steps for getting a Storm cluster up and running. If you're on AWS, you should check out the storm-deploy project. storm-deploy completely automates the provisioning, configuration, and installation of Storm clusters on EC2. It also sets up Ganglia for you so you can monitor CPU, disk, and network usage.
If you run into difficulties with your Storm cluster, first check whether a solution is on the Troubleshooting page. Otherwise, email the mailing list.
Here's a summary of the steps for setting up a Storm cluster:
Set up a Zookeeper cluster
Install dependencies on Nimbus and worker machines
Download and extract a Storm release to Nimbus and worker machines
Fill in mandatory configurations into storm.yaml
Launch daemons under supervision using "storm" script and a supervisor of your choice
Set up a Zookeeper cluster
Storm uses Zookeeper for coordinating the cluster. Zookeeper is not used for message passing, so the load Storm places on Zookeeper is quite low. Single node Zookeeper clusters should be sufficient for most cases, but if you want failover or are deploying large Storm clusters you may want larger Zookeeper clusters. Instructions for deploying Zookeeper are here.
A few notes about Zookeeper deployment:
It's critical that you run Zookeeper under supervision, since Zookeeper is fail-fast and will exit the process if it encounters any error case. See here for more details.
It's critical that you set up a cron to compact Zookeeper's data and transaction logs. The Zookeeper daemon does not do this on its own, and if you don't set up a cron, Zookeeper will quickly run out of disk space. See here for more details.
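As an illustration of the compaction cron, here is a minimal sketch that builds a command line for the `org.apache.zookeeper.server.PurgeTxnLog` utility bundled with Zookeeper and prints a crontab entry for it. The install path `/opt/zookeeper`, the data directory `/var/zookeeper`, the jar layout, and the 03:00 schedule are all assumptions made for this example; check the classpath against your own Zookeeper release before relying on it.

```python
import glob
import os


def purge_command(zk_home, data_dir, snap_dir, keep=3):
    """Build an argv list that runs Zookeeper's PurgeTxnLog utility.

    zk_home, data_dir and snap_dir are hypothetical paths for this sketch.
    """
    if keep < 3:
        # The purge utility is meant to retain at least 3 snapshots.
        raise ValueError("keep must be at least 3")
    # Assumed layout: zookeeper-*.jar in zk_home, dependencies in lib/,
    # configuration in conf/. Adjust to match the release you installed.
    jars = sorted(glob.glob(os.path.join(zk_home, "zookeeper-*.jar")))
    jars += sorted(glob.glob(os.path.join(zk_home, "lib", "*.jar")))
    classpath = os.pathsep.join(jars + [os.path.join(zk_home, "conf")])
    return [
        "java", "-cp", classpath,
        "org.apache.zookeeper.server.PurgeTxnLog",
        data_dir, snap_dir, "-n", str(keep),
    ]


if __name__ == "__main__":
    # Print a crontab line that would run the purge daily at 03:00.
    cmd = purge_command("/opt/zookeeper", "/var/zookeeper", "/var/zookeeper")
    print("0 3 * * * " + " ".join(cmd))
```

Building the command in one place keeps the retention count and directories easy to review before they go into a crontab.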
Install dependencies on Nimbus and worker machines
Next you need to install Storm's dependencies on Nimbus and the worker machines. These are:
ZeroMQ 2.1.7 - Note that you should not install version 2.1.10, as that version has some serious bugs that can cause strange issues for a Storm cluster. In some rare cases, users have reported an "IllegalArgumentException" bubbling up from the ZeroMQ code when using 2.1.7 – in these cases downgrading to 2.1.4 fixed the problem.
JZMQ
Java 6
Python 2.6.6
unzip
These are the versions of the dependencies that have been tested with Storm. Storm may or may not work with different versions of Java and/or Python.
If you have trouble installing ZeroMQ or JZMQ, see Installing native dependencies.
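Before moving on, it can help to confirm that the command-line dependencies are actually on the PATH of each machine. The sketch below is one way to do that. It assumes a Python 3.3+ interpreter is available for running the check (for `shutil.which`), which is separate from the Python 2.6.6 that Storm itself was tested with, and it does not cover ZeroMQ or JZMQ because those are native libraries rather than executables.

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Executables implied by the dependency list above.
    required = ["java", "python", "unzip"]
    missing = missing_tools(required)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required executables were found on PATH.")
```

This only checks for presence, not versions, so you still need to confirm the Java and Python versions yourself.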
Download and extract a Storm release to Nimbus and worker machines
Next, download a Storm release and extract the zip file somewhere on Nimbus and each of the worker machines. The Storm releases can be downloaded from here.
Fill in mandatory configurations into storm.yaml
The Storm release contains a file at conf/storm.yaml that configures the Storm daemons. You can see the default configuration values here. storm.yaml overrides anything in defaults.yaml. There are a few configurations that are mandatory to get a working cluster:
1) storm.zookeeper.servers: This is a list of the hosts in the Zookeeper cluster for your Storm cluster. It should look something like:
storm.zookeeper.servers:
- "111.222.333.444"
- "555.666.777.888"
If the port that your Zookeeper cluster uses is different than the default, you should set storm.zookeeper.port as well.
2) storm.local.dir: The Nimbus and Supervisor daemons require a directory on the local disk to store small amounts of state (like jars, confs, and things like that). You should create that directory on each machine, give it proper permissions, and then fill in the directory location using this config. For example:
storm.local.dir: "/mnt/storm"
3) java.library.path: This is the load path for the native libraries that Storm uses (ZeroMQ and JZMQ). The default of "/usr/local/lib:/opt/local/lib:/usr/lib" should be fine for most installations, so you probably don't need to set this config.
4) nimbus.host: The worker nodes need to know which machine is the master in order to download topology jars and confs. For example:
nimbus.host: "111.222.333.44"
5) supervisor.slots.ports: For each worker machine, you configure how many workers run on that machine with this config. Each worker uses a single port for receiving messages, and this setting defines which ports are open for use. If you define five ports here, then Storm will allocate up to five workers to run on this machine. If you define three ports, Storm will only run up to three. By default, this setting is configured to run 4 workers on the ports 6700, 6701, 6702, and 6703. For example:
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
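If you manage several machines, you may prefer to generate storm.yaml rather than edit it by hand. The following is a minimal sketch that renders the mandatory settings described above as YAML text using only the standard library. The function name `render_storm_yaml` and the host names in the usage example are hypothetical; java.library.path is left out because the default is usually fine.

```python
def render_storm_yaml(zk_servers, local_dir, nimbus_host, slot_ports):
    """Return storm.yaml text for the mandatory settings.

    One worker can run per entry in slot_ports, so the length of that
    list is the maximum number of workers on the machine.
    """
    if not zk_servers:
        raise ValueError("at least one Zookeeper host is required")
    if len(set(slot_ports)) != len(slot_ports):
        raise ValueError("slot ports must be unique")
    lines = ["storm.zookeeper.servers:"]
    lines += ['  - "%s"' % host for host in zk_servers]
    lines.append('storm.local.dir: "%s"' % local_dir)
    lines.append('nimbus.host: "%s"' % nimbus_host)
    lines.append("supervisor.slots.ports:")
    lines += ["  - %d" % port for port in slot_ports]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host names; replace them with your own machines.
    print(render_storm_yaml(
        ["zk1.example.com", "zk2.example.com"],
        "/mnt/storm",
        "nimbus.example.com",
        [6700, 6701, 6702, 6703],
    ))
```

Write the returned text to conf/storm.yaml on Nimbus and on each worker machine.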
Launch daemons under supervision using "storm" script and a supervisor of your choice
The last step is to launch all the Storm daemons. It is critical that you run each of these daemons under supervision. Storm is a fail-fast system, which means the processes will halt whenever an unexpected error is encountered. Storm is designed so that it can safely halt at any point and recover correctly when the process is restarted. This is why Storm keeps no state in-process: if Nimbus or the Supervisors restart, the running topologies are unaffected. Here's how to run the Storm daemons:
Nimbus: Run the command "bin/storm nimbus" under supervision on the master machine.
Supervisor: Run the command "bin/storm supervisor" under supervision on each worker machine. The supervisor daemon is responsible for starting and stopping worker processes on that machine.
UI: Run the Storm UI (a site you can access from the browser that gives diagnostics on the cluster and topologies) by running the command "bin/storm ui" under supervision. The UI can be accessed by navigating your web browser to http://{nimbus host}:8080.
As you can see, running the daemons is very straightforward. The daemons will log to the logs/ directory under the location where you extracted the Storm release.
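To make "under supervision" concrete, here is a minimal sketch of what a process supervisor does: start a command and start it again whenever it exits. This is an illustration only, not a replacement for a real supervisor such as daemontools, monit, or supervisord. The `supervise` function, its `max_restarts` cap, and the restart delay are assumptions made for this example.

```python
import subprocess
import time


def supervise(argv, max_restarts=None, delay_seconds=1.0):
    """Run argv and restart it each time it exits.

    Returns the number of times the process was started. With
    max_restarts=None the loop never ends; a number caps the restarts.
    """
    starts = 0
    while True:
        starts += 1
        exit_code = subprocess.call(argv)
        print("process exited with code %d" % exit_code)
        if max_restarts is not None and starts > max_restarts:
            return starts
        # Pause briefly so a crash loop does not spin the CPU.
        time.sleep(delay_seconds)
```

For example, calling `supervise(["bin/storm", "nimbus"])` from the directory where you extracted the release would keep restarting Nimbus after each halt. A real supervisor adds logging, backoff, and start at boot on top of this loop.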