http://xumingming.sinaapp.com/185/twitter-storm-%E5%9C%A8%E7%94%9F%E4%BA%A7%E9%9B%86%E7%BE%A4%E4%B8%8A%E8%BF%90%E8%A1%8Ctopology/
STORM_ZOOKEEPER_SERVERS
public static java.lang.String STORM_ZOOKEEPER_SERVERS
A list of hosts of ZooKeeper servers used to manage the cluster.
STORM_ZOOKEEPER_PORT
public static java.lang.String STORM_ZOOKEEPER_PORT
The port Storm will use to connect to each of the ZooKeeper servers.
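As a rough illustration of how the server list and the port fit together, the sketch below joins a list of hosts and one shared port into the comma-separated host:port string that ZooKeeper clients accept. The class and method names are hypothetical and are not taken from Storm's source.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Storm: combines the STORM_ZOOKEEPER_SERVERS
// host list with the single STORM_ZOOKEEPER_PORT value.
final class ZkConnectString {
    static String build(List<String> servers, int port) {
        // Every server is reached on the same port, so append it to each host.
        return servers.stream()
                .map(host -> host + ":" + port)
                .collect(Collectors.joining(","));
    }
}
```

Because the port is a single setting, this sketch assumes all ZooKeeper servers listen on the same port.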
STORM_LOCAL_DIR
public static java.lang.String STORM_LOCAL_DIR
A directory on the local filesystem used by Storm for any local filesystem usage it needs. The directory must exist and the Storm daemons must have permission to read/write from this location.
STORM_SCHEDULER
public static java.lang.String STORM_SCHEDULER
A global task scheduler used to assign topologies' tasks to supervisors' workers. If this is not set, a default system scheduler will be used.
STORM_CLUSTER_MODE
public static java.lang.String STORM_CLUSTER_MODE
The mode this Storm cluster is running in. Either "distributed" or "local".
STORM_LOCAL_HOSTNAME
public static java.lang.String STORM_LOCAL_HOSTNAME
The hostname the supervisors/workers should report to nimbus. If unset, Storm will get the hostname to report by calling InetAddress.getLocalHost().getCanonicalHostName(). You should set this config when you don't have a DNS that supervisors/workers can use to find each other based on the hostname returned by calls to InetAddress.getLocalHost().getCanonicalHostName().
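The fallback described for STORM_LOCAL_HOSTNAME can be sketched as follows. This is a minimal illustration rather than Storm's own code: the class name ReportedHostname and its resolve method are hypothetical, and only the InetAddress call comes from the description above.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Storm.
final class ReportedHostname {
    // configured: the STORM_LOCAL_HOSTNAME value, or null when it is unset.
    static String resolve(String configured) {
        if (configured != null) {
            return configured; // an explicit setting always wins
        }
        try {
            // Fallback named in the documentation above.
            return InetAddress.getLocalHost().getCanonicalHostName();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```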
STORM_LOCAL_MODE_ZMQ
public static java.lang.String STORM_LOCAL_MODE_ZMQ
Whether or not to use ZeroMQ for messaging in local mode. If this is set to false, then Storm will use a pure-Java messaging system. The purpose of this flag is to make it easy to run Storm in local mode by eliminating the need for native dependencies, which can be difficult to install. Defaults to false.
STORM_ZOOKEEPER_ROOT
public static java.lang.String STORM_ZOOKEEPER_ROOT
The root location at which Storm stores data in ZooKeeper.
STORM_ZOOKEEPER_SESSION_TIMEOUT
public static java.lang.String STORM_ZOOKEEPER_SESSION_TIMEOUT
The session timeout for clients to ZooKeeper.
STORM_ZOOKEEPER_CONNECTION_TIMEOUT
public static java.lang.String STORM_ZOOKEEPER_CONNECTION_TIMEOUT
The connection timeout for clients to ZooKeeper.
STORM_ZOOKEEPER_RETRY_TIMES
public static java.lang.String STORM_ZOOKEEPER_RETRY_TIMES
The number of times to retry a Zookeeper operation.
STORM_ZOOKEEPER_RETRY_INTERVAL
public static java.lang.String STORM_ZOOKEEPER_RETRY_INTERVAL
The interval between retries of a Zookeeper operation.
STORM_ZOOKEEPER_AUTH_SCHEME
public static java.lang.String STORM_ZOOKEEPER_AUTH_SCHEME
The Zookeeper authentication scheme to use, e.g. "digest". Defaults to no authentication.
STORM_ZOOKEEPER_AUTH_PAYLOAD
public static java.lang.String STORM_ZOOKEEPER_AUTH_PAYLOAD
A string representing the payload for Zookeeper authentication. It gets serialized using UTF-8 encoding during authentication.
STORM_ID
public static java.lang.String STORM_ID
The id assigned to a running topology. The id is the storm name with a unique nonce appended.
NIMBUS_HOST
public static java.lang.String NIMBUS_HOST
The host that the master server is running on.
NIMBUS_THRIFT_PORT
public static java.lang.String NIMBUS_THRIFT_PORT
Which port the Thrift interface of Nimbus should run on. Clients should connect to this port to upload jars and submit topologies.
NIMBUS_CHILDOPTS
public static java.lang.String NIMBUS_CHILDOPTS
This parameter is used by the storm-deploy project to configure the jvm options for the nimbus daemon.
NIMBUS_TASK_TIMEOUT_SECS
public static java.lang.String NIMBUS_TASK_TIMEOUT_SECS
How long without heartbeating a task can go before nimbus will consider the task dead and reassign it to another location.
NIMBUS_MONITOR_FREQ_SECS
public static java.lang.String NIMBUS_MONITOR_FREQ_SECS
How often nimbus should wake up to check heartbeats and do reassignments. Note that if a machine ever goes down Nimbus will immediately wake up and take action. This parameter is for checking for failures when there's no explicit event like that occurring.
NIMBUS_CLEANUP_INBOX_FREQ_SECS
public static java.lang.String NIMBUS_CLEANUP_INBOX_FREQ_SECS
How often nimbus should wake the cleanup thread to clean the inbox.
See Also:
NIMBUS_INBOX_JAR_EXPIRATION_SECS
NIMBUS_INBOX_JAR_EXPIRATION_SECS
public static java.lang.String NIMBUS_INBOX_JAR_EXPIRATION_SECS
The length of time a jar file lives in the inbox before being deleted by the cleanup thread. Probably keep this value greater than or equal to NIMBUS_CLEANUP_INBOX_FREQ_SECS. Note that the time it takes to delete an inbox jar file is going to be somewhat more than NIMBUS_INBOX_JAR_EXPIRATION_SECS (depending on what NIMBUS_CLEANUP_INBOX_FREQ_SECS is set to).
See Also:
NIMBUS_CLEANUP_INBOX_FREQ_SECS
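To see why a jar can outlive its expiration, consider the simplified model below. It assumes the cleanup thread wakes at exact multiples of the cleanup interval and deletes every jar whose age has reached the expiration; both the model and the InboxJarLifetime name are assumptions made for illustration, not Storm's implementation.

```java
// Hypothetical model, not part of Storm.
final class InboxJarLifetime {
    // Returns the time (in seconds) at which a jar uploaded at uploadedAt is
    // deleted, assuming cleanup passes run at t = 0, f, 2f, ... where f is
    // cleanupFreqSecs.
    static long deletionTime(long uploadedAt, long expirationSecs, long cleanupFreqSecs) {
        long expiresAt = uploadedAt + expirationSecs;
        // Round up to the first cleanup pass at or after the expiry time.
        long passes = (expiresAt + cleanupFreqSecs - 1) / cleanupFreqSecs;
        return passes * cleanupFreqSecs;
    }
}
```

Under this model a jar uploaded at t = 5 with a 3600 second expiration and a 600 second cleanup interval is deleted at t = 4200, so it lives 4195 seconds.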
NIMBUS_SUPERVISOR_TIMEOUT_SECS
public static java.lang.String NIMBUS_SUPERVISOR_TIMEOUT_SECS
How long a supervisor can go without heartbeating before nimbus considers it dead and stops assigning new work to it.
NIMBUS_TASK_LAUNCH_SECS
public static java.lang.String NIMBUS_TASK_LAUNCH_SECS
A special timeout used when a task is initially launched. During launch, this is the timeout used until the first heartbeat, overriding nimbus.task.timeout.secs.
A separate timeout exists for launch because there can be quite a bit of overhead to launching new JVMs and configuring them.
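The interaction between the launch timeout and the regular task timeout can be sketched as below. The TaskLiveness class and its parameters are hypothetical; the sketch only encodes the rule stated above, that the launch timeout applies until the first heartbeat arrives.

```java
// Hypothetical helper, not part of Storm.
final class TaskLiveness {
    // lastHeartbeatSecs is null until the task has sent its first heartbeat.
    static boolean isDead(long nowSecs, long launchedAtSecs, Long lastHeartbeatSecs,
                          long taskTimeoutSecs, long taskLaunchSecs) {
        if (lastHeartbeatSecs == null) {
            // Still launching: the launch timeout overrides the task timeout.
            return nowSecs - launchedAtSecs > taskLaunchSecs;
        }
        // After the first heartbeat, the regular task timeout applies.
        return nowSecs - lastHeartbeatSecs > taskTimeoutSecs;
    }
}
```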
NIMBUS_REASSIGN
public static java.lang.String NIMBUS_REASSIGN
Whether or not nimbus should reassign tasks if it detects that a task goes down. Defaults to true, and it's not recommended to change this value.
NIMBUS_FILE_COPY_EXPIRATION_SECS
public static java.lang.String NIMBUS_FILE_COPY_EXPIRATION_SECS
During upload/download with the master, how long an upload or download connection is idle before nimbus considers it dead and drops the connection.
UI_PORT
public static java.lang.String UI_PORT
Storm UI binds to this port.
UI_CHILDOPTS
public static java.lang.String UI_CHILDOPTS
Childopts for Storm UI Java process.
DRPC_SERVERS
public static java.lang.String DRPC_SERVERS
List of DRPC servers so that the DRPCSpout knows who to talk to.
DRPC_PORT
public static java.lang.String DRPC_PORT
This port is used by Storm DRPC for receiving DRPC requests from clients.
DRPC_INVOCATIONS_PORT
public static java.lang.String DRPC_INVOCATIONS_PORT
This port on Storm DRPC is used by DRPC topologies to receive function invocations and send results back.
DRPC_REQUEST_TIMEOUT_SECS
public static java.lang.String DRPC_REQUEST_TIMEOUT_SECS
The timeout on DRPC requests within the DRPC server. Defaults to 10 minutes. Note that requests can also timeout based on the socket timeout on the DRPC client, and separately based on the topology message timeout for the topology implementing the DRPC function.
SUPERVISOR_SCHEDULER_META
public static java.lang.String SUPERVISOR_SCHEDULER_META
The metadata configured on the supervisor.
SUPERVISOR_SLOTS_PORTS
public static java.lang.String SUPERVISOR_SLOTS_PORTS
A list of ports that can run workers on this supervisor. Each worker uses one port, and the supervisor will only run one worker per port. Use this configuration to tune how many workers run on each machine.
SUPERVISOR_CHILDOPTS
public static java.lang.String SUPERVISOR_CHILDOPTS
This parameter is used by the storm-deploy project to configure the jvm options for the supervisor daemon.
SUPERVISOR_WORKER_TIMEOUT_SECS
public static java.lang.String SUPERVISOR_WORKER_TIMEOUT_SECS
How long a worker can go without heartbeating before the supervisor tries to restart the worker process.
SUPERVISOR_WORKER_START_TIMEOUT_SECS
public static java.lang.String SUPERVISOR_WORKER_START_TIMEOUT_SECS
How long a worker can go without heartbeating during the initial launch before the supervisor tries to restart the worker process. This value overrides supervisor.worker.timeout.secs during launch because there is additional overhead to starting and configuring the JVM on launch.
SUPERVISOR_ENABLE
public static java.lang.String SUPERVISOR_ENABLE
Whether or not the supervisor should launch workers assigned to it. Defaults to true -- and you should probably never change this value. This configuration is used in the Storm unit tests.
SUPERVISOR_HEARTBEAT_FREQUENCY_SECS
public static java.lang.String SUPERVISOR_HEARTBEAT_FREQUENCY_SECS
How often the supervisor sends a heartbeat to the master.
SUPERVISOR_MONITOR_FREQUENCY_SECS
public static java.lang.String SUPERVISOR_MONITOR_FREQUENCY_SECS
How often the supervisor checks the worker heartbeats to see if any of them need to be restarted.
WORKER_CHILDOPTS
public static java.lang.String WORKER_CHILDOPTS
The jvm opts provided to workers launched by this supervisor. All "%ID%" substrings are replaced with an identifier for this worker.
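The "%ID%" substitution amounts to a plain string replacement, as in the sketch below. The WorkerChildopts class is hypothetical, and the worker identifier used in the usage note is an assumed value, since the documentation does not say what form the identifier takes.

```java
// Hypothetical helper, not part of Storm.
final class WorkerChildopts {
    static String substitute(String childopts, String workerId) {
        // String.replace substitutes every literal occurrence, not just the first.
        return childopts.replace("%ID%", workerId);
    }
}
```

For example, substitute("-Xmx768m -Xloggc:gc-%ID%.log", "6700") yields "-Xmx768m -Xloggc:gc-6700.log", which gives each worker its own GC log file.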
WORKER_HEARTBEAT_FREQUENCY_SECS
public static java.lang.String WORKER_HEARTBEAT_FREQUENCY_SECS
How often this worker should heartbeat to the supervisor.
TASK_HEARTBEAT_FREQUENCY_SECS
public static java.lang.String TASK_HEARTBEAT_FREQUENCY_SECS
How often a task should heartbeat its status to the master.
TASK_REFRESH_POLL_SECS
public static java.lang.String TASK_REFRESH_POLL_SECS
How often a task should sync its connections with other tasks (if a task is reassigned, the other tasks sending messages to it need to refresh their connections). In general though, when a reassignment happens other tasks will be notified almost immediately. This configuration is here just in case that notification doesn't come through.
TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS
public static java.lang.String TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS
Whether or not Storm should time out messages. Defaults to true. This is meant to be used in unit tests to prevent tuples from being accidentally timed out during the test.
TOPOLOGY_DEBUG
public static java.lang.String TOPOLOGY_DEBUG
When set to true, Storm will log every message that's emitted.
TOPOLOGY_OPTIMIZE
public static java.lang.String TOPOLOGY_OPTIMIZE
Whether or not the master should optimize topologies by running multiple tasks in a single thread where appropriate.
TOPOLOGY_WORKERS
public static java.lang.String TOPOLOGY_WORKERS
How many processes should be spawned around the cluster to execute this topology. Each process will execute some number of tasks as threads within it. This parameter should be used in conjunction with the parallelism hints on each component in the topology to tune the performance of a topology.
TOPOLOGY_TASKS
public static java.lang.String TOPOLOGY_TASKS
How many instances to create for a spout/bolt. A task runs on a thread with zero or more other tasks for the same spout/bolt. The number of tasks for a spout/bolt is always the same throughout the lifetime of a topology, but the number of executors (threads) for a spout/bolt can change over time. This allows a topology to scale to more or fewer resources without redeploying the topology or violating the constraints of Storm (such as a fields grouping guaranteeing that the same value goes to the same task).
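The relationship between a fixed set of tasks and a changeable number of executors can be illustrated by spreading task ids over threads. The sketch below shows one plausible even split into contiguous ranges; it is not claimed to be Storm's actual assignment algorithm, and TaskPartitioner is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm. Assumes numExecutors >= 1.
final class TaskPartitioner {
    // Splits task ids 0..numTasks-1 into numExecutors contiguous groups whose
    // sizes differ by at most one.
    static List<List<Integer>> partition(int numTasks, int numExecutors) {
        List<List<Integer>> executors = new ArrayList<>();
        int base = numTasks / numExecutors;
        int extra = numTasks % numExecutors; // the first 'extra' groups get one more task
        int next = 0;
        for (int e = 0; e < numExecutors; e++) {
            int size = base + (e < extra ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                tasks.add(next++);
            }
            executors.add(tasks);
        }
        return executors;
    }
}
```

Calling partition(8, 3) and later partition(8, 4) keeps the same eight task ids and only regroups them, which is why values routed by task id keep reaching the same task.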
TOPOLOGY_ACKER_EXECUTORS
public static java.lang.String TOPOLOGY_ACKER_EXECUTORS
How many executors to spawn for ackers.
If this is set to 0, then Storm will immediately ack tuples as soon as they come off the spout, effectively disabling reliability.
TOPOLOGY_MESSAGE_TIMEOUT_SECS
public static java.lang.String TOPOLOGY_MESSAGE_TIMEOUT_SECS
The maximum amount of time given to the topology to fully process a message emitted by a spout. If the message is not acked within this time frame, Storm will fail the message on the spout. Some spout implementations will then replay the message at a later time.
TOPOLOGY_KRYO_REGISTER
public static java.lang.String TOPOLOGY_KRYO_REGISTER
A list of serialization registrations for Kryo ( http://code.google.com/p/kryo/ ), the underlying serialization framework for Storm. A serialization can either be the name of a class (in which case Kryo will automatically create a serializer for the class that saves all the object's fields), or an implementation of com.esotericsoftware.kryo.Serializer. See Kryo's documentation for more information about writing custom serializers.
TOPOLOGY_KRYO_DECORATORS
public static java.lang.String TOPOLOGY_KRYO_DECORATORS
A list of classes that customize storm's kryo instance during start-up. Each listed class name must implement IKryoDecorator. During start-up the listed class is instantiated with 0 arguments, then its 'decorate' method is called with storm's kryo instance as the only argument.
TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS
public static java.lang.String TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS
Whether or not Storm should skip the loading of kryo registrations for which it does not know the class or have the serializer implementation. Otherwise, the task will fail to load and will throw an error at runtime. The use case of this is if you want to declare your serializations on the storm.yaml files on the cluster rather than every single time you submit a topology. Different applications may use different serializations and so a single application may not have the code for the other serializers used by other apps. By setting this config to true, Storm will ignore that it doesn't have those other serializations rather than throw an error.
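The skip-or-fail behaviour for missing registrations can be sketched with plain class loading. This is an illustration under assumptions, not Storm's loader: RegistrationLoader is a hypothetical name, and the sketch only resolves class names without touching Kryo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm.
final class RegistrationLoader {
    static List<Class<?>> load(List<String> classNames, boolean skipMissing) {
        List<Class<?>> loaded = new ArrayList<>();
        for (String name : classNames) {
            try {
                loaded.add(Class.forName(name));
            } catch (ClassNotFoundException e) {
                if (!skipMissing) {
                    // Without the skip flag a missing class is a hard error.
                    throw new IllegalStateException("Missing serialization class: " + name, e);
                }
                // With the skip flag set, registrations whose class is not on
                // the classpath are silently ignored.
            }
        }
        return loaded;
    }
}
```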
TOPOLOGY_MAX_TASK_PARALLELISM
public static java.lang.String TOPOLOGY_MAX_TASK_PARALLELISM
The maximum parallelism allowed for a component in this topology. This configuration is typically used in testing to limit the number of threads spawned in local mode.
TOPOLOGY_MAX_SPOUT_PENDING
public static java.lang.String TOPOLOGY_MAX_SPOUT_PENDING
The maximum number of tuples that can be pending on a spout task at any given time. This config applies to individual tasks, not to spouts or topologies as a whole. A pending tuple is one that has been emitted from a spout but has not been acked or failed yet. Note that this config parameter has no effect for unreliable spouts that don't tag their tuples with a message id.
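The per-task pending limit can be pictured as a small bookkeeping structure, sketched below. PendingTracker is a hypothetical class written for illustration; it models an emit without a message id as untracked, which is why the limit has no effect on unreliable spouts.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Storm.
final class PendingTracker {
    private final int maxPending;
    private final Set<Object> pending = new HashSet<>();

    PendingTracker(int maxPending) {
        this.maxPending = maxPending;
    }

    // The spout task may emit only while it is below the limit.
    boolean canEmit() {
        return pending.size() < maxPending;
    }

    // A null messageId models an unreliable emit, which is never tracked.
    void emitted(Object messageId) {
        if (messageId != null) {
            pending.add(messageId);
        }
    }

    // A tuple stops being pending once it is acked or failed.
    void ackedOrFailed(Object messageId) {
        pending.remove(messageId);
    }
}
```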
TOPOLOGY_SPOUT_WAIT_STRATEGY
public static java.lang.String TOPOLOGY_SPOUT_WAIT_STRATEGY
A class that implements a strategy for what to do when a spout needs to wait. Waiting is triggered in one of two conditions: (1) nextTuple emits no tuples, or (2) the spout has hit maxSpoutPending and can't emit any more tuples.
TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS
public static java.lang.String TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS
The number of milliseconds the SleepEmptyEmitStrategy should sleep for.
TOPOLOGY_STATE_SYNCHRONIZATION_TIMEOUT_SECS
public static java.lang.String TOPOLOGY_STATE_SYNCHRONIZATION_TIMEOUT_SECS
The maximum amount of time a component gives a source of state to synchronize before it requests synchronization again.
TOPOLOGY_STATS_SAMPLE_RATE
public static java.lang.String TOPOLOGY_STATS_SAMPLE_RATE
The percentage of tuples to sample to produce stats for a task.
TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION
public static java.lang.String TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION
Whether or not to use Java serialization in a topology.
TOPOLOGY_WORKER_CHILDOPTS
public static java.lang.String TOPOLOGY_WORKER_CHILDOPTS
Topology-specific options for the worker child process. This is used in addition to WORKER_CHILDOPTS.
TOPOLOGY_TRANSACTIONAL_ID
public static java.lang.String TOPOLOGY_TRANSACTIONAL_ID
This config is available for TransactionalSpouts, and contains the id (a String) for the transactional topology. This id is used to store the state of the transactional topology in Zookeeper.
TOPOLOGY_AUTO_TASK_HOOKS
public static java.lang.String TOPOLOGY_AUTO_TASK_HOOKS
A list of task hooks that are automatically added to every spout and bolt in the topology. An example of when you'd do this is to add a hook that integrates with your internal monitoring system. These hooks are instantiated using the zero-arg constructor.
TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE
public static java.lang.String TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE
The size of the Disruptor receive queue for each executor. Must be a power of 2.
TOPOLOGY_RECEIVER_BUFFER_SIZE
public static java.lang.String TOPOLOGY_RECEIVER_BUFFER_SIZE
The maximum number of messages to batch from the thread receiving off the network to the executor queues. Must be a power of 2.
TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE
public static java.lang.String TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE
The size of the Disruptor send queue for each executor. Must be a power of 2.
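Since several of these buffer sizes must be powers of 2, it can help to validate a value before submitting a topology. The sketch below uses the standard bit trick for that check; BufferSizes is a hypothetical helper and is not part of Storm.

```java
// Hypothetical helper, not part of Storm.
final class BufferSizes {
    // A positive integer is a power of two exactly when it has a single bit set.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Returns n unchanged, or throws if it is not a valid buffer size.
    static int requirePowerOfTwo(String key, int n) {
        if (!isPowerOfTwo(n)) {
            throw new IllegalArgumentException(key + " must be a power of 2, got " + n);
        }
        return n;
    }
}
```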
TOPOLOGY_TRANSFER_BUFFER_SIZE
public static java.lang.String TOPOLOGY_TRANSFER_BUFFER_SIZE
The size of the Disruptor transfer queue for each worker.
TOPOLOGY_TICK_TUPLE_FREQ_SECS
public static java.lang.String TOPOLOGY_TICK_TUPLE_FREQ_SECS
How often a tick tuple from the "__system" component and "__tick" stream should be sent to tasks. Meant to be used as a component-specific configuration.
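A component that receives tick tuples has to tell them apart from ordinary input. The sketch below does so by comparing the source component and stream names given above; it takes the two ids as plain strings so that it stays self-contained, and TickTuples is a hypothetical class rather than a Storm API.

```java
// Hypothetical helper, not part of Storm.
final class TickTuples {
    static final String SYSTEM_COMPONENT_ID = "__system";
    static final String TICK_STREAM_ID = "__tick";

    // A tuple is a tick tuple only if both its source component and its
    // stream match the reserved names.
    static boolean isTick(String sourceComponent, String sourceStreamId) {
        return SYSTEM_COMPONENT_ID.equals(sourceComponent)
                && TICK_STREAM_ID.equals(sourceStreamId);
    }
}
```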
TOPOLOGY_DISRUPTOR_WAIT_STRATEGY
public static java.lang.String TOPOLOGY_DISRUPTOR_WAIT_STRATEGY
Configure the wait strategy used for internal queuing. Can be used to trade off latency vs. throughput.
TOPOLOGY_WORKER_SHARED_THREAD_POOL_SIZE
public static java.lang.String TOPOLOGY_WORKER_SHARED_THREAD_POOL_SIZE
The size of the shared thread pool for worker tasks to make use of. The thread pool can be accessed via the TopologyContext.
TOPOLOGY_NAME
public static java.lang.String TOPOLOGY_NAME
Name of the topology. This config is automatically set by Storm when the topology is submitted.
TRANSACTIONAL_ZOOKEEPER_ROOT
public static java.lang.String TRANSACTIONAL_ZOOKEEPER_ROOT
The root directory in ZooKeeper for metadata about TransactionalSpouts.
TRANSACTIONAL_ZOOKEEPER_SERVERS
public static java.lang.String TRANSACTIONAL_ZOOKEEPER_SERVERS
The list of zookeeper servers in which to keep the transactional state. If null (which is default), will use storm.zookeeper.servers.
TRANSACTIONAL_ZOOKEEPER_PORT
public static java.lang.String TRANSACTIONAL_ZOOKEEPER_PORT
The port to use to connect to the transactional zookeeper servers. If null (which is default), will use storm.zookeeper.port.
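The null fallback for the transactional ZooKeeper settings can be sketched over a plain configuration map. The keys "transactional.zookeeper.servers" and "transactional.zookeeper.port" are assumed to be the string values behind the two constants above; the storm.zookeeper.* keys come from the descriptions themselves. TransactionalZk is a hypothetical helper.

```java
import java.util.Map;

// Hypothetical helper, not part of Storm.
final class TransactionalZk {
    // Prefer the transactional setting; fall back to the cluster-wide one when
    // it is null or absent.
    static Object servers(Map<String, Object> conf) {
        Object servers = conf.get("transactional.zookeeper.servers");
        return servers != null ? servers : conf.get("storm.zookeeper.servers");
    }

    static Object port(Map<String, Object> conf) {
        Object port = conf.get("transactional.zookeeper.port");
        return port != null ? port : conf.get("storm.zookeeper.port");
    }
}
```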
ZMQ_THREADS
public static java.lang.String ZMQ_THREADS
The number of threads that should be used by the zeromq context in each worker process.
ZMQ_LINGER_MILLIS
public static java.lang.String ZMQ_LINGER_MILLIS
How long a connection should retry sending messages to a target host when the connection is closed. This is an advanced configuration and can almost certainly be ignored.
JAVA_LIBRARY_PATH
public static java.lang.String JAVA_LIBRARY_PATH
This value is passed to spawned JVMs (e.g., Nimbus, Supervisor, and Workers) for the java.library.path value. java.library.path tells the JVM where to look for native libraries. It is necessary to set this config correctly since Storm uses the ZeroMQ and JZMQ native libs.
DEV_ZOOKEEPER_PATH
public static java.lang.String DEV_ZOOKEEPER_PATH
The path to use as the zookeeper dir when running a zookeeper server via "storm dev-zookeeper". This zookeeper instance is only intended for development; it is not a production grade zookeeper setup.
Config
public Config()