Spark uses the metrics-core library (Codahale/Dropwizard Metrics) to manage metrics for its various components.
The metrics.properties.template file configures this metrics system. The configuration has two parts: sources and sinks, a concept similar to Flume's sources and sinks. A source collects metrics from a component such as the master, worker, driver, or executor; the built-in sources include ApplicationSource, BlockManagerSource, DAGSchedulerSource, ExecutorSource, JvmSource, MasterSource, and WorkerSource. A sink specifies where the collected metrics are delivered. For example, we can send Spark's metrics to Graphite and have Seyren raise an alert when something in Spark goes wrong. The configuration notes in metrics.properties.template are explained below:
# syntax: [instance].sink|source.[name].[options]=[value]

# This file configures Spark's internal metrics system. The metrics system is
# divided into instances which correspond to internal components.
# Each instance can be configured to report its metrics to one or more sinks.
# Accepted values for [instance] are "master", "worker", "executor", "driver",
# and "applications". A wild card "*" can be used as an instance name, in
# which case all instances will inherit the supplied property.
#
# Within an instance, a "source" specifies a particular set of grouped metrics.
# there are two kinds of sources:
#   1. Spark internal sources, like MasterSource, WorkerSource, etc, which will
#   collect a Spark component's internal state. Each instance is paired with a
#   Spark source that is added automatically.
#   2. Common sources, like JvmSource, which will collect low level state.
#   These can be added through configuration options and are then loaded
#   using reflection.
#
# A "sink" specifies where metrics are delivered to. Each instance can be
# assigned one or more sinks.
#
# The sink|source field specifies whether the property relates to a sink or
# source.
#
# The [name] field specifies the name of source or sink.
#
# The [options] field is the specific property of this source or sink. The
# source or sink is responsible for parsing this property.
#
# Notes:
#   1. To add a new sink, set the "class" option to a fully qualified class
#   name (see examples below).
#   2. Some sinks involve a polling period. The minimum allowed polling period
#   is 1 second.
#   3. Wild card properties can be overridden by more specific properties.
#   For example, master.sink.console.period takes precedence over
#   *.sink.console.period.
#   4. A metrics specific configuration
#   "spark.metrics.conf=${SPARK_HOME}/conf/metrics.properties" should be
#   added to Java properties using -Dspark.metrics.conf=xxx if you want to
#   customize metrics system. You can also put the file in ${SPARK_HOME}/conf
#   and it will be loaded automatically.
#   5. MetricsServlet is added by default as a sink in master, worker and client
#   driver, you can send http request "/metrics/json" to get a snapshot of all the
#   registered metrics in json format. For master, requests "/metrics/master/json" and
#   "/metrics/applications/json" can be sent separately to get metrics snapshot of
#   instance master and applications. MetricsServlet may not be configured by self.
#

## List of available sinks and their properties.

# org.apache.spark.metrics.sink.ConsoleSink
#   Name:     Default:   Description:
#   period    10         Poll period
#   unit      seconds    Units of poll period

# org.apache.spark.metrics.sink.CSVSink
#   Name:      Default:   Description:
#   period     10         Poll period
#   unit       seconds    Units of poll period
#   directory  /tmp       Where to store CSV files

# org.apache.spark.metrics.sink.GangliaSink
#   Name:     Default:   Description:
#   host      NONE       Hostname or multicast group of Ganglia server
#   port      NONE       Port of Ganglia server(s)
#   period    10         Poll period
#   unit      seconds    Units of poll period
#   ttl       1          TTL of messages sent by Ganglia
#   mode      multicast  Ganglia network mode ('unicast' or 'multicast')

# org.apache.spark.metrics.sink.JmxSink

# org.apache.spark.metrics.sink.MetricsServlet
#   Name:     Default:   Description:
#   path      VARIES*    Path prefix from the web server root
#   sample    false      Whether to show entire set of samples for histograms ('false' or 'true')
#
# * Default path is /metrics/json for all instances except the master. The
#   master has two paths:
#     /metrics/applications/json  # App information
#     /metrics/master/json        # Master information

# org.apache.spark.metrics.sink.GraphiteSink
#   Name:     Default:      Description:
#   host      NONE          Hostname of Graphite server
#   port      NONE          Port of Graphite server
#   period    10            Poll period
#   unit      seconds       Units of poll period
#   prefix    EMPTY STRING  Prefix to prepend to metric name

## Examples

# Enable JmxSink for all instances by class name
#*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

# Enable ConsoleSink for all instances by class name
#*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink

# Polling period for ConsoleSink
#*.sink.console.period=10
#*.sink.console.unit=seconds

# Master instance overlap polling period
#master.sink.console.period=15
#master.sink.console.unit=seconds

# Enable CsvSink for all instances
#*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink

# Polling period for CsvSink
#*.sink.csv.period=1
#*.sink.csv.unit=minutes

# Polling directory for CsvSink
#*.sink.csv.directory=/tmp/

# Worker instance overlap polling period
#worker.sink.csv.period=10
#worker.sink.csv.unit=minutes

# Enable jvm source for instance master, worker, driver and executor
#master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
#worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
#driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
#executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
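Building on the template above, here is a minimal sketch of a metrics.properties that ships metrics from every Spark instance to Graphite, matching the Graphite + Seyren setup mentioned at the beginning. The Graphite host name and the "spark" prefix are placeholder values for illustration; port 2003 is Graphite's usual plaintext listener port, but adjust these to your own environment.

# Send metrics from all instances to Graphite (host/port/prefix are example values)
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=spark

# Also expose JVM metrics for each kind of instance
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

Save this as ${SPARK_HOME}/conf/metrics.properties so it is loaded automatically, or point Spark at it explicitly with -Dspark.metrics.conf=/path/to/metrics.properties, as described in note 4 of the template.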