Please refer to the Flume user guide first:
http://flume.apache.org/FlumeUserGuide.html
and the Cloudera Flume blog:
http://blog.cloudera.com/blog/category/flume/
How to set JAVA_HOME and Java options, and add our customized libraries to flume-ng
All of these are defined in FLUME_CONF_DIR/flume-env.sh. For example:
JAVA_HOME=/opt/java
JAVA_OPTS="-Xms200m -Xmx200m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=3669 -Dflume.called.from.service"
FLUME_CLASSPATH=/opt/sponge/flume/lib/*
How to start flume-ng as an agent
Please note that the agent must be named hostname_agent; this name is used in flume-conf-agent.properties.
$ /usr/lib/flume/bin/flume-ng agent --conf /opt/sponge/flume/config/ --conf-file /opt/sponge/flume/config/flume-conf-agent.properties --name hostname_agent &
How to start flume-ng as a collector
Please note that the collector must be named hostname_collector; this name is used in flume-conf-collector.properties.
$ /usr/lib/flume/bin/flume-ng agent --conf /opt/sponge/flume/config/ --conf-file /opt/sponge/flume/config/flume-conf-collector.properties --name hostname_collector &
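The two start commands differ only in the role name. A small wrapper script can derive the hostname_agent / hostname_collector name automatically; this is a sketch, with FLUME_HOME and CONF_DIR assumed from the paths in the commands above:

```shell
#!/bin/sh
# Sketch: start flume-ng as either role, deriving the --name from the hostname.
# FLUME_HOME and CONF_DIR are assumptions based on the commands above.
FLUME_HOME=/usr/lib/flume
CONF_DIR=/opt/sponge/flume/config
role=${1:-agent}                # "agent" or "collector"
name="$(hostname)_${role}"      # naming convention: hostname_agent / hostname_collector
echo "Starting flume-ng as ${name}"
if [ -x "$FLUME_HOME/bin/flume-ng" ]; then
  "$FLUME_HOME/bin/flume-ng" agent --conf "$CONF_DIR" \
    --conf-file "$CONF_DIR/flume-conf-${role}.properties" --name "$name" &
else
  echo "flume-ng not found at $FLUME_HOME/bin/flume-ng" >&2
fi
```

Run it as `./start-flume.sh agent` or `./start-flume.sh collector` on each box.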
How to define the flume agent and flume collector property files
I've already committed two property files to https://svn.nam.nsroot.net:9050/svn/153299/elf/sponge-branches/2013-03-14-FlumeNG/sponge/myflumeng/config
Please refer to flume-conf-agent.properties and flume-conf-collector.properties.
The basic naming conventions are:
1) each agent is named hostname_agent
2) each collector is named hostname_collector
3) sources are named source1, source2, source3, ...
4) sinks are named avroSink1, avroSink2, avroSink3, ...
5) each source's interceptor is named logIntercept1, logIntercept2, logIntercept3, ... (attached to the source, as in the config below)
6) all agent sinks are Avro sinks
7) the default collector source is an Avro source
8) agent sinks are load-balanced round-robin
9) the file channel is the default channel for both agent and collector
flume-conf-agent.properties
hostname_agent.sources = source1, source2
hostname_agent.channels = fileChannel
hostname_agent.sinks = avroSink1, avroSink2
# For each one of the sources, the type is defined
hostname_agent.sources.source1.type = exec
hostname_agent.sources.source1.command = tail -F /var/log/audit/audit.log
hostname_agent.sources.source1.channels = fileChannel
hostname_agent.sources.source1.batchSize=10
hostname_agent.sources.source2.type = exec
hostname_agent.sources.source2.command = tail -F /var/log/flume/flume.log
hostname_agent.sources.source2.channels = fileChannel
hostname_agent.sources.source2.batchSize=10
# For each one of the sources, the log interceptor is defined
hostname_agent.sources.source1.interceptors = logIntercept1
hostname_agent.sources.source1.interceptors.logIntercept1.type = com.citi.sponge.flume.sink.LogInterceptor$Builder
hostname_agent.sources.source1.interceptors.logIntercept1.preserveExisting = false
hostname_agent.sources.source1.interceptors.logIntercept1.hostName = hostname
hostname_agent.sources.source1.interceptors.logIntercept1.env = PROD
hostname_agent.sources.source1.interceptors.logIntercept1.logType = AUDIT_LOG
hostname_agent.sources.source1.interceptors.logIntercept1.appId = 111111
hostname_agent.sources.source1.interceptors.logIntercept1.logFilePath = /var/log/audit
hostname_agent.sources.source1.interceptors.logIntercept1.logFileName = audit.log
hostname_agent.sources.source2.interceptors = logIntercept2
hostname_agent.sources.source2.interceptors.logIntercept2.type = com.citi.sponge.flume.sink.LogInterceptor$Builder
hostname_agent.sources.source2.interceptors.logIntercept2.preserveExisting = false
hostname_agent.sources.source2.interceptors.logIntercept2.hostName = hostname
hostname_agent.sources.source2.interceptors.logIntercept2.env = PROD
hostname_agent.sources.source2.interceptors.logIntercept2.logType = FLUME
hostname_agent.sources.source2.interceptors.logIntercept2.appId = 111111
hostname_agent.sources.source2.interceptors.logIntercept2.logFilePath = /var/log/flume
hostname_agent.sources.source2.interceptors.logIntercept2.logFileName = flume.log
# For each one of the sinks, the type is defined
hostname_agent.sinks.avroSink1.type = avro
hostname_agent.sinks.avroSink1.hostname=collector1
hostname_agent.sinks.avroSink1.port=1442
hostname_agent.sinks.avroSink1.batchSize=10
hostname_agent.sinks.avroSink1.channel = fileChannel
hostname_agent.sinks.avroSink2.type = avro
hostname_agent.sinks.avroSink2.hostname=collector2
hostname_agent.sinks.avroSink2.port=1442
hostname_agent.sinks.avroSink2.batchSize=10
hostname_agent.sinks.avroSink2.channel = fileChannel
# Specify the load-balancing configuration for the sinks
hostname_agent.sinkgroups = sinkGroup
hostname_agent.sinkgroups.sinkGroup.sinks = avroSink1 avroSink2
hostname_agent.sinkgroups.sinkGroup.processor.type = load_balance
hostname_agent.sinkgroups.sinkGroup.processor.backoff = true
hostname_agent.sinkgroups.sinkGroup.processor.selector = round_robin
hostname_agent.sinkgroups.sinkGroup.processor.selector.maxTimeOut = 30000
# Each channel's type is defined.
hostname_agent.channels.fileChannel.type = file
hostname_agent.channels.fileChannel.checkpointDir = /opt/sponge/file-channel/checkpoint
hostname_agent.channels.fileChannel.dataDirs = /opt/sponge/file-channel/dataDirs
hostname_agent.channels.fileChannel.transactionCapacity = 1000
hostname_agent.channels.fileChannel.checkpointInterval = 30000
hostname_agent.channels.fileChannel.maxFileSize = 2146435071
hostname_agent.channels.fileChannel.minimumRequiredSpace = 524288000
hostname_agent.channels.fileChannel.keep-alive = 5
hostname_agent.channels.fileChannel.write-timeout = 5
hostname_agent.channels.fileChannel.checkpoint-timeout = 600
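The file channel will not start unless its checkpoint and data directories exist and are writable by the user running flume-ng. A quick preparation step, with paths taken from the channel config above:

```shell
# Create the file-channel directories referenced by checkpointDir / dataDirs.
BASE=${BASE:-/opt/sponge/file-channel}
mkdir -p "$BASE/checkpoint" "$BASE/dataDirs" \
  || echo "cannot create directories under $BASE (check permissions)" >&2
# If flume-ng runs as a dedicated user, also: chown -R flume:flume "$BASE"
```

Do this on every agent and collector host before the first start, otherwise the channel fails at startup.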
flume-conf-collector.properties
hostname_collector.sources = avroSource
hostname_collector.channels = fileChannel
hostname_collector.sinks = hbaseSink
# For each one of the sources, the type is defined
hostname_collector.sources.avroSource.channels = fileChannel
hostname_collector.sources.avroSource.type = avro
hostname_collector.sources.avroSource.bind = hostname
hostname_collector.sources.avroSource.port = 1442
hostname_collector.sinks.hbaseSink.type=org.apache.flume.sink.hbase.HBaseSink
hostname_collector.sinks.hbaseSink.table=spong_flumeng_log2
hostname_collector.sinks.hbaseSink.columnFamily=content
hostname_collector.sinks.hbaseSink.serializer=com.citi.sponge.flume.sink.LogHbaseEventSerializer
hostname_collector.sinks.hbaseSink.timeout=120
hostname_collector.sinks.hbaseSink.column=log
hostname_collector.sinks.hbaseSink.batchSize=2
hostname_collector.sinks.hbaseSink.channel=fileChannel
# Each channel's type is defined.
hostname_collector.channels.fileChannel.type = file
hostname_collector.channels.fileChannel.checkpointDir = /opt/sponge/file-channel/checkpoint
hostname_collector.channels.fileChannel.dataDirs = /opt/sponge/file-channel/dataDirs
hostname_collector.channels.fileChannel.transactionCapacity = 1000
hostname_collector.channels.fileChannel.checkpointInterval = 30000
hostname_collector.channels.fileChannel.maxFileSize = 2146435071
hostname_collector.channels.fileChannel.minimumRequiredSpace = 524288000
hostname_collector.channels.fileChannel.keep-alive = 5
hostname_collector.channels.fileChannel.write-timeout = 5
hostname_collector.channels.fileChannel.checkpoint-timeout = 600
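Once the collector is up, the stock avro-client tool can push a test event straight into its Avro source. This is a smoke-test sketch: the host and port come from the avroSource config above, and the flume-ng install path is assumed.

```shell
# Write one test line and send it as an Avro event to the collector source.
FLUME_BIN=/usr/lib/flume/bin/flume-ng      # assumed install path
EVENT_FILE=/tmp/flume-test-event.txt
echo "smoke test $(date)" > "$EVENT_FILE"
if [ -x "$FLUME_BIN" ]; then
  "$FLUME_BIN" avro-client --conf /opt/sponge/flume/config/ \
    -H hostname -p 1442 -F "$EVENT_FILE"
else
  echo "flume-ng not found at $FLUME_BIN" >&2
fi
```

If the pipeline is healthy, the event should show up as a new row in the HBase table within the channel's keep-alive window.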