@爱摩王涛: The power of data is the commanding height of future commerce, and cloud computing is its foundation. //@数据化管理: "From business intelligence to consumer intelligence": in the BI era, companies collect all kinds of data to support their own decisions. In the consumer-intelligence era, data-analysis services will instead be offered by companies to consumers, to support the consumers' own decisions. Bank-statement analysis follows this idea; B2C sites can likewise give individual shoppers an analysis of their own purchase behavior so they can decide for themselves. http://t.cn/zOga2xj
Decision support shifting from the enterprise to the individual user: a big-data analytics platform
http://www.cloudera.com/blog/2012/09/what-do-real-life-hadoop-workloads-look-like/
CDH4 HA Failover Time
- Blog category:
- hadoop
block size: 35M
file size: 96M
zk-session-timeout: 10s
logs:
active nn:Wed Sep 5 13:20:25 CST 2012
zk:
[zk: localhost:2181(CONNECTED) 19] get /hadoop-ha/mycluster/ActiveStandbyElectorLock
myclusternn1bd10 \ufffdF(\ufffd>
cZxid = 0xd90
ctime = Wed Sep 05 13:20:58 CST 2012
mZxid = 0xd90
mtime = W ...
CDH4 HA Failover
- Blog category:
- hadoop
HA failover problem
The failover takes too long...
copy 0 ...
Wed Sep 5 10:30:01 CST 2012
copy 1 ...
Wed Sep 5 10:30:18 CST 2012
copy 2 ...
Wed Sep 5 10:30:57 CST 2012
12/09/05 10:47:24 WARN retry.RetryInvocationHandler: Exception while invoking addBlock of class ClientNamenodeProtocolTranslatorPB. Trying to fail over immediat ...
From the logs, the Standby NN startup process is:
1. Obtain checkpoint information from the Active NN
2. Register live nodes in memory
3. The Standby NN enters Safe Mode
4. Receive block information from the DataNodes
5. Leave Safe Mode
Checkpointing active NN at bigdata-4:50070
Serving checkpoints at bigdata-3/172.16.206.206:50070
2012-08-02 11:07:24,761 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.r ...
Environment:
The active node is killed while data is being written.
Analysis:
The connection to the Active is lost and the Active returns no response. This exception needs to be caught and handled; a sleep can be added so that the Standby has time to switch to Active.
Logs:
2012-08-02 10:50:28,961 WARN ipc.Client (Client.java:run(787)) - Unexpected error reading responses on connection Thread[IPC Client (591210723) connection to bigdata-4/172.16.206 ...
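The catch-and-sleep idea from the analysis above can be sketched as follows. This is a simplified model, not the Hadoop client code; the function and exception names are assumptions, with `ConnectionError` standing in for the IPC exception seen in the log:

```python
import time

def call_with_failover_retry(rpc_call, max_retries=5, sleep_seconds=3):
    """Retry an RPC that may fail while the Standby NN becomes Active.

    rpc_call is any zero-argument callable; ConnectionError stands in
    for the "unexpected error reading responses" IPC failure above
    (a hypothetical mapping, not the real Hadoop retry policy).
    """
    for attempt in range(max_retries):
        try:
            return rpc_call()
        except ConnectionError:
            # The Active NN is gone; sleep so the Standby has time to
            # take over, then retry against the (new) Active.
            time.sleep(sleep_seconds)
    raise ConnectionError("still failing after %d retries" % max_retries)
```

The real client implements this pattern via its configured failover/retry policy (the `RetryInvocationHandler` in the log); the sketch only illustrates why a pause before retrying helps.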
CDH4 HA test
- Blog category:
- hadoop
Scenario:
NN HA is set up successfully, but the client throws an exception during HA failover.
Error analysis:
A problem in the shell script run by the user.
Logs:
Client:
2012-08-01 14:37:07,798 WARN ipc.Client (Client.java:run(787)) - Unexpected error reading responses on connection Thread[IPC Client (1333933549) connection to bigdata-3/172.16.206.206:9000 from peter,5,main]
java.lang.NullPointerEx ...
Hadoop TextOutput
- Blog category:
- hadoop
TextOutputFormat
Separator property:
mapreduce.output.textoutputformat.separator
StreamXmlRecordReader
Set the property:
stream.recordreader.class=org.apache.hadoop.streaming.StreamXmlRecordReader
For details, see http://mahout.apache.org/ (XMLInputFormat)
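What an XML record reader does can be sketched as a scan for begin/end patterns, emitting each complete span as one record. This is a simplified model of StreamXmlRecordReader's behavior, not its implementation; the default `<page>` tags are assumptions (the real reader is configured with begin/end patterns):

```python
def xml_records(text, begin="<page>", end="</page>"):
    """Yield each begin...end span of text as one record, a simplified
    model of how StreamXmlRecordReader turns an XML file into records."""
    records = []
    pos = 0
    while True:
        b = text.find(begin, pos)
        if b == -1:
            break  # no more records
        e = text.find(end, b)
        if e == -1:
            break  # unterminated record: stop rather than emit a partial one
        records.append(text[b:e + len(end)])
        pos = e + len(end)
    return records
```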
NLineInputFormat
Overrides the split computation.
Property:
mapreduce.input.lineinputformat.linespermap
Use case:
For example, create a data-source file in which each map task processes one line and connects to a different database.
With the number of reduces set to 0, this is a map-only job.
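The split computation NLineInputFormat overrides can be modeled as grouping the input lines so that each map task receives `linespermap` lines. A minimal sketch of that grouping (a simplified model, not the Hadoop implementation):

```python
def nline_splits(lines, lines_per_map):
    """Group input lines so each map task gets lines_per_map lines,
    mimicking NLineInputFormat's split computation (simplified model:
    the last split may be shorter)."""
    return [lines[i:i + lines_per_map]
            for i in range(0, len(lines), lines_per_map)]
```

With `lines_per_map=1`, a five-line data-source file yields five splits, so each map task handles exactly one line, matching the one-database-per-map use case above.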
Key/value separator:
mapreduce.input.keyvaluelinerecordreader.key.value.separator
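The reader controlled by this property splits each line at the first occurrence of the separator; a line with no separator becomes the key with an empty value. A minimal sketch of that parsing rule (a simplified model of KeyValueLineRecordReader, not the Hadoop code):

```python
def parse_key_value(line, separator="\t"):
    """Split a line into (key, value) at the FIRST separator, the way
    KeyValueLineRecordReader does; with no separator, the whole line is
    the key and the value is empty (simplified model, default tab)."""
    idx = line.find(separator)
    if idx == -1:
        return line, ""
    return line[:idx], line[idx + len(separator):]
```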
Hadoop: Controlling Split Size
- Blog category:
- hadoop
Three properties determine the map split size:
1. mapred.min.split.size
2. mapred.max.split.size
3. dfs.block.size
The formula is:
max(minimumSize, min(maximumSize, blockSize))
By default:
minimumSize < blockSize < maximumSize
Example:
min     max     block   split
1M      100M    64M     64M
128M    512M    64 ...
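The formula above is small enough to check directly. A one-line sketch (function name is mine, the formula is from the text):

```python
def split_size(min_size, max_size, block_size):
    """Map split size per the formula above:
    max(minimumSize, min(maximumSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

M = 1024 * 1024
# First row of the table: min=1M, max=100M, block=64M -> split=64M
print(split_size(1 * M, 100 * M, 64 * M) // M)  # 64
```

Raising `mapred.min.split.size` above the block size forces larger splits (second row of the table), at the cost of losing some data locality, since a split then spans multiple blocks.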
Setting up Disks for Hadoop
Here are some recommendations for setting up disks in a Hadoop cluster. What we have here is anecdotal; hard evidence is very welcome, and everyone should expect a bit of trial-and-error work.
Key Points
Goals for a Hadoop cluster are normally massive amounts of data wi ...
Compatibility
When moving from one release to another, you need to consider the upgrade steps that are required:
1.API compatibility
2.Data compatibility
3.Wire compatibility
http://hadoop.apache.org/common/docs/r0.23.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html