Article list
refer to the 0.20.6 version
Yes: since HBase will use the local fs for storage here, there is no need to start Hadoop first (in whatever mode it would run).
Simply do it like this:
start-hbase.sh
Check whether it succeeded:
a) jps
HMaster //the hbase master server
HQuorumPeer //a zookeeper instan ...
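Besides jps, you can also probe from a client. A minimal sketch against the 0.20-era HBase API (newer releases renamed these classes), assuming hbase-site.xml is on the classpath:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class HBaseUpCheck {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration(); // reads hbase-site.xml
        // the constructor throws MasterNotRunningException if HMaster is unreachable
        HBaseAdmin admin = new HBaseAdmin(conf);
        System.out.println("master running: " + admin.isMasterRunning());
    }
}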
some hdfs features of hadoop-0.20.2
High fault tolerance
a. When a block is replicated, not every replica has to succeed for the write to count as successful; there is a threshold, and reaching it is enough. The NN then checks for blocks that do not meet the required replication num and handles them (see the sketch after this list; surely this is only one of the mechanisms).
b. Skipping mode gives more flexible control.
High reliability
a. A redundancy mechanism is used, with replicas distributed across different racks.
b. In many cases an exception does not cause an immediate exit; instead a retry mechanism is used (since exceptions usually come from rpc communication), again with a bounded retry count.
Self-correction
a. uses ...
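Returning to point (a) under fault tolerance, a client-side illustration of the threshold idea (a minimal sketch; the path and factor are made up, and dfs.replication.min is the 0.20-era property holding the threshold):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // ask for 3 replicas; a write already counts as successful once
        // dfs.replication.min (default 1) replicas are on disk, and the NN
        // schedules the remaining copies in the background
        fs.setReplication(new Path("/user/test/data.txt"), (short) 3);
    }
}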
When run in a cluster (or pseudo mode) it is difficult to debug the code. I have tried some tricks to do it:
a. use the JPDA debug platform.
Add these options to the Hadoop opts after start-up and before running a job:
HADOOP_OPTS="$HADOOP_OPTS -agentlib:jdwp=transport=dt_socket,address=8001,server= ...
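For reference, a complete option string of this shape (standard JVM JDWP syntax, not Hadoop-specific) is -agentlib:jdwp=transport=dt_socket,address=8001,server=y,suspend=y: server=y makes the daemon's JVM listen on port 8001, and suspend=y holds it until a debugger attaches (an IDE remote-debug session, or jdb -attach localhost:8001).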
Here is my summary written while reading the sources. Considering my ability and the complexity of Hadoop, and since I have not read all the sources, there will be some illogical statements in it; so if you feel a little uncomfortable with them, tell me your ideas :)
1. Concepts
A Map (Mapper class) is a single map task; an InputS ...
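To pin the terminology down, a minimal sketch with the old (org.apache.hadoop.mapred) API; the class and output key are made up:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// one instance of this class runs inside each map task; a map task
// consumes exactly one InputSplit, record by record
public class LineCounterMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, IntWritable> out, Reporter reporter)
            throws IOException {
        out.collect(new Text("lines"), ONE); // emit one count per input record
    }
}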
sources study-part 5-hdfs - advanced features - blocks allocation policy
- Blog category:
- hadoop sources reading
TODO
sources study-part 4-mapreduce - advanced features - spill,merge and sort
- Blog category:
- hadoop sources reading
TODO
It is also used to schedule jobs and tasks. There are several implementations (how to pick one is sketched right after this list):
JobQueueTaskScheduler (the default; a FIFO algorithm);
FairScheduler;
CapacityScheduler;
...
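The 0.20-era JobTracker picks its scheduler from the mapred.jobtracker.taskScheduler property in mapred-site.xml; for example, to select the FairScheduler (whose jar must also be on the JobTracker classpath):

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>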
Here is the simple flow of JobQueueTaskScheduler in MapReduce:
"split" which is a logical
concept relatives to a "block" whis is real store unit.
when a client submit a job to JT,it will compute the splits by file,than the TT will generate InputSplit to map task.
so splits are used for spawn mappers
,if you use FileINputformat and se ...
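A small old-API sketch of where splits come from (the input path is made up; with FileInputFormat a split normally corresponds to one HDFS block):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class SplitDump {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SplitDump.class);
        conf.setInputFormat(TextInputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path("/user/test/input"));
        // this is what the job client does at submit time
        InputSplit[] splits = conf.getInputFormat().getSplits(conf, 1);
        for (InputSplit s : splits) {
            System.out.println(s); // each split spawns one map task
        }
    }
}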
When reading a file from HDFS, the client gets a DistributedFileSystem to communicate with the NameNode, and the DFS creates a DFSClient which creates a DFSInputStream that is encapsulated in an FSDataInputStream.
Of course, the input stream gets a LocatedBlocks which by default contains 10 blocks and their addresses a ...
data structure transform: file to hdfs, from the client
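The user-facing side of that chain, as a minimal sketch (the URI and path are made up):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadFromHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileSystem.get returns a DistributedFileSystem for hdfs:// URIs
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
        // open() goes DistributedFileSystem -> DFSClient -> DFSInputStream,
        // handed back to us wrapped in an FSDataInputStream
        FSDataInputStream in = fs.open(new Path("/user/test/file.txt"));
        try {
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}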
I chose to study the sources following the workflow from the FS to MR, so the steps are:
a.FS(IO)
b.MR
c.IPC
outline of API:
simple flow of hadoop:
As I reviewed the book, it said:
a. its goal is to let each user fairly share the cluster resources
b. jobs are placed in pools, and each user has their own pool
c. you can set the priority of a pool
d. this scheduler supports preemption
First, I thought, if I run a client to submit ...
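A sketch of how a job lands in a pool (the property names are the 0.20 contrib FairScheduler ones; the pool name is made up). If the JobTracker sets mapred.fairscheduler.poolnameproperty to pool.name (the default is user.name, which is what gives each user their own pool), a job can opt into a pool explicitly:

import org.apache.hadoop.mapred.JobConf;

public class SubmitToPool {
    public static void main(String[] args) {
        JobConf conf = new JobConf(SubmitToPool.class);
        // read by the FairScheduler via mapred.fairscheduler.poolnameproperty
        conf.set("pool.name", "research");
        // ... set input/output/mapper as usual, then JobClient.runJob(conf)
    }
}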
It is a trick to understand the shuffle principle in Hadoop; let us learn it together.
If you find something wrong in this description, please tell me why :)
Here is a 4 maps + 3 reducers example:
1.
Why are there three flows in the illustration (as marked by the numbered parts)?
When I check the property ...
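The three-way fan-out follows from partitioning: each map divides its output into numReduceTasks buckets, one per reducer. A minimal old-API sketch that mirrors the default HashPartitioner (the class name is made up):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// every map task calls this for each output record and writes the record
// into one of numPartitions (= number of reducers, 3 here) buckets;
// that per-reducer bucketing is exactly the fan-out seen in the figure
public class ModPartitioner implements Partitioner<Text, IntWritable> {
    public void configure(JobConf job) { }
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

Enable it per job with conf.setPartitionerClass(ModPartitioner.class).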
1. simple workflow of running a job
2. MR handled by a streaming or pipes process, i.e. by programs outside of Java
3. job or task status reported from the TT to the client through the JT
============
TaskTracker flow:
references:
http://home.lupaworld.com/home-space-uid-94908-do-blog-id-32301.html
http://sucre.iteye.com/blog/587673
There are many different shells in Linux. We usually use bash (the Bourne Again Shell) for shell programming, because bash is not only free but also easy to use.
#!/bin/sh — the #! tells the system which program executes the script (it must be on the first line of the file).
All variables consist of strings, and they need no declaration.
chmod +x filename # make the script executable
to avoid confusion:
echo ...