
install hadoop-2.5 without HDFS HA/Federation

 

 

I. Installation modes

  Same as hadoop 1.x, there are several modes in which hadoop can be installed:

1. standalone

  just run it on one machine, against the local filesystem; mapreduce also runs locally (a quick way to try it is sketched after this list).

2. pseudo

  set it up with HDFS on a single machine; this case contains two types:

 a. run hdfs only

  in this case mapreduce still runs in local mode, and you can see the job names come out as job_localxxxxxx

 b. run hdfs with yarn

  yes, this is the same as the distributed mode

3. distributed mode/cluster mode

  compared to item 2, this mode only adds a few more configuration properties and runs on more than one node.
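For item 1, a quick standalone test needs no configuration at all, since fs.defaultFS defaults to file:///. A sketch, assuming the 2.5.1 tarball is already unpacked and the hadoop command is on the PATH:

# run the bundled wordcount example directly against the local filesystem;
# no daemons need to be started in standalone mode
cd /usr/local/hadoop/hadoop-2.5.1
mkdir input
cp etc/hadoop/*.xml input/
# the output dir must not exist yet
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount input output
cat output/part-r-00000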

 

II. Configuration for cluster mode

For each property below: the value used in this setup, the default value, and a short summary.

core-site.xml

hadoop.tmp.dir
  value:   /usr/local/hadoop/data-2.5.1/tmp
  default: /tmp/hadoop-${user.name}
  summary: path to a tmp dir; sub dirs such as filecache, usercache and nmPrivate will be created under it, so this dir should not be left under /tmp in a production environment.

fs.defaultFS
  value:   hdfs://host1:9000
  default: file:///
  summary: the name of the default file system; this determines the installation mode (the corresponding deprecated property is fs.default.name). A URI whose scheme and authority determine the FileSystem implementation: the scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class, and the authority is used to determine the host, port, etc. for a filesystem.
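Written out as the actual file, the two core-site.xml entries above look like this (a sketch showing only these two properties; host1 is the namenode host from the table):

<configuration>
  <!-- base dir for HDFS/YARN working data; keep it off /tmp in production -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/data-2.5.1/tmp</value>
  </property>
  <!-- the hdfs:// scheme is what switches from local mode to HDFS -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://host1:9000</value>
  </property>
</configuration>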

hdfs-site.xml

dfs.nameservices
  value:   hadoop-cluster1
  default: (not set)
  summary: comma-separated list of nameservices; here there is a single NN only, no HA.

dfs.namenode.secondary.http-address
  value:   host1:50090
  default: 0.0.0.0:50090
  summary: the secondary namenode http server address and port.

dfs.namenode.name.dir
  value:   file:///usr/local/hadoop/data-2.5.1/dfs/name
  default: file://${hadoop.tmp.dir}/dfs/name
  summary: determines where on the local filesystem the DFS name node should store the name table (fsimage); if this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.

dfs.datanode.data.dir
  value:   file:///usr/local/hadoop/data-2.5.1/dfs/data
  default: file://${hadoop.tmp.dir}/dfs/data
  summary: determines where on the local filesystem a DFS data node should store its blocks; if this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices; directories that do not exist are ignored.

dfs.replication
  value:   1
  default: 3
  summary: the replication factor to assign to data blocks.

dfs.webhdfs.enabled
  value:   true
  default: true
  summary: enable WebHDFS (REST API) in Namenodes and Datanodes.
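The corresponding hdfs-site.xml, as a sketch with the values from the table:

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>host1:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop/data-2.5.1/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop/data-2.5.1/dfs/data</value>
  </property>
  <!-- a small test cluster, so a single replica is enough -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

Before the first start, format the namenode once with 'hdfs namenode -format', then bring HDFS up with sbin/start-dfs.sh.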

yarn-site.xml

yarn.nodemanager.aux-services
  value:   mapreduce_shuffle
  default: (not set)
  summary: the auxiliary service name; a valid service name may only contain a-zA-Z0-9_ and cannot start with a number.

yarn.resourcemanager.address
  value:   host1:8032
  default: ${yarn.resourcemanager.hostname}:8032
  summary: the address of the applications manager interface in the RM.

yarn.resourcemanager.scheduler.address
  value:   host1:8030
  default: ${yarn.resourcemanager.hostname}:8030
  summary: the scheduler address of the RM.

yarn.resourcemanager.resource-tracker.address
  value:   host1:8031
  default: ${yarn.resourcemanager.hostname}:8031
  summary: the resource tracker address of the RM, used by the nodemanagers.

yarn.resourcemanager.admin.address
  value:   host1:8033
  default: ${yarn.resourcemanager.hostname}:8033
  summary: the admin interface address of the RM.

yarn.resourcemanager.webapp.address
  value:   host1:50030
  default: ${yarn.resourcemanager.hostname}:8088
  summary: the web UI address of the RM; here set to 50030, the job tracker web UI port familiar from hadoop 1.x.
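A yarn-site.xml sketch with the essential entries (the remaining RM addresses from the table follow the same property pattern):

<configuration>
  <!-- required so that the nodemanagers run the MapReduce shuffle service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>host1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>host1:50030</value>
  </property>
</configuration>

Start the RM and the nodemanagers with sbin/start-yarn.sh.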
mapred-site.xml

mapreduce.framework.name
  value:   yarn
  default: local
  summary: the runtime framework for executing MapReduce jobs; can be one of local, classic or yarn.

mapreduce.jobhistory.address
  value:   host1:10020
  default: 0.0.0.0:10020
  summary: MapReduce JobHistory Server IPC host:port.

mapreduce.jobhistory.webapp.address
  value:   host1:19888
  default: 0.0.0.0:19888
  summary: MapReduce JobHistory Server web UI host:port.
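The matching mapred-site.xml sketch:

<configuration>
  <!-- 'yarn' is what makes jobs run on the cluster instead of as job_local* -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>host1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>host1:19888</value>
  </property>
</configuration>

Note that the job history ports are only served once the history server is started separately, e.g. with sbin/mr-jobhistory-daemon.sh start historyserver.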

III. Results of running MR on yarn

below are the logs from a mapreduce run in pseudo mode:

hadoop@ubuntu:/usr/local/hadoop/hadoop-2.5.1$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount wc wc-out

14/11/05 18:19:23 INFO client.RMProxy: Connecting to ResourceManager at namenode/192.168.1.25:8032

14/11/05 18:19:24 INFO input.FileInputFormat: Total input paths to process : 22

14/11/05 18:19:24 INFO mapreduce.JobSubmitter: number of splits:22

14/11/05 18:19:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415182439385_0001

14/11/05 18:19:25 INFO impl.YarnClientImpl: Submitted application application_1415182439385_0001

14/11/05 18:19:25 INFO mapreduce.Job: The url to track the job: http://namenode:50030/proxy/application_1415182439385_0001/

14/11/05 18:19:25 INFO mapreduce.Job: Running job: job_1415182439385_0001

14/11/05 18:19:32 INFO mapreduce.Job: Job job_1415182439385_0001 running in uber mode : false

14/11/05 18:19:32 INFO mapreduce.Job:  map 0% reduce 0%

14/11/05 18:19:44 INFO mapreduce.Job:  map 9% reduce 0%

14/11/05 18:19:45 INFO mapreduce.Job:  map 27% reduce 0%

14/11/05 18:19:54 INFO mapreduce.Job:  map 32% reduce 0%

14/11/05 18:19:55 INFO mapreduce.Job:  map 45% reduce 0%

14/11/05 18:19:56 INFO mapreduce.Job:  map 50% reduce 0%

14/11/05 18:20:02 INFO mapreduce.Job:  map 55% reduce 17%

14/11/05 18:20:03 INFO mapreduce.Job:  map 59% reduce 17%

14/11/05 18:20:05 INFO mapreduce.Job:  map 68% reduce 20%

14/11/05 18:20:06 INFO mapreduce.Job:  map 73% reduce 20%

14/11/05 18:20:08 INFO mapreduce.Job:  map 73% reduce 24%

14/11/05 18:20:11 INFO mapreduce.Job:  map 77% reduce 24%

14/11/05 18:20:12 INFO mapreduce.Job:  map 82% reduce 24%

14/11/05 18:20:13 INFO mapreduce.Job:  map 91% reduce 24%

14/11/05 18:20:14 INFO mapreduce.Job:  map 95% reduce 30%

14/11/05 18:20:16 INFO mapreduce.Job:  map 100% reduce 30%

14/11/05 18:20:17 INFO mapreduce.Job:  map 100% reduce 100%

14/11/05 18:20:18 INFO mapreduce.Job: Job job_1415182439385_0001 completed successfully

14/11/05 18:20:18 INFO mapreduce.Job: Counters: 49

File System Counters

FILE: Number of bytes read=54637

FILE: Number of bytes written=2338563

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=59677

HDFS: Number of bytes written=28233

HDFS: Number of read operations=69

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters 

Launched map tasks=22

Launched reduce tasks=1

Data-local map tasks=22

Total time spent by all maps in occupied slots (ms)=185554

Total time spent by all reduces in occupied slots (ms)=30206

Total time spent by all map tasks (ms)=185554

Total time spent by all reduce tasks (ms)=30206

Total vcore-seconds taken by all map tasks=185554

Total vcore-seconds taken by all reduce tasks=30206

Total megabyte-seconds taken by all map tasks=190007296

Total megabyte-seconds taken by all reduce tasks=30930944

Map-Reduce Framework

Map input records=1504

Map output records=5727

Map output bytes=77326

Map output materialized bytes=54763

Input split bytes=2498

Combine input records=5727

Combine output records=2838

Reduce input groups=1224

Reduce shuffle bytes=54763

Reduce input records=2838

Reduce output records=1224

Spilled Records=5676

Shuffled Maps =22

Failed Shuffles=0

Merged Map outputs=22

GC time elapsed (ms)=1707

CPU time spent (ms)=14500

Physical memory (bytes) snapshot=5178937344

Virtual memory (bytes) snapshot=22517506048

Total committed heap usage (bytes)=3882549248

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters 

Bytes Read=57179

File Output Format Counters 

Bytes Written=28233

 

FAQs

1. 2014-01-22 09:38:20,733 INFO  [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:transition(788)) - Application application_1390354688375_0001 failed 2 times due to AM Container for appattempt_1390354688375_0001_000002 exited with  exitCode: 127 due to: Exception from container-launch: 

  this may occur if you don't set up JAVA_HOME in yarn-env.sh and hadoop-env.sh (exit code 127 means the launched command was not found); remember to restart yarn after the change :) See the sketch below.
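A minimal fix is one line in each of etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh (the JDK path is only an example, point it at your own installation):

# both files are shell scripts sourced by the start-up scripts
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64   # example path, use your own JDK location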

2. two jobs appear when running the 'grep' example

  it's normal! at first I thought something was wrong, but when I ran wordcount again the result showed one job only; the grep example simply chains two MR jobs, a search job followed by a sort job.

 

ref: apache install hadoop 2

 

 

 
