Hadoop Study Notes (2): Setting Up a Single-Node Cluster
This article describes how to set up a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Official reference: Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Hadoop version: Apache Hadoop 2.5.1
OS version: CentOS 6.5, kernel (uname -r): 2.6.32-431.el6.x86_64
Prerequisites
Supported platforms
GNU/Linux is supported as a development and production platform, without question. Windows is also a supported platform, but the steps below apply to Linux only.
Required software
Required software for Linux includes:
1. Java (JDK) must be installed. For recommended versions, see Hadoop Java Versions; I installed 1.7 here.
2. ssh must be installed, and sshd must be running in order to use the Hadoop scripts that manage remote Hadoop daemons.
Installing the required software
If your system does not have the required software, you will need to install it.
For example, on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Even a minimal CentOS install should come with ssh (Secure Shell). At first I confused it with the Java "SSH" stack (Spring + Struts + Hibernate), oops!
To install the JDK, see: Installing JDK 7 on CentOS
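Before moving on, it can be worth a quick sanity check that the prerequisites are on the PATH. A minimal sketch (the tool names are the ones listed above; extend as needed):

```shell
# Report whether each required tool is installed (sketch).
for tool in ssh rsync java; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
  fi
done
```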
Download
Not much to say here; the package was downloaded in the previous post: Hadoop Study Notes (1): Downloading the Installation Package from the Official Site
Preparing to Start the Hadoop Cluster
Extract hadoop-2.5.1.tar.gz by running: tar xvf hadoop-2.5.1.tar.gz. This unpacks the files into the hadoop-2.5.1 directory;
Change directory: cd hadoop-2.5.1/etc/hadoop/
Edit the "hadoop-env.sh" file and add the definitions below;
vi hadoop-env.sh
A good habit, in my opinion, is to back a file up before editing it (cp hadoop-env.sh hadoop-env.sh.bak);
Find the following lines:
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}
Change them to:
# The java implementation to use.
export JAVA_HOME=/usr/java/latest
Then add one more line below:
# Assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
Save and exit (Esc, then :wq).
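The same edit can also be done non-interactively. A sketch using sed, shown here on a throwaway demo file so it is safe to try anywhere (substitute hadoop-env.sh and your real paths):

```shell
# Create a demo file standing in for hadoop-env.sh (safe to run anywhere).
printf 'export JAVA_HOME=${JAVA_HOME}\n' > hadoop-env.demo.sh

cp hadoop-env.demo.sh hadoop-env.demo.sh.bak   # back up before editing
# Point JAVA_HOME at the installed JDK and append HADOOP_PREFIX.
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/latest|' hadoop-env.demo.sh
echo 'export HADOOP_PREFIX=/usr/local/hadoop' >> hadoop-env.demo.sh
cat hadoop-env.demo.sh
```

After running this, the demo file contains the two export lines shown above.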
Change directory back (cd ../..) to "/opt/hadoop-2.5.1";
Try running the following command:
./bin/hadoop
This displays the usage documentation for the hadoop script. The output looks like:
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.
You can now start the Hadoop cluster in one of three supported modes:
- Local (standalone) mode
- Pseudo-distributed mode
- Fully-distributed mode
Standalone Operation
By default, Hadoop is configured to run in non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
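To see what this example job computes without Hadoop, the same regular expression can be tried with plain grep on a made-up sample line (a sketch, not part of the official steps; sample.xml and its contents are invented for illustration):

```shell
# The example job searches the copied config files for matches of 'dfs[a-z.]+';
# plain grep shows the matching behaviour on a tiny sample.
printf '<name>dfs.replication</name>\n<name>io.file.buffer.size</name>\n' > sample.xml
grep -oE 'dfs[a-z.]+' sample.xml
```

Here only the first line matches, so grep prints dfs.replication.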
However, when executing "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'",
an error occurred: Error: Could not find or load main class org.apache.hadoop.util.RunJar
I only found this issue mentioned on Stack Overflow:
What does "Error: Could not find or load main class org.apache.hadoop.util.RunJar"?
but no solution was given there either, so I had to work it out myself.
Resolution steps:
The "hadoop-env.sh" backup made earlier now comes in handy: restore it.
Running "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" again
prints:
./bin/hadoop: line 133: /usr/java/jdk1.7.0/bin/java: No such file or directory
./bin/hadoop: line 133: exec: /usr/java/jdk1.7.0/bin/java: cannot execute: No such file or directory
Judging by the message, this is still a Java (JDK) installation problem. When installing the JDK, all I had run was
rpm -ivh /<directory>/jdk-7-linux-x64.rpm
and nothing further. After completing the remaining installation steps, I ran "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" again,
which output:
14/10/07 03:35:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/10/07 03:35:58 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 14/10/07 03:35:58 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 14/10/07 03:35:59 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String). 14/10/07 03:35:59 INFO input.FileInputFormat: Total input paths to process : 6 14/10/07 03:35:59 INFO mapreduce.JobSubmitter: number of splits:6 14/10/07 03:36:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1185570365_0001 14/10/07 03:36:00 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root1185570365/.staging/job_local1185570365_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root1185570365/.staging/job_local1185570365_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1185570365_0001/job_local1185570365_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1185570365_0001/job_local1185570365_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
14/10/07 03:36:01 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 14/10/07 03:36:01 INFO mapreduce.Job: Running job: job_local1185570365_0001 14/10/07 03:36:01 INFO mapred.LocalJobRunner: OutputCommitter set in config null 14/10/07 03:36:01 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 14/10/07 03:36:02 INFO mapred.LocalJobRunner: Waiting for map tasks 14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000000_0 14/10/07 03:36:02 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hadoop-policy.xml:0+9201 14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:02 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:02 INFO mapred.LocalJobRunner: 14/10/07 03:36:02 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:02 INFO mapred.MapTask: Spilling map output 14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600 14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600 14/10/07 03:36:02 INFO mapreduce.Job: Job job_local1185570365_0001 running in uber mode : false 14/10/07 03:36:02 INFO mapred.MapTask: Finished spill 0 14/10/07 03:36:02 INFO mapreduce.Job: map 0% reduce 0% 14/10/07 03:36:02 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000000_0 is done. 
And is in the process of committing 14/10/07 03:36:02 INFO mapred.LocalJobRunner: map 14/10/07 03:36:02 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000000_0' done. 14/10/07 03:36:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000000_0 14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000001_0 14/10/07 03:36:02 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/capacity-scheduler.xml:0+3589 14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:02 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:02 INFO mapred.LocalJobRunner: 14/10/07 03:36:02 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:02 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000001_0 is done. And is in the process of committing 14/10/07 03:36:02 INFO mapred.LocalJobRunner: map 14/10/07 03:36:02 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000001_0' done. 
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000001_0 14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000002_0 14/10/07 03:36:02 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hdfs-site.xml:0+775 14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000002_0 is done. And is in the process of committing 14/10/07 03:36:03 INFO mapred.LocalJobRunner: map 14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000002_0' done. 
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000002_0 14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000003_0 14/10/07 03:36:03 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/core-site.xml:0+774 14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000003_0 is done. And is in the process of committing 14/10/07 03:36:03 INFO mapred.LocalJobRunner: map 14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000003_0' done. 
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000003_0 14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000004_0 14/10/07 03:36:03 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/yarn-site.xml:0+690 14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000004_0 is done. And is in the process of committing 14/10/07 03:36:03 INFO mapred.LocalJobRunner: map 14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000004_0' done. 
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000004_0 14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000005_0 14/10/07 03:36:03 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/httpfs-site.xml:0+620 14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080 14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output 14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000005_0 is done. And is in the process of committing 14/10/07 03:36:03 INFO mapred.LocalJobRunner: map 14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000005_0' done. 14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000005_0 14/10/07 03:36:03 INFO mapred.LocalJobRunner: map task executor complete. 
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Waiting for reduce tasks 14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_r_000000_0 14/10/07 03:36:03 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/07 03:36:03 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@57931be2 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10 14/10/07 03:36:03 INFO reduce.EventFetcher: attempt_local1185570365_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000001_0 decomp: 2 len: 6 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000001_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2 14/10/07 03:36:03 INFO mapreduce.Job: map 100% reduce 0% 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000004_0 decomp: 2 len: 6 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000004_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 2, commitMemory -> 2, usedMemory ->4 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000005_0 decomp: 2 len: 6 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000005_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> 
map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 4, usedMemory ->6 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000002_0 decomp: 2 len: 6 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000002_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 6, usedMemory ->8 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000003_0 decomp: 2 len: 6 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000003_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 8, usedMemory ->10 14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000000_0 decomp: 21 len: 25 to MEMORY 14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1185570365_0001_m_000000_0 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 6, commitMemory -> 10, usedMemory ->31 14/10/07 03:36:03 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 6 / 6 copied. 
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: finalMerge called with 6 in-memory map-outputs and 0 on-disk map-outputs 14/10/07 03:36:03 INFO mapred.Merger: Merging 6 sorted segments 14/10/07 03:36:03 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merged 6 segments, 31 bytes to disk to satisfy reduce memory limit 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk 14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce 14/10/07 03:36:03 INFO mapred.Merger: Merging 1 sorted segments 14/10/07 03:36:03 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes 14/10/07 03:36:03 INFO mapred.LocalJobRunner: 6 / 6 copied. 14/10/07 03:36:04 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords 14/10/07 03:36:04 INFO mapred.Task: Task:attempt_local1185570365_0001_r_000000_0 is done. And is in the process of committing 14/10/07 03:36:04 INFO mapred.LocalJobRunner: 6 / 6 copied. 14/10/07 03:36:04 INFO mapred.Task: Task attempt_local1185570365_0001_r_000000_0 is allowed to commit now 14/10/07 03:36:04 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1185570365_0001_r_000000_0' to file:/opt/hadoop-2.5.1/grep-temp-767563685/_temporary/0/task_local1185570365_0001_r_000000 14/10/07 03:36:04 INFO mapred.LocalJobRunner: reduce > reduce 14/10/07 03:36:04 INFO mapred.Task: Task 'attempt_local1185570365_0001_r_000000_0' done. 14/10/07 03:36:04 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_r_000000_0 14/10/07 03:36:04 INFO mapred.LocalJobRunner: reduce task executor complete. 
14/10/07 03:36:04 INFO mapreduce.Job: map 100% reduce 100% 14/10/07 03:36:04 INFO mapreduce.Job: Job job_local1185570365_0001 completed successfully 14/10/07 03:36:04 INFO mapreduce.Job: Counters: 33 File System Counters FILE: Number of bytes read=114663 FILE: Number of bytes written=1613316 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 Map-Reduce Framework Map input records=405 Map output records=1 Map output bytes=17 Map output materialized bytes=55 Input split bytes=657 Combine input records=1 Combine output records=1 Reduce input groups=1 Reduce shuffle bytes=55 Reduce input records=1 Reduce output records=1 Spilled Records=2 Shuffled Maps =6 Failed Shuffles=0 Merged Map outputs=6 GC time elapsed (ms)=225 CPU time spent (ms)=0 Physical memory (bytes) snapshot=0 Virtual memory (bytes) snapshot=0 Total committed heap usage (bytes)=1106100224 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=15649 File Output Format Counters Bytes Written=123 14/10/07 03:36:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/opt/hadoop-2.5.1/output already exists at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at 
org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303) at org.apache.hadoop.examples.Grep.run(Grep.java:92) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.examples.Grep.main(Grep.java:101) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145) at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Output directory file:/opt/hadoop-2.5.1/output already exists. Ah, the cause is that the output directory already exists (I had created it earlier while troubleshooting);
Delete the output directory (rm -rf output);
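Since MapReduce refuses to overwrite an existing output directory, a small wrapper that clears it before each run avoids this error on reruns. A sketch (run_grep_example is my own name, not a Hadoop command; run it from the hadoop-2.5.1 directory):

```shell
# Remove any stale output directory, then run the example job.
run_grep_example() {
  rm -rf output
  bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar \
    grep input output 'dfs[a-z.]+'
}
```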
Then run the "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" command again. The output is as follows:
14/10/08 05:57:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/10/08 05:57:35 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 14/10/08 05:57:35 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 14/10/08 05:57:36 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String). 14/10/08 05:57:36 INFO input.FileInputFormat: Total input paths to process : 6 14/10/08 05:57:36 INFO mapreduce.JobSubmitter: number of splits:6 14/10/08 05:57:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local380762736_0001 14/10/08 05:57:37 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root380762736/.staging/job_local380762736_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/10/08 05:57:37 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root380762736/.staging/job_local380762736_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/10/08 05:57:38 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local380762736_0001/job_local380762736_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/10/08 05:57:38 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local380762736_0001/job_local380762736_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
14/10/08 05:57:38 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 14/10/08 05:57:38 INFO mapreduce.Job: Running job: job_local380762736_0001 14/10/08 05:57:38 INFO mapred.LocalJobRunner: OutputCommitter set in config null 14/10/08 05:57:38 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 14/10/08 05:57:38 INFO mapred.LocalJobRunner: Waiting for map tasks 14/10/08 05:57:38 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000000_0 14/10/08 05:57:39 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:39 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hadoop-policy.xml:0+9201 14/10/08 05:57:39 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:39 INFO mapreduce.Job: Job job_local380762736_0001 running in uber mode : false 14/10/08 05:57:39 INFO mapreduce.Job: map 0% reduce 0% 14/10/08 05:57:43 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:43 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:43 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:43 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:43 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:44 INFO mapred.LocalJobRunner: 14/10/08 05:57:44 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:44 INFO mapred.MapTask: Spilling map output 14/10/08 05:57:44 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600 14/10/08 05:57:44 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600 14/10/08 05:57:44 INFO mapred.MapTask: Finished spill 0 14/10/08 05:57:44 INFO mapred.Task: Task:attempt_local380762736_0001_m_000000_0 is done. 
And is in the process of committing 14/10/08 05:57:45 INFO mapred.LocalJobRunner: map 14/10/08 05:57:45 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000000_0' done. 14/10/08 05:57:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000000_0 14/10/08 05:57:45 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000001_0 14/10/08 05:57:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:45 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/capacity-scheduler.xml:0+3589 14/10/08 05:57:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:45 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:45 INFO mapred.LocalJobRunner: 14/10/08 05:57:45 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:45 INFO mapred.Task: Task:attempt_local380762736_0001_m_000001_0 is done. And is in the process of committing 14/10/08 05:57:45 INFO mapred.LocalJobRunner: map 14/10/08 05:57:45 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000001_0' done. 
14/10/08 05:57:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000001_0 14/10/08 05:57:45 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000002_0 14/10/08 05:57:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:45 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hdfs-site.xml:0+775 14/10/08 05:57:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:46 INFO mapreduce.Job: map 100% reduce 0% 14/10/08 05:57:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:46 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:46 INFO mapred.LocalJobRunner: 14/10/08 05:57:46 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:46 INFO mapred.Task: Task:attempt_local380762736_0001_m_000002_0 is done. And is in the process of committing 14/10/08 05:57:46 INFO mapred.LocalJobRunner: map 14/10/08 05:57:46 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000002_0' done. 
14/10/08 05:57:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000002_0 14/10/08 05:57:46 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000003_0 14/10/08 05:57:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:46 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/core-site.xml:0+774 14/10/08 05:57:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:47 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:47 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:47 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:47 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:47 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:47 INFO mapred.LocalJobRunner: 14/10/08 05:57:47 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:47 INFO mapred.Task: Task:attempt_local380762736_0001_m_000003_0 is done. And is in the process of committing 14/10/08 05:57:47 INFO mapred.LocalJobRunner: map 14/10/08 05:57:47 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000003_0' done. 
14/10/08 05:57:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000003_0 14/10/08 05:57:47 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000004_0 14/10/08 05:57:47 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:47 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/yarn-site.xml:0+690 14/10/08 05:57:47 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:49 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:49 INFO mapred.LocalJobRunner: 14/10/08 05:57:49 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:49 INFO mapred.Task: Task:attempt_local380762736_0001_m_000004_0 is done. And is in the process of committing 14/10/08 05:57:49 INFO mapred.LocalJobRunner: map 14/10/08 05:57:49 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000004_0' done. 
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000004_0 14/10/08 05:57:49 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000005_0 14/10/08 05:57:49 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/08 05:57:49 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/httpfs-site.xml:0+620 14/10/08 05:57:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 14/10/08 05:57:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 14/10/08 05:57:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 14/10/08 05:57:49 INFO mapred.MapTask: soft limit at 83886080 14/10/08 05:57:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 14/10/08 05:57:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 14/10/08 05:57:49 INFO mapred.LocalJobRunner: 14/10/08 05:57:49 INFO mapred.MapTask: Starting flush of map output 14/10/08 05:57:49 INFO mapred.Task: Task:attempt_local380762736_0001_m_000005_0 is done. And is in the process of committing 14/10/08 05:57:49 INFO mapred.LocalJobRunner: map 14/10/08 05:57:49 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000005_0' done. 14/10/08 05:57:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000005_0 14/10/08 05:57:49 INFO mapred.LocalJobRunner: map task executor complete. 
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Waiting for reduce tasks
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_r_000000_0
14/10/08 05:57:49 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:49 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6d36df08
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
14/10/08 05:57:50 INFO reduce.EventFetcher: attempt_local380762736_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000000_0 decomp: 21 len: 25 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local380762736_0001_m_000000_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->21
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000004_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000004_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 2, commitMemory -> 21, usedMemory ->23
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000003_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 23, usedMemory ->25
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000005_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000005_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 25, usedMemory ->27
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000001_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000001_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 27, usedMemory ->29
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000002_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000002_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 6, commitMemory -> 29, usedMemory ->31
14/10/08 05:57:50 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: finalMerge called with 6 in-memory map-outputs and 0 on-disk map-outputs
14/10/08 05:57:50 INFO mapred.Merger: Merging 6 sorted segments
14/10/08 05:57:50 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merged 6 segments, 31 bytes to disk to satisfy reduce memory limit
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
14/10/08 05:57:50 INFO mapred.Merger: Merging 1 sorted segments
14/10/08 05:57:50 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/08 05:57:50 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
14/10/08 05:57:50 INFO mapred.Task: Task:attempt_local380762736_0001_r_000000_0 is done. And is in the process of committing
14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/08 05:57:50 INFO mapred.Task: Task attempt_local380762736_0001_r_000000_0 is allowed to commit now
14/10/08 05:57:50 INFO output.FileOutputCommitter: Saved output of task 'attempt_local380762736_0001_r_000000_0' to file:/opt/hadoop-2.5.1/grep-temp-913340630/_temporary/0/task_local380762736_0001_r_000000
14/10/08 05:57:50 INFO mapred.LocalJobRunner: reduce > reduce
14/10/08 05:57:50 INFO mapred.Task: Task 'attempt_local380762736_0001_r_000000_0' done.
14/10/08 05:57:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_r_000000_0
14/10/08 05:57:50 INFO mapred.LocalJobRunner: reduce task executor complete.
14/10/08 05:57:51 INFO mapreduce.Job: map 100% reduce 100%
14/10/08 05:57:51 INFO mapreduce.Job: Job job_local380762736_0001 completed successfully
14/10/08 05:57:51 INFO mapreduce.Job: Counters: 33
    File System Counters
        FILE: Number of bytes read=114663
        FILE: Number of bytes written=1604636
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=405
        Map output records=1
        Map output bytes=17
        Map output materialized bytes=55
        Input split bytes=657
        Combine input records=1
        Combine output records=1
        Reduce input groups=1
        Reduce shuffle bytes=55
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =6
        Failed Shuffles=0
        Merged Map outputs=6
        GC time elapsed (ms)=2359
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=1106096128
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=15649
    File Output Format Counters
        Bytes Written=123
14/10/08 05:57:51 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
14/10/08 05:57:51 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/10/08 05:57:51 INFO input.FileInputFormat: Total input paths to process : 1
14/10/08 05:57:51 INFO mapreduce.JobSubmitter: number of splits:1
14/10/08 05:57:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local571678604_0002
14/10/08 05:57:51 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root571678604/.staging/job_local571678604_0002/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/10/08 05:57:51 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root571678604/.staging/job_local571678604_0002/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/10/08 05:57:52 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local571678604_0002/job_local571678604_0002.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/10/08 05:57:52 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local571678604_0002/job_local571678604_0002.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/10/08 05:57:52 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/10/08 05:57:52 INFO mapreduce.Job: Running job: job_local571678604_0002
14/10/08 05:57:52 INFO mapred.LocalJobRunner: OutputCommitter set in config null
14/10/08 05:57:52 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Waiting for map tasks
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Starting task: attempt_local571678604_0002_m_000000_0
14/10/08 05:57:52 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:52 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/grep-temp-913340630/part-r-00000:0+111
14/10/08 05:57:52 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:52 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:52 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:52 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:52 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:52 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:52 INFO mapred.LocalJobRunner:
14/10/08 05:57:52 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:52 INFO mapred.MapTask: Spilling map output
14/10/08 05:57:52 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
14/10/08 05:57:52 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
14/10/08 05:57:52 INFO mapred.MapTask: Finished spill 0
14/10/08 05:57:52 INFO mapred.Task: Task:attempt_local571678604_0002_m_000000_0 is done. And is in the process of committing
14/10/08 05:57:52 INFO mapred.LocalJobRunner: map
14/10/08 05:57:52 INFO mapred.Task: Task 'attempt_local571678604_0002_m_000000_0' done.
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Finishing task: attempt_local571678604_0002_m_000000_0
14/10/08 05:57:52 INFO mapred.LocalJobRunner: map task executor complete.
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Waiting for reduce tasks
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Starting task: attempt_local571678604_0002_r_000000_0
14/10/08 05:57:52 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:52 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@63ae8b5c
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
14/10/08 05:57:52 INFO reduce.EventFetcher: attempt_local571678604_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
14/10/08 05:57:52 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local571678604_0002_m_000000_0 decomp: 21 len: 25 to MEMORY
14/10/08 05:57:52 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local571678604_0002_m_000000_0
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->21
14/10/08 05:57:52 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
14/10/08 05:57:52 INFO mapred.Merger: Merging 1 sorted segments
14/10/08 05:57:52 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merged 1 segments, 21 bytes to disk to satisfy reduce memory limit
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
14/10/08 05:57:52 INFO mapred.Merger: Merging 1 sorted segments
14/10/08 05:57:52 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.
14/10/08 05:57:52 INFO mapred.Task: Task:attempt_local571678604_0002_r_000000_0 is done. And is in the process of committing
14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.
14/10/08 05:57:52 INFO mapred.Task: Task attempt_local571678604_0002_r_000000_0 is allowed to commit now
14/10/08 05:57:52 INFO output.FileOutputCommitter: Saved output of task 'attempt_local571678604_0002_r_000000_0' to file:/opt/hadoop-2.5.1/output/_temporary/0/task_local571678604_0002_r_000000
14/10/08 05:57:52 INFO mapred.LocalJobRunner: reduce > reduce
14/10/08 05:57:52 INFO mapred.Task: Task 'attempt_local571678604_0002_r_000000_0' done.
14/10/08 05:57:52 INFO mapred.LocalJobRunner: Finishing task: attempt_local571678604_0002_r_000000_0
14/10/08 05:57:52 INFO mapred.LocalJobRunner: reduce task executor complete.
14/10/08 05:57:53 INFO mapreduce.Job: Job job_local571678604_0002 running in uber mode : false
14/10/08 05:57:53 INFO mapreduce.Job: map 100% reduce 100%
14/10/08 05:57:53 INFO mapreduce.Job: Job job_local571678604_0002 completed successfully
14/10/08 05:57:53 INFO mapreduce.Job: Counters: 33
    File System Counters
        FILE: Number of bytes read=39892
        FILE: Number of bytes written=913502
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=1
        Map output records=1
        Map output bytes=17
        Map output materialized bytes=25
        Input split bytes=120
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=25
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=37
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=250560512
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=123
    File Output Format Counters
        Bytes Written=23
OK, it finally worked.
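Reading the log carefully, the bundled grep example actually chains two local jobs: the first one (writing into the grep-temp-913340630 directory) counts every match of a regular expression across the input XML files, and the second one re-sorts those (match, count) pairs by descending count into the output directory. As a rough mental model only, that two-phase pipeline can be sketched in plain Python; the function name grep_job is made up for illustration, and the pattern 'dfs[a-z.]+' is the one the official single-node guide uses, not something shown in this log:

```python
import re
from collections import Counter
from pathlib import Path

def grep_job(input_dir, regex):
    """Sketch of what the Hadoop grep example computes (not its real code).

    Phase 1 ("grep" job): count every regex match across all input files.
    Phase 2 ("sort" job): order the (match, count) pairs by count, descending.
    """
    pattern = re.compile(regex)
    counts = Counter()
    for path in sorted(Path(input_dir).iterdir()):
        # map + combine: extract matches per file and accumulate counts
        counts.update(pattern.findall(path.read_text()))
    # the second job's reduce: sort by descending count (ties by key)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

In the real run each phase is a full MapReduce job, which is why the log shows two job IDs (job_local380762736_0001 and job_local571678604_0002) with their own counter blocks.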