Exceptions when submitting MR jobs from Eclipse 4.2 to Hadoop 2.2

I have submitted MR jobs to Hadoop directly from Eclipse before, and they ran successfully. This time, after switching to a new cluster environment, I hit a couple of exceptions while submitting jobs, so I'm writing them up here to save myself (and others) the trouble next time.

There were two main problems.
The first problem: when submitting a job to Hadoop from Eclipse on Windows 7, the client had no permission. The exception was:


Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=qindongliang, access=EXECUTE, inode="/tmp":search:supergroup:drwx------
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:187)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:150)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5185)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5167)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOwner(FSNamesystem.java:5123)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermissionInt(FSNamesystem.java:1338)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1317)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:528)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setPermission(ClientNamenodeProtocolServerSideTranslatorPB.java:348)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59576)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

	at org.apache.hadoop.ipc.Client.call(Client.java:1347)
	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at $Proxy9.setPermission(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at $Proxy9.setPermission(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setPermission(ClientNamenodeProtocolTranslatorPB.java:277)
	at org.apache.hadoop.hdfs.DFSClient.setPermission(DFSClient.java:2045)
	... 16 more


The second problem: the submitted MR job would not start executing for a long time. If it happened to run on the master it executed normally, but if it was placed on a slave machine it stayed blocked indefinitely. The log output looked like this:
2014-10-31 17:48:08,453 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1414748532081_0002
2014-10-31 17:48:08,457 INFO [Socket Reader #1 for port 37494] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 37494
2014-10-31 17:48:08,465 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-10-31 17:48:08,468 INFO [IPC Server listener on 37494] org.apache.hadoop.ipc.Server: IPC Server listener on 37494: starting
2014-10-31 17:48:08,504 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2014-10-31 17:48:08,504 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2014-10-31 17:48:08,504 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2014-10-31 17:48:08,560 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2014-10-31 17:48:14,580 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:15,583 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:16,587 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:17,590 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:18,592 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:19,595 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:20,597 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:21,602 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:22,606 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:23,608 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:54,621 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:55,624 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:56,626 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:57,628 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:58,631 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:48:59,633 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:00,635 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:01,638 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:02,641 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:03,643 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:34,653 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:35,655 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:36,657 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is Re



Now let's go through the causes of these two problems in detail.
The first problem is a permission issue, and the exception message states that clearly. For example, suppose you installed Hadoop on Linux under a hadoop account (the Hadoop user name can be configured in core-site.xml):

<property>  
    <name>hadoop.http.staticuser.user</name>  
    <value>hadoop</value>  
</property>  



OK, so the cluster runs under the hadoop account. When you submit from Windows 7, the client by default uses your local Windows user name; in my case the machine account is qindongliang. So during submission Hadoop's permission check sees an unknown user submitting a job and rejects it before the MR job even starts, which produces the error shown at the beginning of this article. Now that we know the cause, let's think about how to fix it.
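If you're unsure which user name the client will actually present to the cluster, you can print it on the submitting machine. Here's a minimal sketch (the class name and output text are my own, not from the original post):

import org.apache.hadoop.security.UserGroupInformation;

public class WhoAmI {
    public static void main(String[] args) throws Exception {
        // The identity the Hadoop client will present to the cluster.
        // With simple authentication it falls back to the local OS user
        // (e.g. the Windows account "qindongliang") unless HADOOP_USER_NAME is set.
        UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
        System.out.println("Hadoop client user: " + ugi.getShortUserName());
    }
}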

There are six main approaches:
(1) Rename the Linux user that runs the Hadoop cluster to qindongliang.
(2) Open up permissions on the relevant HDFS directory, e.g. hadoop fs -chmod 777 /user/hadoop
(3) Disable HDFS permission checking by setting dfs.permissions to false (tested; it did not help here).
(4) Rename the Windows 7 account to hadoop.
(5) Add a HADOOP_USER_NAME environment variable on Windows 7 and set it to the corresponding Linux user name.
(6) Set HADOOP_USER_NAME temporarily in the submitting program's code so it matches the Linux user name (see the code below).



Looking at these approaches: the first three change things on the Linux side, which effectively means touching the server, while the last three only touch the Windows 7 client. On the principle of not changing the server if you can avoid it, I recommend fixing this on the client. I used the last approach, setting the user name in the program itself. If you find that cumbersome, you can set the environment variable instead; note that you need to restart Eclipse after changing it, and from then on that name is permanently used as your Hadoop submission user.


Setting the Hadoop user name in the program is the more flexible option; the code is as follows:


System.setProperty("HADOOP_USER_NAME", "hadoop");


Add this line as the first statement of your main method.
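For clarity, here is a minimal sketch of where that line sits in a driver class (the class name is a placeholder of mine):

import org.apache.hadoop.conf.Configuration;

public class MRDriver {
    public static void main(String[] args) throws Exception {
        // Must run before any other Hadoop client code, so the client
        // identifies itself as "hadoop" instead of the local Windows user.
        System.setProperty("HADOOP_USER_NAME", "hadoop");

        Configuration conf = new Configuration();
        // ... the usual Job setup and submission follows ...
    }
}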

Now for the second problem. The symptoms, in detail:
When the MR ApplicationMaster was launched on the master machine, the MR job ran fine.
When the MR ApplicationMaster was launched on a slave machine, the MR job hung.
No MapReduce progress was ever reported, and none of the logs contained an error message; they just kept printing INFO lines like these:


2014-10-31 17:49:38,661 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:39,663 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:40,665 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:41,668 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:42,671 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:49:43,673 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:14,684 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:15,687 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:16,689 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:17,691 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:18,692 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:19,695 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:20,699 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:21,702 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:22,705 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:23,707 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:54,717 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:55,719 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:56,721 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-10-31 17:50:57,723 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry polic

The messages above are caused by a host-resolution problem. The fix is to add the following line to the submission code:
conf.set("yarn.resourcemanager.scheduler.address", "192.168.223.163:8030"); 


If the scheduler address is not injected, it defaults to 0.0.0.0:8030 on the node. When the MR ApplicationMaster happens to start on the master machine, 0.0.0.0:8030 resolves to a scheduler on that same machine, so everything works; but when it starts on a slave machine, no scheduler can be found at 0.0.0.0:8030, because the scheduler (the ResourceManager) runs on the master.

Once we know the cause, all it takes is adding the scheduler's address in the code!
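Putting both fixes together, a minimal driver sketch might look like the following. The class name, job name, and the commented-out job setup are placeholders of mine; the IP 192.168.223.163 is the ResourceManager address of this cluster, so replace it with yours:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitFromEclipse {
    public static void main(String[] args) throws Exception {
        // Fix for problem 1: submit as the cluster's "hadoop" user.
        System.setProperty("HADOOP_USER_NAME", "hadoop");

        Configuration conf = new Configuration();
        // Fix for problem 2: point the ApplicationMaster at the real
        // ResourceManager scheduler instead of the 0.0.0.0:8030 default.
        conf.set("yarn.resourcemanager.scheduler.address", "192.168.223.163:8030");

        Job job = Job.getInstance(conf, "demo-job");
        job.setJarByClass(SubmitFromEclipse.class);
        // ... set mapper, reducer, and input/output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If you'd rather not hard-code the address in the client, the same property (or yarn.resourcemanager.hostname) can instead be configured in yarn-site.xml on the cluster's nodes; I mention this only as an alternative, not the approach used above.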


