The previous article showed how to debug Hadoop 2.2 correctly in Eclipse using local mode. In this one, I will focus on how to genuinely submit jobs from Eclipse to YARN and debug in fully distributed mode. Being able to debug Hadoop MapReduce programs from Eclipse makes learning Hadoop much easier and clearer.
If you have not read my earlier article on debugging Hadoop in local mode from Eclipse, read that first to get familiar with the basic problems and their fixes.
Now to the main topic. Since local-mode debugging in Eclipse already works, switching to distributed mode is not too difficult. Using Eclipse as a client to submit jobs to the YARN cluster requires packaging the whole project into a jar; I use an Ant script for this, which is attached at the end of the article. The biggest problem I ran into is the following exception:
- 2014-06-11 17:32:19,761 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1401177251807_0034_01_000001 and exit code: 1
- org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
- at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
- at org.apache.hadoop.util.Shell.run(Shell.java:418)
- at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
- at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
- at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
- at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
- at java.util.concurrent.FutureTask.run(FutureTask.java:262)
- at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
- at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
- at java.lang.Thread.run(Thread.java:744)
This problem has already been worked out online; one solution is to download and apply two patch packages, which is fairly tedious. After reading this article, http://blog.csdn.net/fansy1990/article/details/27526167 , I found its approach simpler and more convenient.
The root cause of the exception is that Linux and Windows use different environment-variable syntax: Windows uses %VAR% while Linux uses $VAR, which directly triggers the error above. The problem does not occur when Eclipse runs on Linux; it only shows up with Eclipse on Windows. So what we need to do is modify a few methods of org.apache.hadoop.mapred.YARNRunner to eliminate the exception.
The concrete step is to rewrite a few methods in the YARNRunner source (YARNRunner.java lives in the org.apache.hadoop.mapred package of the hadoop-mapreduce-client-jobclient Maven module). Create the same package and class name under src in your project, so that your copy overrides the class bundled in the original jar on the classpath.
Line 390 of YARNRunner.java (Apache Hadoop 2.2 source) builds the command that launches the ApplicationMaster:
- // Setup the command to run the AM
- List<String> vargs = new ArrayList<String>(8);
- vargs.add(Environment.JAVA_HOME.$() + "/bin/java");
Change this to:
- vargs.add("$JAVA_HOME/bin/java");
Also add a new path-conversion method to YARNRunner.java:
- /** Rewrite the Windows-style CLASSPATH built on the client into the Linux form expected by the cluster. */
- private void replaceEnvironment(Map<String, String> environment) {
-     String tmpClassPath = environment.get("CLASSPATH");
-     tmpClassPath = tmpClassPath.replaceAll(";", ":");                 // path separator ; -> :
-     tmpClassPath = tmpClassPath.replaceAll("%PWD%", "\\$PWD");        // %PWD% -> $PWD
-     tmpClassPath = tmpClassPath.replaceAll("%HADOOP_MAPRED_HOME%", "\\$HADOOP_MAPRED_HOME");
-     tmpClassPath = tmpClassPath.replaceAll("\\\\", "/");              // backslashes -> forward slashes
-     environment.put("CLASSPATH", tmpClassPath);
- }
Then, at line 466 of YARNRunner.java, once the environment map for the ApplicationMaster has been assembled, add a call to the new method:
- replaceEnvironment(environment);
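To make the effect of replaceEnvironment() concrete, here is a small standalone illustration; it is not part of Hadoop, and the shortened CLASSPATH value is hypothetical, only meant to show the before/after of the transformation:
- import java.util.HashMap;
- import java.util.Map;
-
- public class ReplaceEnvironmentDemo {
-     public static void main(String[] args) {
-         Map<String, String> environment = new HashMap<String, String>();
-         // a shortened, hypothetical CLASSPATH as a Windows client would build it
-         environment.put("CLASSPATH", "%PWD%;%HADOOP_MAPRED_HOME%\\share\\hadoop\\mapreduce\\*");
-
-         String tmpClassPath = environment.get("CLASSPATH");
-         tmpClassPath = tmpClassPath.replaceAll(";", ":");            // ; -> : path separator
-         tmpClassPath = tmpClassPath.replaceAll("%PWD%", "\\$PWD");   // %PWD% -> $PWD
-         tmpClassPath = tmpClassPath.replaceAll("%HADOOP_MAPRED_HOME%", "\\$HADOOP_MAPRED_HOME");
-         tmpClassPath = tmpClassPath.replaceAll("\\\\", "/");         // \ -> /
-         environment.put("CLASSPATH", tmpClassPath);
-
-         // prints: $PWD:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*
-         System.out.println(environment.get("CLASSPATH"));
-     }
- }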
After these changes the exception above goes away. For the distributed test my example is still the "hello world" of MapReduce, a small WordCount variant; note that the mapper expects every input line in the form word#count (it splits on '#' and parses the count), not arbitrary free text. The source is as follows:
- package com.qin.wordcount;
- import java.io.IOException;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- import org.apache.hadoop.io.IntWritable;
- import org.apache.hadoop.io.LongWritable;
- import org.apache.hadoop.io.Text;
- import org.apache.hadoop.mapred.JobConf;
- import org.apache.hadoop.mapred.YARNRunner;
- import org.apache.hadoop.mapreduce.Job;
- import org.apache.hadoop.mapreduce.Mapper;
- import org.apache.hadoop.mapreduce.Reducer;
- import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
- import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
- import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
- import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
- /***
- *
- * Fully distributed test on Hadoop 2.2.0
- * using the WordCount example
- *
- * @author qindongliang
- *
- * Hadoop QQ discussion group: 376932160
- *
- * */
- public class MyWordCount {
- /**
- * Mapper
- *
- * **/
- private static class WMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
- private IntWritable count=new IntWritable(1);
- private Text text=new Text();
- @Override
- protected void map(LongWritable key, Text value,Context context)
- throws IOException, InterruptedException {
- String values[]=value.toString().split("#");
- //System.out.println(values[0]+"========"+values[1]);
- count.set(Integer.parseInt(values[1]));
- text.set(values[0]);
- context.write(text,count);
- }
- }
- /**
- * Reducer
- *
- * **/
- private static class WReducer extends Reducer<Text, IntWritable, Text, Text>{
- private Text t=new Text();
- @Override
- protected void reduce(Text key, Iterable<IntWritable> value,Context context)
- throws IOException, InterruptedException {
- int count=0;
- for(IntWritable i:value){
- count+=i.get();
- }
- t.set(count+"");
- context.write(key,t);
- }
- }
- /**
- * Change 1:
- * (1) add the checkHadoopHome path in the Shell source
- * (2) line 974, in FileUtils
- * **/
- public static void main(String[] args) throws Exception{
- Configuration conf=new Configuration();
- conf.set("mapreduce.job.jar", "myjob.jar");
- conf.set("fs.defaultFS","hdfs://192.168.46.28:9000");
- conf.set("mapreduce.framework.name", "yarn");
- conf.set("yarn.resourcemanager.address", "192.168.46.28:8032");
- /** Job setup **/
- //Job job=new Job(conf, "testwordcount");// deprecated API
- Job job=Job.getInstance(conf, "new api");
- job.setJarByClass(MyWordCount.class);
- System.out.println("Mode: "+conf.get("mapreduce.jobtracker.address"));
- // job.setCombinerClass(PCombine.class);
- // job.setNumReduceTasks(3);// set to 3
- job.setMapperClass(WMapper.class);
- job.setReducerClass(WReducer.class);
- job.setInputFormatClass(TextInputFormat.class);
- job.setOutputFormatClass(TextOutputFormat.class);
- job.setMapOutputKeyClass(Text.class);
- job.setMapOutputValueClass(IntWritable.class);
- job.setOutputKeyClass(Text.class);
- job.setOutputValueClass(Text.class);
- String path="hdfs://192.168.46.28:9000/qin/output";
- FileSystem fs=FileSystem.get(conf);
- Path p=new Path(path);
- if(fs.exists(p)){
- fs.delete(p, true);
- System.out.println("Output path exists, deleted!");
- }
- FileInputFormat.setInputPaths(job, "hdfs://192.168.46.28:9000/qin/input");
- FileOutputFormat.setOutputPath(job,p );
- System.exit(job.waitForCompletion(true) ? 0 : 1);
- }
- }
When running, be sure to copy the cluster configuration files core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml into the src root of the project, and ideally also drop in a log4j.xml so the logs are easy to read. In addition, add the following property to mapred-site.xml:
- <property>
-   <name>mapred.remote.os</name>
-   <value>Linux</value>
-   <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
- </property>
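Because the mapred-site.xml placed under src is loaded into the client-side Configuration, the same setting can also be made directly in the driver. This is my own hedged equivalent, not from the original article, and assumes the property is read from the client-side job Configuration:
- // equivalent to the mapred-site.xml entry above (assumption: the property is
- // consumed from the client-side job Configuration)
- conf.set("mapred.remote.os", "Linux");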
Then package the project into a jar, run the driver and submit the job; my console output is as follows:
- Mode: hp1:8021
- Output path exists, deleted!
- INFO - RMProxy.createRMProxy(56) | Connecting to ResourceManager at /192.168.46.28:8032
- WARN - JobSubmitter.copyAndConfigureFiles(149) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
- INFO - FileInputFormat.listStatus(287) | Total input paths to process : 1
- INFO - JobSubmitter.submitJobInternal(394) | number of splits:1
- INFO - Configuration.warnOnceIfDeprecated(840) | user.name is deprecated. Instead, use mapreduce.job.user.name
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.jar is deprecated. Instead, use mapreduce.job.jar
- INFO - Configuration.warnOnceIfDeprecated(840) | fs.default.name is deprecated. Instead, use fs.defaultFS
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.job.name is deprecated. Instead, use mapreduce.job.name
- INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
- INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
- INFO - Configuration.warnOnceIfDeprecated(840) | mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
- INFO - JobSubmitter.printTokens(477) | Submitting tokens for job: job_1402492118962_0004
- INFO - YarnClientImpl.submitApplication(174) | Submitted application application_1402492118962_0004 to ResourceManager at /192.168.46.28:8032
- INFO - Job.submit(1272) | The url to track the job: http://hp1:8088/proxy/application_1402492118962_0004/
- INFO - Job.monitorAndPrintJob(1317) | Running job: job_1402492118962_0004
- INFO - Job.monitorAndPrintJob(1338) | Job job_1402492118962_0004 running in uber mode : false
- INFO - Job.monitorAndPrintJob(1345) | map 0% reduce 0%
- INFO - Job.monitorAndPrintJob(1345) | map 100% reduce 0%
- INFO - Job.monitorAndPrintJob(1345) | map 100% reduce 100%
- INFO - Job.monitorAndPrintJob(1356) | Job job_1402492118962_0004 completed successfully
- INFO - Job.monitorAndPrintJob(1363) | Counters: 43
- File System Counters
- FILE: Number of bytes read=58
- FILE: Number of bytes written=159667
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=147
- HDFS: Number of bytes written=27
- HDFS: Number of read operations=6
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Launched reduce tasks=1
- Data-local map tasks=1
- Total time spent by all maps in occupied slots (ms)=6155
- Total time spent by all reduces in occupied slots (ms)=4929
- Map-Reduce Framework
- Map input records=4
- Map output records=4
- Map output bytes=44
- Map output materialized bytes=58
- Input split bytes=109
- Combine input records=0
- Combine output records=0
- Reduce input groups=3
- Reduce shuffle bytes=58
- Reduce input records=4
- Reduce output records=3
- Spilled Records=8
- Shuffled Maps =1
- Failed Shuffles=0
- Merged Map outputs=1
- GC time elapsed (ms)=99
- CPU time spent (ms)=1060
- Physical memory (bytes) snapshot=309071872
- Virtual memory (bytes) snapshot=1680531456
- Total committed heap usage (bytes)=136450048
- Shuffle Errors
- BAD_ID=0
- CONNECTION=0
- IO_ERROR=0
- WRONG_LENGTH=0
- WRONG_MAP=0
- WRONG_REDUCE=0
- File Input Format Counters
- Bytes Read=38
- File Output Format Counters
- Bytes Written=27
The job also shows up as expected in the ResourceManager web UI on port 8088.
The WordCount output is correct as well. At this point, debugging Hadoop 2.2 against a distributed cluster from Eclipse has succeeded.
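One optional cleanup: the warning near the top of the console log ("Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner...") can be removed by wrapping the driver in a Tool. Below is a minimal sketch, assuming you move the job setup from MyWordCount.main() into run(); the wrapper class itself is my illustration, not part of the original code:
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.conf.Configured;
- import org.apache.hadoop.util.Tool;
- import org.apache.hadoop.util.ToolRunner;
-
- public class MyWordCountTool extends Configured implements Tool {
-
-     @Override
-     public int run(String[] args) throws Exception {
-         Configuration conf = getConf(); // already contains any -D options parsed by ToolRunner
-         // ... build and submit the same Job as in MyWordCount.main() above, using conf ...
-         return 0;
-     }
-
-     public static void main(String[] args) throws Exception {
-         // ToolRunner parses generic options (-D, -files, -libjars, ...) before calling run()
-         System.exit(ToolRunner.run(new Configuration(), new MyWordCountTool(), args));
-     }
- }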