Posted: 2014-06-10
In the previous post I walked through a single-node pseudo-distributed Hadoop deployment. In this one I will show how to debug Hadoop 2.2.0 from Eclipse. If you are still on a Hadoop 1.x release, no problem: I have covered debugging 1.x Hadoop programs from Eclipse in earlier posts. The biggest difference between the two is the Eclipse plugin; the Hadoop 2.x and 1.x APIs are not quite compatible, so the plugins differ as well, and you just need to use the one that matches your version.
Now let's get down to business. The environment:

<table class="bbcode">
<tr><td>No.</td><td>Item</td><td>Description</td></tr>
<tr><td>1</td><td>Eclipse</td><td>Juno Service Release (4.2)</td></tr>
<tr><td>2</td><td>Operating system</td><td>Windows 7</td></tr>
<tr><td>3</td><td>Hadoop Eclipse plugin</td><td>hadoop-eclipse-plugin-2.2.0.jar</td></tr>
<tr><td>4</td><td>Hadoop cluster</td><td>Single-node pseudo-distributed CentOS 6.5 in a Linux VM</td></tr>
<tr><td>5</td><td>Program to debug</td><td>Hello World</td></tr>
</table>

The problems I ran into are the following.

The first exception:

<pre name="code" class="java">java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.</pre>

The fix: hardcode your local Hadoop path into the return value of the checkHadoopHome() method of the org.apache.hadoop.util.Shell class. My change looks like this:

<pre name="code" class="java">private static String checkHadoopHome() {

    // first check the Dflag hadoop.home.dir with JVM scope
    //System.setProperty("hadoop.home.dir", "...");
    String home = System.getProperty("hadoop.home.dir");

    // fall back to the system/user-global env variable
    if (home == null) {
      home = System.getenv("HADOOP_HOME");
    }

    try {
      // couldn't find either setting for hadoop's home directory
      if (home == null) {
        throw new IOException("HADOOP_HOME or hadoop.home.dir are not set.");
      }

      if (home.startsWith("\"") && home.endsWith("\"")) {
        home = home.substring(1, home.length() - 1);
      }

      // check that the home setting is actually a directory that exists
      File homedir = new File(home);
      if (!homedir.isAbsolute() || !homedir.exists() || !homedir.isDirectory()) {
        throw new IOException("Hadoop home directory " + homedir
            + " does not exist, is not a directory, or is not an absolute path.");
      }

      home = homedir.getCanonicalPath();

    } catch (IOException ioe) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Failed to detect a valid hadoop home directory", ioe);
      }
      home = null;
    }

    // hardcode the local Hadoop path
    home = "D:\\hadoop-2.2.0";
    return home;
}</pre>
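If patching and recompiling Hadoop's Shell class feels too invasive, the commented-out line at the top of that method points at a gentler alternative: set the hadoop.home.dir system property (which checkHadoopHome() reads before falling back to HADOOP_HOME) at the very start of your driver. A minimal sketch, assuming the same D:\hadoop-2.2.0 layout as above; the class name here is just for illustration:

<pre name="code" class="java">public class HadoopHomeFix {

    public static void main(String[] args) throws Exception {
        // Must run before the first Hadoop class is loaded, because
        // Shell.checkHadoopHome() consults this property first.
        System.setProperty("hadoop.home.dir", "D:\\hadoop-2.2.0");

        // ... build and submit the MapReduce job as usual ...
    }
}</pre>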
The second exception:

<pre name="code" class="java">Could not locate executable D:\Hadoop\tar\hadoop-2.2.0\hadoop-2.2.0\bin\winutils.exe in the Hadoop binaries.</pre>

Here the Windows native executables cannot be found at all. Download the bin package from https://github.com/srccodes/hadoop-common-2.2.0-bin and overwrite the bin directory under your local Hadoop root with it.

The third exception:

<pre name="code" class="java">Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://192.168.130.54:19000/user/hmail/output/part-00000, expected: file:///
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
	at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
	at com.netease.hadoop.HDFSCatWithAPI.main(HDFSCatWithAPI.java:23)
</pre>

This exception usually means the HDFS path is misconfigured: the client has fallen back to the local file system (file:///) instead of HDFS. The fix is to copy core-site.xml and hdfs-site.xml from the cluster into the src root of the Eclipse project.
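Copying the two XML files works because they carry fs.defaultFS, the NameNode address, so the client stops falling back to file:///. If you would rather keep the project free of cluster config files, you can set the address on the Configuration directly. A minimal sketch, reusing the NameNode address from the stack trace above; adjust it to your own cluster:

<pre name="code" class="java">import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPathCheck {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at HDFS instead of the local file:/// default.
        // (fs.default.name is the deprecated 1.x spelling of this key.)
        conf.set("fs.defaultFS", "hdfs://192.168.130.54:19000");

        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/user/hmail/output/part-00000")));
    }
}</pre>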
The fourth exception:

<pre name="code" class="java">Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z</pre>

This one is usually caused by a misconfigured HADOOP_HOME environment variable. To be clear: to debug Hadoop 2.2 successfully from Eclipse on Windows, you need to add the following environment variables on your machine:

(1) Create a system variable named HADOOP_HOME with the value D:\hadoop-2.2.0, i.e. your local Hadoop directory.
(2) Append %HADOOP_HOME%\bin to the system Path variable.

Those are the problems I hit while testing. With each one treated accordingly, Eclipse can finally debug MR programs successfully. My Hello World source is as follows:

<pre name="code" class="java">package com.qin.wordcount;

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/***
 * A WordCount example for testing Hadoop 2.2.0.
 *
 * @author qindongliang
 *
 * Hadoop discussion QQ group: 376932160
 */
public class MyWordCount {

    /**
     * Mapper: each input line has the form "word#count".
     */
    private static class WMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private IntWritable count = new IntWritable(1);
        private Text text = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String values[] = value.toString().split("#");
            //System.out.println(values[0]+"========"+values[1]);
            count.set(Integer.parseInt(values[1]));
            text.set(values[0]);
            context.write(text, count);
        }
    }

    /**
     * Reducer: sums the counts for each word.
     */
    private static class WReducer extends Reducer<Text, IntWritable, Text, Text> {

        private Text t = new Text();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> value, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            for (IntWritable i : value) {
                count += i.get();
            }
            t.set(count + "");
            context.write(key, t);
        }
    }

    /**
     * Changes made:
     * (1) hardcoded the checkHadoopHome() path in the Shell source
     * (2) line 974, inside FileUtils
     */
    public static void main(String[] args) throws Exception {

        // String path1 = System.getenv("HADOOP_HOME");
        // System.out.println(path1);
        // System.exit(0);

        JobConf conf = new JobConf(MyWordCount.class);
        // Configuration conf = new Configuration();
        // conf.set("mapred.job.tracker", "192.168.75.130:9001");
        // read the data fields from person
        // conf.setJar("tt.jar"); // note: this line must come first, for initialization, otherwise it will throw

        /** The job **/
        Job job = new Job(conf, "testwordcount");
        job.setJarByClass(MyWordCount.class);
        System.out.println("mode: " + conf.get("mapred.job.tracker"));

        // job.setCombinerClass(PCombine.class);
        // job.setNumReduceTasks(3); // set to 3

        job.setMapperClass(WMapper.class);
        job.setReducerClass(WReducer.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        String path = "hdfs://192.168.46.28:9000/qin/output";
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path(path);
        if (fs.exists(p)) {
            fs.delete(p, true);
            System.out.println("Output path exists; deleted!");
        }
        FileInputFormat.setInputPaths(job, "hdfs://192.168.46.28:9000/qin/input");
        FileOutputFormat.setOutputPath(job, p);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}</pre>
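A side note on the driver above: the log below contains the warning "Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this." A sketch of how the same driver could be restructured with ToolRunner; the class name and the elided setup are my own illustration, not part of the original program:

<pre name="code" class="java">import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyWordCountDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already has the generic options (-D, -fs, -jt, ...) applied.
        Job job = new Job(getConf(), "testwordcount");
        job.setJarByClass(MyWordCountDriver.class);
        // ... same mapper/reducer/format/path setup as in MyWordCount ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic Hadoop options before invoking run().
        System.exit(ToolRunner.run(new Configuration(), new MyWordCountDriver(), args));
    }
}</pre>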
The console log output is as follows:

<pre name="code" class="java">INFO - Configuration.warnOnceIfDeprecated(840) | mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
mode: local
Output path exists; deleted!
INFO - Configuration.warnOnceIfDeprecated(840) | session.id is deprecated. Instead, use dfs.metrics.session-id
INFO - JvmMetrics.init(76) | Initializing JVM Metrics with processName=JobTracker, sessionId=
WARN - JobSubmitter.copyAndConfigureFiles(149) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN - JobSubmitter.copyAndConfigureFiles(258) | No job jar file set. User classes may not be found. See Job or Job#setJar(String).
INFO - FileInputFormat.listStatus(287) | Total input paths to process : 1
INFO - JobSubmitter.submitJobInternal(394) | number of splits:1
INFO - Configuration.warnOnceIfDeprecated(840) | user.name is deprecated. Instead, use mapreduce.job.user.name
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.job.name is deprecated. Instead, use mapreduce.job.name
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
INFO - JobSubmitter.printTokens(477) | Submitting tokens for job: job_local1181216011_0001
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/staging/qindongliang1181216011/.staging/job_local1181216011_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/staging/qindongliang1181216011/.staging/job_local1181216011_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/local/localRunner/qindongliang/job_local1181216011_0001/job_local1181216011_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/local/localRunner/qindongliang/job_local1181216011_0001/job_local1181216011_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
INFO - Job.submit(1272) | The url to track the job: http://localhost:8080/
INFO - Job.monitorAndPrintJob(1317) | Running job: job_local1181216011_0001
INFO - LocalJobRunner$Job.createOutputCommitter(323) | OutputCommitter set in config null
INFO - LocalJobRunner$Job.createOutputCommitter(341) | OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
INFO - LocalJobRunner$Job.run(389) | Waiting for map tasks
INFO - LocalJobRunner$Job$MapTaskRunnable.run(216) | Starting task: attempt_local1181216011_0001_m_000000_0
INFO - ProcfsBasedProcessTree.isAvailable(129) | ProcfsBasedProcessTree currently is supported only on Linux.
INFO - Task.initialize(581) | Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39550640
INFO - MapTask.runNewMapper(732) | Processing split: hdfs://192.168.46.28:9000/qin/input/test.txt:0+38
INFO - MapTask.createSortingCollector(387) | Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO - MapTask$MapOutputBuffer.setEquator(1183) | (EQUATOR) 0 kvi 26214396(104857584)
INFO - MapTask$MapOutputBuffer.init(975) | mapreduce.task.io.sort.mb: 100
INFO - MapTask$MapOutputBuffer.init(976) | soft limit at 83886080
INFO - MapTask$MapOutputBuffer.init(977) | bufstart = 0; bufvoid = 104857600
INFO - MapTask$MapOutputBuffer.init(978) | kvstart = 26214396; length = 6553600
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - MapTask$MapOutputBuffer.flush(1440) | Starting flush of map output
INFO - MapTask$MapOutputBuffer.flush(1459) | Spilling map output
INFO - MapTask$MapOutputBuffer.flush(1460) | bufstart = 0; bufend = 44; bufvoid = 104857600
INFO - MapTask$MapOutputBuffer.flush(1462) | kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
INFO - MapTask$MapOutputBuffer.sortAndSpill(1648) | Finished spill 0
INFO - Task.done(995) | Task:attempt_local1181216011_0001_m_000000_0 is done. And is in the process of committing
INFO - LocalJobRunner$Job.statusUpdate(513) | map
INFO - Task.sendDone(1115) | Task 'attempt_local1181216011_0001_m_000000_0' done.
INFO - LocalJobRunner$Job$MapTaskRunnable.run(241) | Finishing task: attempt_local1181216011_0001_m_000000_0
INFO - LocalJobRunner$Job.run(397) | Map task executor complete.
INFO - ProcfsBasedProcessTree.isAvailable(129) | ProcfsBasedProcessTree currently is supported only on Linux.
INFO - Task.initialize(581) | Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@68843e7b
INFO - Merger$MergeQueue.merge(568) | Merging 1 sorted segments
INFO - Merger$MergeQueue.merge(667) | Down to the last merge-pass, with 1 segments left of total size: 45 bytes
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
INFO - Task.done(995) | Task:attempt_local1181216011_0001_r_000000_0 is done. And is in the process of committing
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - Task.commit(1156) | Task attempt_local1181216011_0001_r_000000_0 is allowed to commit now
INFO - FileOutputCommitter.commitTask(439) | Saved output of task 'attempt_local1181216011_0001_r_000000_0' to hdfs://192.168.46.28:9000/qin/output/_temporary/0/task_local1181216011_0001_r_000000
INFO - LocalJobRunner$Job.statusUpdate(513) | reduce > reduce
INFO - Task.sendDone(1115) | Task 'attempt_local1181216011_0001_r_000000_0' done.
INFO - Job.monitorAndPrintJob(1338) | Job job_local1181216011_0001 running in uber mode : false
INFO - Job.monitorAndPrintJob(1345) | map 100% reduce 100%
INFO - Job.monitorAndPrintJob(1356) | Job job_local1181216011_0001 completed successfully
INFO - Job.monitorAndPrintJob(1363) | Counters: 32
	File System Counters
		FILE: Number of bytes read=372
		FILE: Number of bytes written=382174
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=76
		HDFS: Number of bytes written=27
		HDFS: Number of read operations=17
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=6
	Map-Reduce Framework
		Map input records=4
		Map output records=4
		Map output bytes=44
		Map output materialized bytes=58
		Input split bytes=109
		Combine input records=0
		Combine output records=0
		Reduce input groups=3
		Reduce shuffle bytes=0
		Reduce input records=4
		Reduce output records=3
		Spilled Records=8
		Shuffled Maps =0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=0
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=532938752
	File Input Format Counters
		Bytes Read=38
	File Output Format Counters
		Bytes Written=27
</pre>
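Note the "mode: local" line and the job_local... IDs: since mapred.job.tracker was never set, the job ran inside Eclipse via LocalJobRunner and merely read from and wrote to the remote HDFS. A sketch of the settings that would submit to the cluster itself on Hadoop 2.x/YARN; the host and port values here are assumptions, not taken from this post:

<pre name="code" class="java">import org.apache.hadoop.conf.Configuration;

public class ClusterSubmitSketch {

    public static Configuration clusterConf() {
        Configuration conf = new Configuration();
        // Hypothetical values; they must match your own cluster.
        conf.set("fs.defaultFS", "hdfs://192.168.46.28:9000");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "192.168.46.28:8032");
        // A job jar is also needed, e.g. job.setJar("wordcount.jar"),
        // which the "No job jar file set" warning above is pointing at.
        return conf;
    }
}</pre>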
The test input data:

<pre name="code" class="java">中国#1
美国#2
英国#3
中国#2</pre>

And the output:

<pre name="code" class="java">中国	3
美国	2
英国	3</pre>

At this point we have successfully debugged Hadoop remotely from Eclipse. While debugging, keep the problems described above in mind; if you run into one of them, apply the corresponding fix.
Posted: 2014-06-11
Will give it a try when I get some time.