Posted: 2014-06-10
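In the previous post I covered deploying Hadoop in single-node pseudo-distributed mode. This post explains how to debug Hadoop 2.2.0 from Eclipse. If you are still on Hadoop 1.x, that is fine too: an earlier post on this blog covers debugging Hadoop 1.x programs in Eclipse. The biggest difference between the two is the Eclipse plugin. The Hadoop 2.x and 1.x APIs are not fully compatible, so each has its own plugin, and you only need to use the one that matches your version.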
Now to the main topic: <table class="bbcode"><tr><td>No.</td><td>Item</td><td>Description</td></tr><tr><td>1</td><td>Eclipse</td><td>Juno Service Release, version 4.2</td></tr><tr><td>2</td><td>Operating system</td><td>Windows 7</td></tr><tr><td>3</td><td>Hadoop Eclipse plugin</td><td>hadoop-eclipse-plugin-2.2.0.jar</td></tr><tr><td>4</td><td>Hadoop cluster environment</td><td>CentOS 6.5 Linux virtual machine, single-node pseudo-distributed</td></tr><tr><td>5</td><td>Program to debug</td><td>Hello World</td></tr></table>
The problems I ran into are as follows. First exception: <pre name="code" class="java">java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. </pre>
The "null" in the path means that neither the hadoop.home.dir system property nor the HADOOP_HOME environment variable was visible to the JVM. Solution: in the checkHadoopHome() method of the org.apache.hadoop.util.Shell class, hard-code the local Hadoop path as the return value. My modified version is as follows: <pre name="code" class="java">
private static String checkHadoopHome() {

    // first check the Dflag hadoop.home.dir with JVM scope
    //System.setProperty("hadoop.home.dir", "...");
    String home = System.getProperty("hadoop.home.dir");

    // fall back to the system/user-global env variable
    if (home == null) {
        home = System.getenv("HADOOP_HOME");
    }

    try {
        // couldn't find either setting for hadoop's home directory
        if (home == null) {
            throw new IOException("HADOOP_HOME or hadoop.home.dir are not set.");
        }

        if (home.startsWith("\"") && home.endsWith("\"")) {
            home = home.substring(1, home.length()-1);
        }

        // check that the home setting is actually a directory that exists
        File homedir = new File(home);
        if (!homedir.isAbsolute() || !homedir.exists() || !homedir.isDirectory()) {
            throw new IOException("Hadoop home directory " + homedir
                + " does not exist, is not a directory, or is not an absolute path.");
        }

        home = homedir.getCanonicalPath();

    } catch (IOException ioe) {
        if (LOG.isDebugEnabled()) {
            LOG.debug("Failed to detect a valid hadoop home directory", ioe);
        }
        home = null;
    }

    // hard-code the local Hadoop path
    home = "D:\\hadoop-2.2.0";
    return home;
}</pre>
Second exception: Could not locate executable D:\Hadoop\tar\hadoop-2.2.0\hadoop-2.2.0\bin\winutils.exe in the Hadoop binaries.
The Windows executable cannot be found. Download the bin package from https://github.com/srccodes/hadoop-common-2.2.0-bin and use it to overwrite the bin directory under the root of the local Hadoop directory.
Third exception: <pre name="code" class="java">Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://192.168.130.54:19000/user/hmail/output/part-00000, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at com.netease.hadoop.HDFSCatWithAPI.main(HDFSCatWithAPI.java:23)
</pre>
This exception usually means the HDFS path is being handled by the wrong file system: the configuration did not pick up the cluster settings, so the default file system is still the local file:///. To fix it, copy core-site.xml and hdfs-site.xml from the cluster into the src root of the Eclipse project, which puts them on the classpath.
Fourth exception: <pre name="code" class="java">Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
</pre>
This exception is usually caused by a misconfigured HADOOP_HOME environment variable. One point deserves special mention: to debug Hadoop 2.2 successfully from Eclipse on Windows, you need to add the following environment variables on the local machine:
(1) In the system variables, create a HADOOP_HOME variable with the value D:\hadoop-2.2.0, that is, the local Hadoop directory.
(2) Append %HADOOP_HOME%\bin to the system Path variable.
These are the problems I hit during testing. With each one fixed, Eclipse can finally debug MR programs. My Hello World source code is as follows: <pre name="code" class="java">package com.qin.wordcount;

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import
org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Hadoop 2.2.0 test:
 * a WordCount-style example.
 *
 * @author qindongliang
 *
 * Hadoop technical discussion group: 376932160
 */
public class MyWordCount {

    /**
     * Mapper
     */
    private static class WMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private IntWritable count = new IntWritable(1);
        private Text text = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // each input line has the form word#count
            String values[] = value.toString().split("#");
            //System.out.println(values[0]+"========"+values[1]);
            count.set(Integer.parseInt(values[1]));
            text.set(values[0]);
            context.write(text, count);
        }
    }

    /**
     * Reducer
     */
    private static class WReducer extends Reducer<Text, IntWritable, Text, Text> {

        private Text t = new Text();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> value, Context context)
                throws IOException, InterruptedException {
            // sum all counts emitted for this key
            int count = 0;
            for (IntWritable i : value) {
                count += i.get();
            }
            t.set(count + "");
            context.write(key, t);
        }
    }

    /**
     * Change 1
     * (1) add the checkHadoopHome path in the Shell source
     * (2) line 974, in FileUtils
     */
    public static void main(String[] args) throws Exception {
        // String path1=System.getenv("HADOOP_HOME");
        // System.out.println(path1);
        // System.exit(0);

        JobConf conf = new JobConf(MyWordCount.class);
        //Configuration conf=new Configuration();
        //conf.set("mapred.job.tracker","192.168.75.130:9001");
        // read the data fields in person
        // conf.setJar("tt.jar");
        // note: put this line first for initialization, otherwise an error is reported

        /** the Job **/
        Job job = new Job(conf, "testwordcount");
        job.setJarByClass(MyWordCount.class);

        // prints the run mode ("模式" means "mode")
        System.out.println("模式: " + conf.get("mapred.job.tracker"));
        // job.setCombinerClass(PCombine.class);
        // job.setNumReduceTasks(3); // set to 3
        job.setMapperClass(WMapper.class);
        job.setReducerClass(WReducer.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        String path = "hdfs://192.168.46.28:9000/qin/output";
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path(path);
        // remove a stale output directory, otherwise the job refuses to start
        if (fs.exists(p)) {
            fs.delete(p, true);
            // message: "output path exists, deleted!"
            System.out.println("输出路径存在,已删除!");
        }
        FileInputFormat.setInputPaths(job, "hdfs://192.168.46.28:9000/qin/input");
        FileOutputFormat.setOutputPath(job, p);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
</pre>
The console prints the following log: <pre name="code" class="java">INFO - Configuration.warnOnceIfDeprecated(840) | mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
模式: local
输出路径存在,已删除!
INFO - Configuration.warnOnceIfDeprecated(840) | session.id is deprecated. Instead, use dfs.metrics.session-id
INFO - JvmMetrics.init(76) | Initializing JVM Metrics with processName=JobTracker, sessionId=
WARN - JobSubmitter.copyAndConfigureFiles(149) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN - JobSubmitter.copyAndConfigureFiles(258) | No job jar file set. User classes may not be found. See Job or Job#setJar(String).
INFO - FileInputFormat.listStatus(287) | Total input paths to process : 1
INFO - JobSubmitter.submitJobInternal(394) | number of splits:1
INFO - Configuration.warnOnceIfDeprecated(840) | user.name is deprecated. Instead, use mapreduce.job.user.name
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.job.name is deprecated.
Instead, use mapreduce.job.name
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
INFO - Configuration.warnOnceIfDeprecated(840) | mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
INFO - JobSubmitter.printTokens(477) | Submitting tokens for job: job_local1181216011_0001
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/staging/qindongliang1181216011/.staging/job_local1181216011_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/staging/qindongliang1181216011/.staging/job_local1181216011_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/local/localRunner/qindongliang/job_local1181216011_0001/job_local1181216011_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
WARN - Configuration.loadProperty(2172) | file:/root/hadoop/tmp/mapred/local/localRunner/qindongliang/job_local1181216011_0001/job_local1181216011_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
INFO - Job.submit(1272) | The url to track the job: http://localhost:8080/
INFO - Job.monitorAndPrintJob(1317) | Running job: job_local1181216011_0001
INFO - LocalJobRunner$Job.createOutputCommitter(323) | OutputCommitter set in config null
INFO - LocalJobRunner$Job.createOutputCommitter(341) | OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
INFO - LocalJobRunner$Job.run(389) | Waiting for map tasks
INFO - LocalJobRunner$Job$MapTaskRunnable.run(216) | Starting task: attempt_local1181216011_0001_m_000000_0
INFO - ProcfsBasedProcessTree.isAvailable(129) | ProcfsBasedProcessTree currently is supported only on Linux.
INFO - Task.initialize(581) | Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39550640
INFO - MapTask.runNewMapper(732) | Processing split: hdfs://192.168.46.28:9000/qin/input/test.txt:0+38
INFO - MapTask.createSortingCollector(387) | Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO - MapTask$MapOutputBuffer.setEquator(1183) | (EQUATOR) 0 kvi 26214396(104857584)
INFO - MapTask$MapOutputBuffer.init(975) | mapreduce.task.io.sort.mb: 100
INFO - MapTask$MapOutputBuffer.init(976) | soft limit at 83886080
INFO - MapTask$MapOutputBuffer.init(977) | bufstart = 0; bufvoid = 104857600
INFO - MapTask$MapOutputBuffer.init(978) | kvstart = 26214396; length = 6553600
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - MapTask$MapOutputBuffer.flush(1440) | Starting flush of map output
INFO - MapTask$MapOutputBuffer.flush(1459) | Spilling map output
INFO - MapTask$MapOutputBuffer.flush(1460) | bufstart = 0; bufend = 44; bufvoid = 104857600
INFO - MapTask$MapOutputBuffer.flush(1462) | kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
INFO - MapTask$MapOutputBuffer.sortAndSpill(1648) | Finished spill 0
INFO - Task.done(995) | Task:attempt_local1181216011_0001_m_000000_0 is done. And is in the process of committing
INFO - LocalJobRunner$Job.statusUpdate(513) | map
INFO - Task.sendDone(1115) | Task 'attempt_local1181216011_0001_m_000000_0' done.
INFO - LocalJobRunner$Job$MapTaskRunnable.run(241) | Finishing task: attempt_local1181216011_0001_m_000000_0
INFO - LocalJobRunner$Job.run(397) | Map task executor complete.
INFO - ProcfsBasedProcessTree.isAvailable(129) | ProcfsBasedProcessTree currently is supported only on Linux.
INFO - Task.initialize(581) | Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@68843e7b
INFO - Merger$MergeQueue.merge(568) | Merging 1 sorted segments
INFO - Merger$MergeQueue.merge(667) | Down to the last merge-pass, with 1 segments left of total size: 45 bytes
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - Configuration.warnOnceIfDeprecated(840) | mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
INFO - Task.done(995) | Task:attempt_local1181216011_0001_r_000000_0 is done. And is in the process of committing
INFO - LocalJobRunner$Job.statusUpdate(513) |
INFO - Task.commit(1156) | Task attempt_local1181216011_0001_r_000000_0 is allowed to commit now
INFO - FileOutputCommitter.commitTask(439) | Saved output of task 'attempt_local1181216011_0001_r_000000_0' to hdfs://192.168.46.28:9000/qin/output/_temporary/0/task_local1181216011_0001_r_000000
INFO - LocalJobRunner$Job.statusUpdate(513) | reduce > reduce
INFO - Task.sendDone(1115) | Task 'attempt_local1181216011_0001_r_000000_0' done.
INFO - Job.monitorAndPrintJob(1338) | Job job_local1181216011_0001 running in uber mode : false
INFO - Job.monitorAndPrintJob(1345) | map 100% reduce 100%
INFO - Job.monitorAndPrintJob(1356) | Job job_local1181216011_0001 completed successfully
INFO - Job.monitorAndPrintJob(1363) | Counters: 32
    File System Counters
        FILE: Number of bytes read=372
        FILE: Number of bytes written=382174
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=76
        HDFS: Number of bytes written=27
        HDFS: Number of read operations=17
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=6
    Map-Reduce Framework
        Map input records=4
        Map output records=4
        Map output bytes=44
        Map output materialized bytes=58
        Input split bytes=109
        Combine input records=0
        Combine output records=0
        Reduce input groups=3
        Reduce shuffle bytes=0
        Reduce input records=4
        Reduce output records=3
        Spilled Records=8
        Shuffled Maps =0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=532938752
    File Input Format Counters
        Bytes Read=38
    File Output Format Counters
        Bytes Written=27
</pre>
The input test data is as follows: <pre name="code" class="java">中国#1
美国#2
英国#3
中国#2</pre>
The output is as follows: <pre name="code" class="java">中国 3
美国 2
英国 3
</pre>
With that, we have successfully debugged a Hadoop program from Eclipse against the remote cluster. As the log shows (mode local, job ID job_local1181216011_0001), the job itself runs in the local job runner inside Eclipse and reads from and writes to HDFS on the remote machine. If you run into any of the problems described above while debugging, apply the corresponding fix.
Notice: the copyright of ITeye articles belongs to the author and is protected by law. Reproduction without the author's written permission is prohibited.
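To make the relation between the sample input and the output easier to follow, the map and reduce logic of MyWordCount can be traced in plain Java, without any Hadoop dependency. This is only a minimal sketch: the class name LocalWordCount and the method count are hypothetical and not part of the original post, and malformed lines are skipped here, whereas the original mapper does not guard against them.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original post: it mirrors the map and
// reduce logic of MyWordCount in plain Java so it runs without Hadoop.
class LocalWordCount {

    // For each line of the form word#count, add count to the total for word.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String line : lines) {
            // "map" step: split the line into key and value
            String[] parts = line.split("#");
            if (parts.length != 2) {
                continue; // assumption: skip malformed lines instead of failing
            }
            int n = Integer.parseInt(parts[1]);
            // "reduce" step: sum the values that share the same key
            totals.merge(parts[0], n, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // the sample input from the post
        List<String> input = Arrays.asList("中国#1", "美国#2", "英国#3", "中国#2");
        for (Map.Entry<String, Integer> e : count(input).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

On the sample input the totals are 3, 2 and 3, the same as the job output above. Note that Hadoop sorts the keys before the reduce phase, while this sketch simply keeps them in first-seen order.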
Posted: 2014-06-11
I'll give it a try when I have time.