Hadoop: read a local file, process it, and write the result back to local disk

king_tt



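A few days ago I wrote a helper class for Hadoop file system operations. Today let's put it to real use: read a file from the local file system, run a computation on it, and write the result back to local disk.

Enough talk, straight to the code: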

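import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class mywordcount {

    // Mapper: splits each input line on whitespace and emits (word, 1).
    public static class wordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums the counts for each word.
    public static class wordcountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : values) {
                sum += count.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String args[]) throws Exception {
        // Two temporary HDFS directories. A random value plus the file name
        // could be used here to make name clashes very unlikely.
        String dstFile = "temp_src"; // upload destination, used as the job input
        String srcFile = "temp_dst"; // job output, used as the download source

        // HDFS_File is the file-operation helper class from the earlier post.
        HDFS_File file = new HDFS_File();
        Configuration conf = new Configuration();

        // Upload from the local file system to HDFS; args[1] may be a file or a directory.
        file.PutFile(conf, args[1], dstFile);
        System.out.println("up ok");

        Job job = new Job(conf, "mywordcount");
        job.setJarByClass(mywordcount.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(wordcountMapper.class);
        job.setReducerClass(wordcountReduce.class);
        job.setCombinerClass(wordcountReduce.class);

        // Note: both the input and the output must be files or directories on HDFS.
        FileInputFormat.setInputPaths(job, new Path(dstFile));
        FileOutputFormat.setOutputPath(job, new Path(srcFile));

        // Run the job and wait for it to finish.
        job.waitForCompletion(true);

        // Fetch the result from HDFS and save it locally at args[2].
        file.GetFile(conf, srcFile, args[2]);
        System.out.println("down ok");

        // Delete the temporary files or directories.
        file.DelFile(conf, dstFile, true);
        file.DelFile(conf, srcFile, true);
        System.out.println("del ok");
    }
}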
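The first comment in main suggests combining a random value with the file name so that the temporary directories rarely collide, but the code uses fixed names. Below is a minimal sketch of one way to do that, assuming a random UUID is an acceptable source of randomness. The class `TempNames` and its method `unique` are hypothetical names introduced here for illustration; they are not part of the original code or of Hadoop.

```java
import java.util.UUID;

// Hypothetical helper (not from the original post): builds a temporary
// directory name from a base name plus a random suffix.
final class TempNames {
    private TempNames() {}

    static String unique(String base) {
        // A random UUID (32 hex digits once the dashes are removed) makes an
        // accidental clash between two runs very unlikely.
        return base + "_" + UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        String dstFile = unique("temp_src"); // would serve as the HDFS input directory
        String srcFile = unique("temp_dst"); // would serve as the HDFS output directory
        System.out.println(dstFile);
        System.out.println(srcFile);
    }
}
```

With names like these, two runs of the job started at the same time would most likely not overwrite or delete each other's temporary directories.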

One last thing to note: when running the command, give file and directory paths as absolute paths to avoid errors.
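If a relative path might still be passed in, one option is to resolve it before use. The following is only a sketch, assuming the advice applies to the local paths given on the command line. `AbsolutePaths` and `toAbsolute` are hypothetical names, not part of the original code.

```java
import java.nio.file.Paths;

// Hypothetical helper (not from the original post): turns a possibly
// relative local path into an absolute, normalized one.
final class AbsolutePaths {
    private AbsolutePaths() {}

    static String toAbsolute(String path) {
        // Resolves against the current working directory, then removes
        // "." and ".." segments.
        return Paths.get(path).toAbsolutePath().normalize().toString();
    }

    public static void main(String[] args) {
        // Prints the absolute form of every path given on the command line.
        for (String arg : args) {
            System.out.println(toAbsolute(arg));
        }
    }
}
```

The local input and output arguments could be passed through `toAbsolute` before they are handed to the upload and download calls.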


2011-03-07 22:26