MapReduce Rocks (3): Implementing WritableComparable for a Custom Key to Get a Two-Level (Secondary) Sort

EclipseEye

Blog category: Hadoop/MapReduce


package sort;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

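/**
 * Sorts two-column numeric input by the first column descending and,
 * within equal first columns, by the second column ascending.
 */
public class SortText {
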
	private static final String INPUT_PATH = "hdfs://hadoop.master:9000/data1";
	private static final String OUTPUT_PATH = "hdfs://hadoop.master:9000/outSort";

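	public static void main(String[] args) throws Exception {
		Configuration conf = new Configuration();

		// Delete any previous output directory; the job refuses to start if it exists.
		// The FileSystem is resolved from the path itself so the hdfs:// URI is
		// honored even when the default file system points elsewhere.
		Path outputPath = new Path(OUTPUT_PATH);
		FileSystem fileSystem = outputPath.getFileSystem(conf);
		if (fileSystem.exists(outputPath)) {
			fileSystem.delete(outputPath, true);
		}

		// Job.getInstance replaces the deprecated new Job(conf, name) constructor (Hadoop 2.x).
		Job job = Job.getInstance(conf, SortText.class.getName());
		job.setJarByClass(SortText.class);

		job.setInputFormatClass(TextInputFormat.class);
		FileInputFormat.setInputPaths(job, new Path(INPUT_PATH));

		// MyKey is the map output key, so the shuffle sorts records with MyKey.compareTo.
		job.setMapperClass(MyMapper.class);
		job.setMapOutputKeyClass(MyKey.class);
		job.setMapOutputValueClass(LongWritable.class);

		// A single reducer produces one globally sorted output file.
		job.setPartitionerClass(HashPartitioner.class);
		job.setNumReduceTasks(1);

		// No grouping comparator is set, so each distinct (k, v) pair is its own reduce group.
//		job.setGroupingComparatorClass(cls);

		job.setReducerClass(MyReducer.class);
		job.setOutputKeyClass(LongWritable.class);
		job.setOutputValueClass(LongWritable.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		FileOutputFormat.setOutputPath(job, outputPath);

		// Report job success or failure through the exit code.
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}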

	// Parses each tab-separated line "k<TAB>v" and emits both columns as the key,
	// so the shuffle can sort on the pair.
	static class MyMapper extends
			Mapper<LongWritable, Text, MyKey, LongWritable> {
		@Override
		protected void map(LongWritable key, Text value, Context context)
				throws IOException, InterruptedException {

			String[] split = value.toString().split("\t");
			long k = Long.parseLong(split[0]);
			long v = Long.parseLong(split[1]);
			context.write(new MyKey(k, v), new LongWritable(v));
		}
	}

	// Keys arrive already sorted; each distinct (k, v) pair is written once.
	static class MyReducer extends
			Reducer<MyKey, LongWritable, LongWritable, LongWritable> {
		@Override
		protected void reduce(MyKey key, Iterable<LongWritable> values, Context context)
				throws IOException, InterruptedException {

			// Enable the loop to write one line per input record instead of one per distinct key.
//			for (LongWritable w : values) {
				context.write(new LongWritable(key.k), new LongWritable(key.v));
//			}
		}
	}

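	// Composite key: both columns travel in the key so the shuffle can sort on them.
	static class MyKey implements WritableComparable<MyKey> {

		long k; // first column
		long v; // second column

		// Hadoop creates the key reflectively before calling readFields,
		// so a no-argument constructor is required.
		public MyKey() {
		}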

		public MyKey(long k, long v) {
			this.k = k;
			this.v = v;
		}

		@Override
		public void write(DataOutput out) throws IOException {
			out.writeLong(k);
			out.writeLong(v);
		}

		@Override
		public void readFields(DataInput in) throws IOException {
			this.k = in.readLong();
			this.v = in.readLong();
		}

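		@Override
		public int compareTo(MyKey o) {
			// Long.compare avoids the overflow and truncation that come from
			// casting a long difference to int.
			if (this.k == o.k) {
				return Long.compare(this.v, o.v); // v ascending
			} else {
				return Long.compare(o.k, this.k); // k descending
			}
		}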

		@Override
		public int hashCode() {
			final int prime = 31;
			int result = 1;
			result = prime * result + (int) (k ^ (k >>> 32));
			result = prime * result + (int) (v ^ (v >>> 32));
			return result;
		}

		@Override
		public boolean equals(Object obj) {
			if (this == obj)
				return true;
			if (obj == null)
				return false;
			if (getClass() != obj.getClass())
				return false;
			MyKey other = (MyKey) obj;
			if (k != other.k)
				return false;
			if (v != other.v)
				return false;
			return true;
		}

	}

}
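
The ordering that MyKey defines can be checked without a cluster. The following is a minimal sketch, not part of the original job: `KeyOrderDemo`, `Pair`, `sortedLines` and `compareBySubtraction` are hypothetical names introduced here, and `Pair` is a plain-Java stand-in that assumes the intended rule is "first column descending, then second column ascending". It also shows why casting a `long` difference to `int` is unsafe in a comparator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class KeyOrderDemo {

    // Hypothetical plain-Java stand-in for MyKey: same two fields, same ordering rule.
    static final class Pair implements Comparable<Pair> {
        final long k;
        final long v;

        Pair(long k, long v) {
            this.k = k;
            this.v = v;
        }

        @Override
        public int compareTo(Pair o) {
            if (k != o.k) {
                return Long.compare(o.k, k); // first column, descending
            }
            return Long.compare(v, o.v);     // second column, ascending
        }

        @Override
        public String toString() {
            return k + ":" + v;
        }
    }

    // Sorts rows of {k, v} with the same comparison the shuffle would apply to the keys.
    static List<String> sortedLines(long[][] rows) {
        List<Pair> pairs = new ArrayList<>();
        for (long[] row : rows) {
            pairs.add(new Pair(row[0], row[1]));
        }
        Collections.sort(pairs);
        List<String> lines = new ArrayList<>();
        for (Pair p : pairs) {
            lines.add(p.toString());
        }
        return lines;
    }

    // The subtract-and-cast idiom, kept only to show where it goes wrong.
    static int compareBySubtraction(long a, long b) {
        return (int) (a - b);
    }

    public static void main(String[] args) {
        long[][] rows = {{1, 2}, {3, 3}, {3, 1}, {2, 1}, {3, 2}};
        System.out.println(sortedLines(rows)); // [3:1, 3:2, 3:3, 2:1, 1:2]

        // 2^31 - 0 truncates to Integer.MIN_VALUE, so the larger value looks smaller.
        System.out.println(compareBySubtraction(2147483648L, 0L) < 0);  // true
        // 2^32 - 0 truncates to 0, so two different values look equal.
        System.out.println(compareBySubtraction(4294967296L, 0L) == 0); // true
        System.out.println(Long.compare(4294967296L, 0L) > 0);          // true
    }
}
```

With small values both idioms agree, which is why the subtraction form often goes unnoticed until the columns hold values that differ by 2^31 or more.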

2015-03-12 08:57