org.apache.hadoop.io.WritableComparable
@Public
@Stable
A Writable which is also Comparable.
WritableComparables can be compared to each other, typically via Comparators. Any type which is to be used as a key in the Hadoop Map-Reduce framework should implement this interface.
Note that hashCode() is frequently used in Hadoop to partition keys. It's important that your implementation of hashCode() returns the same result across different instances of the JVM. Note also that the default hashCode() implementation in Object does not satisfy this property.
Example:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class MyWritableComparable implements WritableComparable<MyWritableComparable> {
  // Some data
  private int counter;
  private long timestamp;

  public void write(DataOutput out) throws IOException {
    out.writeInt(counter);
    out.writeLong(timestamp);
  }

  public void readFields(DataInput in) throws IOException {
    counter = in.readInt();
    timestamp = in.readLong();
  }

  public int compareTo(MyWritableComparable o) {
    int thisValue = this.counter;
    int thatValue = o.counter;
    return (thisValue < thatValue ? -1 : (thisValue == thatValue ? 0 : 1));
  }

  public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + counter;
    result = prime * result + (int) (timestamp ^ (timestamp >>> 32));
    return result;
  }
}
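To see why a stable hashCode() matters, consider how Hadoop's default HashPartitioner assigns a key to a reduce task. The class below is an illustrative sketch, not the real `org.apache.hadoop.mapreduce.lib.partition.HashPartitioner`, but it uses the same sign-masked modulo formula:

```java
public class HashPartitionSketch {
    // Mask off the sign bit so the result is non-negative, then take the
    // remainder modulo the number of reduce tasks.
    public static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Equal keys must land in the same partition on every JVM, which is
        // why hashCode() has to be deterministic across JVM instances.
        System.out.println(getPartition("foo", 10) == getPartition("foo", 10)); // true
    }
}
```

If two JVMs computed different hash codes for equal keys, equal keys emitted by different map tasks could be routed to different reducers, silently breaking the grouping contract.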
--------------
org.apache.hadoop.io.Writable
@Public
@Stable
A serializable object which implements a simple, efficient, serialization protocol, based on DataInput and DataOutput.
Any key or value type in the Hadoop Map-Reduce framework implements this interface.
Implementations typically implement a static read(DataInput) method which constructs a new instance, calls readFields(DataInput) and returns the instance.
Example:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class MyWritable implements Writable {
  // Some data
  private int counter;
  private long timestamp;

  public void write(DataOutput out) throws IOException {
    out.writeInt(counter);
    out.writeLong(timestamp);
  }

  public void readFields(DataInput in) throws IOException {
    counter = in.readInt();
    timestamp = in.readLong();
  }

  public static MyWritable read(DataInput in) throws IOException {
    MyWritable w = new MyWritable();
    w.readFields(in);
    return w;
  }
}
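The write/readFields pair can be exercised without a cluster: a minimal round trip through plain java.io streams shows the protocol at work. The nested MyWritable below is a self-contained stand-in for the example above (so this sketch compiles without the Hadoop jars), not a Hadoop type:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableRoundTrip {
    // Stand-in with the same write/readFields shape as the example above.
    static class MyWritable {
        int counter;
        long timestamp;

        void write(DataOutput out) throws IOException {
            out.writeInt(counter);
            out.writeLong(timestamp);
        }

        void readFields(DataInput in) throws IOException {
            counter = in.readInt();
            timestamp = in.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        MyWritable original = new MyWritable();
        original.counter = 42;
        original.timestamp = 1234567890L;

        // Serialize to an in-memory buffer, as Hadoop would to a stream.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        // Deserialize into a fresh instance, mirroring the static read() pattern.
        MyWritable copy = new MyWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));

        System.out.println(copy.counter);   // 42
        System.out.println(copy.timestamp); // 1234567890
    }
}
```

Note that the fields come back in the exact order they were written; the protocol carries no field names or tags, which is what makes it compact.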
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface WritableComparable<T> extends Writable, Comparable<T> {
}
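Because WritableComparable extends Comparable, the shuffle sorts map-output keys with compareTo before they reach the reducers. That ordering can be sanity-checked off-cluster; the MyKey class below is an illustrative stand-in ordered by a single int field, matching the compareTo logic in the MyWritableComparable example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KeySortSketch {
    // Illustrative stand-in for a WritableComparable key, ordered by counter.
    static class MyKey implements Comparable<MyKey> {
        final int counter;

        MyKey(int counter) {
            this.counter = counter;
        }

        public int compareTo(MyKey o) {
            return Integer.compare(this.counter, o.counter);
        }
    }

    public static void main(String[] args) {
        List<MyKey> keys = new ArrayList<>();
        keys.add(new MyKey(3));
        keys.add(new MyKey(1));
        keys.add(new MyKey(2));

        // The same ascending order the reduce phase observes for its keys.
        Collections.sort(keys);
        for (MyKey k : keys) {
            System.out.print(k.counter + " "); // 1 2 3
        }
    }
}
```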
--------------
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Writable {
  /**
   * Serialize the fields of this object to <code>out</code>.
   *
   * @param out <code>DataOutput</code> to serialize this object into.
   * @throws IOException
   */
  void write(DataOutput out) throws IOException;

  /**
   * Deserialize the fields of this object from <code>in</code>.
   *
   * <p>For efficiency, implementations should attempt to re-use storage in the
   * existing object where possible.</p>
   *
   * @param in <code>DataInput</code> to deserialize this object from.
   * @throws IOException
   */
  void readFields(DataInput in) throws IOException;
}