InputFormat牛逼（8）FileInputFormat实现类之TextInputFormat

EclipseEye

浏览: 151618 次
性别:
来自: 北京

最近访客更多访客>>

chenqisdfx

xiaohuohaoxiao

The魂狩

小小云麓

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Hadoop/MapReaduce

/** An {@link InputFormat} for plain text files.  Files are broken into lines.
 * Either linefeed or carriage-return are used to signal end of line.  Keys are
 * the position in the file, and values are the line of text.. */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class TextInputFormat extends FileInputFormat<LongWritable, Text> {

  @Override
  public RecordReader<LongWritable, Text> 
    createRecordReader(InputSplit split,
                       TaskAttemptContext context) {
    String delimiter = context.getConfiguration().get(
        "textinputformat.record.delimiter");
    byte[] recordDelimiterBytes = null;
    if (null != delimiter)
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    return new LineRecordReader(recordDelimiterBytes);
  }

  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    final CompressionCodec codec =
      new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
    if (null == codec) {
      return true;
    }
    return codec instanceof SplittableCompressionCodec;
  }

}

分享到：

InputFormat牛逼（9）FileInputFormat实 ... | InputFormat牛逼（6）org.apache.hadoop. ...

2015-03-11 00:19
浏览 595
评论(0)
分类:互联网
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论

InputFormat牛逼（8）FileInputFormat实现类之TextInputFormat

评论

发表评论

相关推荐

最近访客 更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论

InputFormat牛逼（8）FileInputFormat实现类之TextInputFormat

评论

发表评论

相关推荐

数据迁移相关（关系型数据库mysql，oracle和nosql数据库如hbase）

zookeeper适用场景：如何竞选Master及代码实现

MR/hive 数据去重

面试牛x题

使用shell并发上传文件到hdfs

hadoop集群监控工具Apache Ambari

Hadoop MapReduce优化相关

数据倾斜问题 牛逼（1）数据倾斜之MapReduce&hive

MapReduce牛逼（4）WritableComparable接口

MapReduce牛逼（3）（继承WritableComparable)实现自定义key键，实现二重排序

MapReduce牛逼（2）MR简单实现 导入数据到hbase例子

MapReduce牛逼（1）MR单词计数例子

InputFormat牛逼（9）FileInputFormat实现类之SequenceFileInputFormat

InputFormat牛逼（6）org.apache.hadoop.mapreduce.lib.db.DBRecordReader<T>

InputFormat牛逼（5）org.apache.hadoop.mapreduce.lib.db.DBInputFormat<T>

InputFormat牛逼（4）org.apache.hadoop.mapreduce.RecordReader<KEYIN, VALUEIN>

InputFormat牛逼（3）org.apache.hadoop.mapreduce.InputFormat<K, V>

InputFormat牛逼（2）org.apache.hadoop.mapreduce.InputSplit & DBInputSplit

InputFormat牛逼（1）org.apache.hadoop.mapreduce.lib.db.DBWritable

如何把hadoop2 的job作业 提交到 yarn平台

最近访客更多访客>>

数据倾斜问题牛逼（1）数据倾斜之MapReduce&hive

MapReduce牛逼（2）MR简单实现导入数据到hbase例子

如何把hadoop2 的job作业提交到 yarn平台