A Brief Look at MapReduce Programming, Part 3

fushengfei


Blog category: Distributed computing

Tags: Programming, MapReduce, Hadoop

(3) Next, we implement a custom InputFormat. The data to be processed consists of records of the form (time: URL).

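  import java.io.IOException;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.*;
  // Custom InputFormat that hands each split to TimeUrlLineRecordReader.
  public class TimeUrlTextInputInputFormat extends FileInputFormat<Text, URLWritable> {
       public RecordReader<Text, URLWritable> getRecordReader(
         InputSplit input, JobConf job, Reporter reporter) throws IOException {
      return new TimeUrlLineRecordReader(job, (FileSplit) input);
       }
  }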


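import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import org.apache.hadoop.io.Writable;

// Value type that wraps a java.net.URL so Hadoop can serialize it.
public class URLWritable implements Writable {
protected URL url;
// Hadoop needs the no-argument constructor to create instances before readFields().
public URLWritable() {}
public URLWritable(URL url) {
  this.url = url;
}
// Serialize the URL as a UTF string.
public void write(DataOutput out) throws IOException {
   out.writeUTF(url.toString());
}
// Rebuild the URL from the serialized string.
public void readFields(DataInput in) throws IOException {
   url = new URL(in.readUTF());
}
public void set(String s) throws MalformedURLException {
   url = new URL(s);
}
}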
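
To see what write() and readFields() do without a Hadoop cluster, the round trip can be tried with plain java.io streams. This is only a minimal sketch: the class UrlRoundTripSketch and its two helper methods are hypothetical names introduced here for illustration, not part of Hadoop or of the code above, and the example URL is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;

// Hypothetical stand-alone sketch: mirrors URLWritable's serialization logic
// using only the JDK, so it can be run without Hadoop on the classpath.
class UrlRoundTripSketch {

    // Same idea as URLWritable.set() followed by write(): parse, then writeUTF.
    static byte[] write(String urlText) {
        try {
            URL url = new URL(urlText);
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeUTF(url.toString());
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same idea as URLWritable.readFields(): readUTF, then rebuild the URL.
    static String read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new URL(in.readUTF()).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = write("http://example.com/index.html");
        System.out.println(bytes.length + " bytes, decoded: " + read(bytes));
    }
}
```

writeUTF stores a two-byte length followed by the encoded characters, so what gets serialized is the URL's string form. This is also why readFields() can throw: a string that is not a valid URL fails when the URL is rebuilt.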


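import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueLineRecordReader;
import org.apache.hadoop.mapred.RecordReader;

// Wraps KeyValueLineRecordReader and converts each value from Text to URLWritable.
class TimeUrlLineRecordReader implements RecordReader<Text, URLWritable> {
   private KeyValueLineRecordReader lineReader;
   private Text lineKey, lineValue;
   public TimeUrlLineRecordReader(JobConf job, FileSplit split) throws IOException {
     lineReader = new KeyValueLineRecordReader(job, split);
     lineKey = lineReader.createKey();
     lineValue = lineReader.createValue();
   }
   // Read the next line; return false once the split is exhausted.
   public boolean next(Text key, URLWritable value) throws IOException {
     if (!lineReader.next(lineKey, lineValue)) {
        return false;
     }
     key.set(lineKey);
     // Throws MalformedURLException (an IOException) if the value is not a valid URL.
     value.set(lineValue.toString());
     return true;
   }
   public Text createKey() {
     return new Text("");
   }
   public URLWritable createValue() {
     return new URLWritable();
   }
   // Position and progress are delegated to the wrapped reader.
   public long getPos() throws IOException {
     return lineReader.getPos();
   }
   public float getProgress() throws IOException {
     return lineReader.getProgress();
   }
   public void close() throws IOException {
     lineReader.close();
   }
}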
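
The per-record work done in next() can be illustrated without Hadoop. The sketch below assumes the key and value are separated by a tab, which is the default separator of KeyValueLineRecordReader; the separator used in the author's actual data is not stated in the post. The class TimeUrlLineSketch, its parse method, and the sample line are hypothetical names and data used only for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-alone sketch of what happens to one input line:
// split it into key and value, then turn the value into a URL.
class TimeUrlLineSketch {

    // Assumption: key and value are separated by the first tab on the line.
    // If there is no tab, the whole line becomes the key and the value is empty.
    static Map.Entry<String, URL> parse(String line) {
        int tab = line.indexOf('\t');
        String key = tab < 0 ? line : line.substring(0, tab);
        String value = tab < 0 ? "" : line.substring(tab + 1);
        try {
            return new SimpleEntry<>(key, new URL(value));
        } catch (MalformedURLException e) {
            // In the record reader this would surface as an IOException from next().
            throw new IllegalArgumentException("not a valid URL: '" + value + "'", e);
        }
    }

    public static void main(String[] args) {
        Map.Entry<String, URL> record = parse("12:33\thttp://example.com/index.html");
        System.out.println("key=" + record.getKey() + " value=" + record.getValue());
    }
}
```

One consequence worth noting: a single malformed line makes the URL conversion fail, so a production reader may want to skip or count bad records rather than let the exception end the task.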

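7. Output Formats (OutputFormat)

The classes in Hadoop that implement the OutputFormat interface include the following:

TextOutputFormat<K,V>: writes each record as one line of text, with the key and value separated by a tab. The separator can be changed through the mapred.textoutputformat.separator property.

SequenceFileOutputFormat<K,V>: writes the output as a sequence file; it is used together with SequenceFileInputFormat.

NullOutputFormat<K,V>: outputs nothing.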
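
As a rough illustration of the line layout that TextOutputFormat produces, the sketch below formats one record with a configurable separator. It is a simplified stand-in, not Hadoop's implementation: the class TextLineSketch and the method formatLine are hypothetical, and plain Java objects take the place of Writable keys and values.

```java
// Hypothetical stand-alone sketch of a "key<separator>value" text line,
// the layout described above for TextOutputFormat.
class TextLineSketch {

    // A null key or value is simply left out, along with the separator;
    // if both are null, nothing is written for the record.
    static String formatLine(Object key, Object value, String separator) {
        if (key == null && value == null) {
            return "";
        }
        StringBuilder line = new StringBuilder();
        if (key != null) {
            line.append(key);
        }
        if (key != null && value != null) {
            line.append(separator);
        }
        if (value != null) {
            line.append(value);
        }
        return line.append('\n').toString();
    }

    public static void main(String[] args) {
        // Default tab separator, then a comma as a replacement separator.
        System.out.print(formatLine("12:33", "http://example.com/", "\t"));
        System.out.print(formatLine("12:33", "http://example.com/", ","));
    }
}
```

In a real job the separator would come from the mapred.textoutputformat.separator property rather than from a method argument.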

2010-12-03 12:33
Category: Enterprise architecture