Hadoop MapReduce Study Notes (1): Preface and Preparation

guoyunsky

Categories: Hadoop, MapReduce

This is an original article. If you repost it, please cite the source: http://guoyunsky.iteye.com/blog/1233707

Next: Hadoop MapReduce Study Notes (2): Preface and Preparation 2

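I have finally stepped into the world of Hadoop: first I learned Sqoop, then MapReduce. In this series I use MapReduce to implement a range of SQL-like operations, such as max, min, order by, inner/left/right join, and group by. These posts are only a record of my own learning, so they still contain plenty of gaps and mistakes, but I will go deeper step by step and keep improving them, and I hope they help others too. I also plan to follow up with related topics, for example reading the Pig and Hive source code to see how they organize and write their MapReduce jobs, along with practical experience and lessons from work. Material on this subject is still fairly scarce, especially systematic material.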

To start, here are a few preparation classes, used to generate test data and to set up a small test framework.

First, the parent class for the tests; see the comments for details:

package com.guoyun.hadoop.mapreduce.study;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
 * Parent class for the MapReduce tests
 */
public abstract class MyMapReduceTest {
  
  public static final Logger log=LoggerFactory.getLogger(MyMapReduceTest.class);
  public static final String DEFAULT_INPUT_PATH="testDatas/mapreduce/MRInput";
  public static final String DEFAULT_OUTPUT_PATH="testDatas/mapreduce/MROutput";
  public static final String NEW_LINE="\r";
  public static final int DEFAULT_LENGTH=1000;
  protected String inputPath=DEFAULT_INPUT_PATH;    // Hadoop input path
  protected String outputPath=DEFAULT_OUTPUT_PATH;  // Hadoop output path
  protected boolean isGenerateDatas=false;          // whether to generate test data
  protected long maxValue=Long.MIN_VALUE;           // largest generated value, for comparison with the job result
  protected long minValue=Long.MAX_VALUE;           // smallest generated value, for comparison with the job result
  
  
  public MyMapReduceTest(long dataLength) throws Exception {
    this(dataLength,DEFAULT_INPUT_PATH,DEFAULT_OUTPUT_PATH);
  }
  
  /**
   * This constructor does not generate test data.
   * @param outputPath Hadoop output path
   */
  public MyMapReduceTest(String outputPath) {
    this.outputPath=outputPath;
  }
  
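  /**
   * This constructor does not generate test data; it reuses the data already at inputPath.
   * @param inputPath Hadoop input path (existing data)
   * @param outputPath Hadoop output path
   */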
  public MyMapReduceTest(String inputPath,String outputPath) {
    this.inputPath=inputPath;
    this.outputPath=outputPath;
  }
  
  
  public MyMapReduceTest(long dataLength,String inputPath, String outputPath) throws Exception {
    this.inputPath = inputPath;
    this.outputPath = outputPath;
    isGenerateDatas=true;
    init(dataLength);
  }

  public String getInputPath() {
    return inputPath;
  }

  public void setInputPath(String inputPath) {
    this.inputPath = inputPath;
  }

  public String getOutputPath() {
    return outputPath;
  }

  public void setOutputPath(String outputPath) {
    this.outputPath = outputPath;
  }

  public long getMaxValue() {
    return maxValue;
  }

  public void setMaxValue(long maxValue) {
    this.maxValue = maxValue;
  }
  
  public long getMinValue() {
    return minValue;
  }

  public void setMinValue(long minValue) {
    this.minValue = minValue;
  }

  public boolean isGenerateDatas() {
    return isGenerateDatas;
  }
  
  /**
   * Initialization: generates test data when isGenerateDatas is set.
   *
   * @param length amount of test data to generate
   * @throws Exception
   */
  private void init(long length) throws Exception{
    if(isGenerateDatas){
      generateDatas(length);
    }
  }
  
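  /**
   * Generates test data and writes it to inputPath.
   * Implemented by subclasses according to the needs of each test.
   *
   * @param length amount of test data to generate
   * @throws Exception
   */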
  protected abstract void generateDatas(long length) throws Exception;
 
}
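
To make the role of generateDatas more concrete, here is a minimal sketch of what a subclass's implementation might do: write random long values, one per line, and keep track of the smallest and largest. It is not code from this series. The class name RandomLongDataSketch, the generate method and the seed parameter are hypothetical, it is an assumption that length means the number of lines, and it writes to the local file system with the platform line separator instead of using NEW_LINE. It is kept independent of MyMapReduceTest so that it compiles on its own.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

/** Hypothetical helper: writes random longs, one per line, and tracks min/max. */
final class RandomLongDataSketch {

  /**
   * Writes `length` random longs to `path`, one value per line.
   * Returns a two-element array: {smallest value written, largest value written}.
   */
  static long[] generate(long length, String path, long seed) throws IOException {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive");
    }
    // Same starting points as minValue/maxValue in the parent class.
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    Random random = new Random(seed);

    Path file = Paths.get(path);
    if (file.getParent() != null) {
      // Create e.g. testDatas/mapreduce/ if it does not exist yet.
      Files.createDirectories(file.getParent());
    }
    try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      for (long i = 0; i < length; i++) {
        long value = random.nextLong();
        min = Math.min(min, value);
        max = Math.max(max, value);
        writer.write(Long.toString(value));
        writer.newLine();
      }
    }
    return new long[] {min, max};
  }

  public static void main(String[] args) throws IOException {
    // Write to a temporary file so the demo leaves the project directory untouched.
    Path tmp = Files.createTempFile("MRInput", ".txt");
    long[] minMax = generate(1000, tmp.toString(), 42L);
    System.out.println("min=" + minMax[0] + ", max=" + minMax[1]);
  }
}
```

A subclass could call such a helper from generateDatas and pass the two results to setMinValue and setMaxValue, so that a max or min job's output can be checked against them. One thing to watch for: the parent constructor calls generateDatas before the subclass's own field initializers have run, so the implementation should rely only on its parameters and on fields inherited from MyMapReduceTest.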

2011-11-03 18:11