The ToolRunner Mechanism

linest

Blog category:

mahout

Define the framework interface.
Concrete classes provide the implementation.

public interface Tool extends Configurable {
  int run(String [] args) throws Exception;
}

ToolRunner
A single, unified entry point.
It parses the arguments according to the configuration, then calls the interface method.

  public static int run(Configuration conf, Tool tool, String[] args) 
    throws Exception{
    if(conf == null) {
      conf = new Configuration();
    }
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    //set the configuration back, so that Tool can configure itself
    tool.setConf(conf);
    
    //get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs();
    return tool.run(toolArgs);
  }
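
To make the three steps above (parse the generic options, hand the configuration to the tool, run it with the remaining arguments) easier to follow, here is a minimal sketch that needs nothing beyond the JDK. Everything in it is a simplified stand-in and not Hadoop's own code: `SimpleTool`, `parseGenericOptions`, and `EchoTool` are hypothetical names, the configuration is a plain `Map`, and only the `-D key=value` form of generic option is handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT Hadoop's classes.
class ToolRunnerSketch {

    /** Stand-in for a configurable tool: it receives settings, then runs with the leftover args. */
    interface SimpleTool {
        void setConf(Map<String, String> conf);
        int run(String[] args);
    }

    /** Moves "-D key=value" pairs into conf and returns all other arguments in their original order. */
    static String[] parseGenericOptions(Map<String, String> conf, String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    /** Same order of steps as the method above: parse, setConf, run. */
    static int run(Map<String, String> conf, SimpleTool tool, String[] args) {
        if (conf == null) {
            conf = new HashMap<>();
        }
        String[] toolArgs = parseGenericOptions(conf, args);
        tool.setConf(conf);           // the tool sees the settings before it runs
        return tool.run(toolArgs);    // only tool-specific arguments are passed on
    }

    /** Hypothetical tool that records what it was given. */
    static class EchoTool implements SimpleTool {
        Map<String, String> conf;
        String[] seenArgs;

        public void setConf(Map<String, String> conf) { this.conf = conf; }

        public int run(String[] args) {
            this.seenArgs = args;
            return 0;
        }
    }

    public static void main(String[] args) {
        EchoTool tool = new EchoTool();
        int code = run(null, tool,
                new String[] {"-D", "mapred.reduce.tasks=2", "--input", "in"});
        System.out.println(code + " " + tool.conf + " " + String.join(",", tool.seenArgs));
    }
}
```

The point of the split is that the tool never has to recognise the generic options itself: by the time its `run` method is called, they are already in the configuration and gone from the argument array.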

A concrete invocation example in Mahout

  public static void main(String[] args) throws Exception {
    ToolRunner.run(new Configuration(), new MinHashDriver(), args);
  }

Override the method, extract the parameters, and call the core method

  @Override
  public int run(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
    addInputOption();
    addOutputOption();
    //...........
    runJob(input,
           output,
           minClusterSize,
           minVectorSize,
           hashType,
           numHashFunctions,
           keyGroups,
           numReduceTasks,
           debugOutput);
    return 0;
  }
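
The parameter extraction is elided in the listing above, so the following is only a rough, JDK-only sketch of what "extract the parameters" can look like in general. It is not Mahout's option handling: `OptionSketch` and `parse` are hypothetical, the `--name value` flag spellings are assumed from the variable names in the listing, and the fallback value of 10 is an arbitrary placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; Mahout uses its own option-parsing code.
class OptionSketch {

    /** Collects "--name value" pairs; a "--name" with no value after it is stored as "true". */
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Unexpected argument: " + args[i]);
            }
            String name = args[i].substring(2);
            boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("--");
            opts.put(name, hasValue ? args[++i] : "true");
        }
        return opts;
    }

    public static void main(String[] args) {
        // Assumed flag names, chosen to match the parameter names in the listing.
        Map<String, String> opts = parse(new String[] {
            "--input", "/data/in", "--output", "/data/out",
            "--numHashFunctions", "10", "--debugOutput"});

        // Convert the raw strings to typed values before handing them to the core method.
        int numHashFunctions = Integer.parseInt(opts.getOrDefault("numHashFunctions", "10"));
        boolean debugOutput = Boolean.parseBoolean(opts.getOrDefault("debugOutput", "false"));

        System.out.println(opts.get("input") + " -> " + opts.get("output")
                + ", hashes=" + numHashFunctions + ", debug=" + debugOutput);
    }
}
```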

The core method: configure the job and start the MapReduce task

  private void runJob(Path input, 
                      Path output,
                      int minClusterSize,
                      int minVectorSize, 
                      String hashType, 
                      int numHashFunctions, 
                      int keyGroups,
                      int numReduceTasks, 
                      boolean debugOutput) throws IOException, ClassNotFoundException, InterruptedException {
    Configuration conf = getConf();

    // Set configuration parameters ........................
    Job job = new Job(conf, "MinHash Clustering");
    job.setJarByClass(MinHashDriver.class);

    // Set Job parameters .........................
    job.waitForCompletion(true);
  }
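
In the listings above, the boolean returned by `job.waitForCompletion(true)` is dropped, `run` always returns 0, and `main` ignores what `ToolRunner.run` returns. A possible variation, shown here only as a sketch and not as what the quoted driver does, is to turn the job's success flag into the exit code and pass it on to the process. `ExitCodeSketch` and its methods are hypothetical, and a `BooleanSupplier` stands in for the real job so the example runs without Hadoop.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; the BooleanSupplier plays the role of job.waitForCompletion(true).
class ExitCodeSketch {

    /** Maps a job's success flag to a process-style exit code (0 means success). */
    static int toExitCode(boolean succeeded) {
        return succeeded ? 0 : 1;
    }

    /** Stand-in for a driver's run method that reports the job outcome instead of always returning 0. */
    static int run(BooleanSupplier job) {
        return toExitCode(job.getAsBoolean());
    }

    public static void main(String[] args) {
        int code = run(() -> true);   // pretend the job succeeded
        System.out.println("exit code: " + code);
        System.exit(code);            // lets shell scripts detect a failed job
    }
}
```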

2012-01-26 11:57