How Solr decides whether a request is distributed: HttpShardHandler

suichangkele

Blog category: solr

Tags: solr, distributed requests, HttpShardHandler

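In SolrCloud, most of the queries we issue are sent to multiple shards. Sometimes, though, I already know which shard holds the data I want, so how can the request be sent to just that one shard? This post looks at how SolrCloud decides whether a request is distributed. The code is in HttpShardHandler; see the checkDistributed method: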
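To make the single-shard case concrete, here is a minimal sketch of building query URLs that carry either the `shards` parameter or the `_route_` parameter, the two request parameters that checkDistributed reads. The host `localhost:8983`, the collection name `collection1`, the shard name `shard1`, and the class and method names are all assumptions made for illustration; only the parameter names come from the Solr code discussed in this post.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: builds /select URLs with plain JDK classes.
class SingleShardQueryUrl {
    // Assumed placeholders; replace with your own node address and collection.
    static final String BASE_URL = "http://localhost:8983/solr";
    static final String COLLECTION = "collection1";

    // Appends URL-encoded key=value pairs to the collection's /select endpoint.
    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder(BASE_URL)
                .append('/').append(COLLECTION).append("/select");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            sep = '&';
        }
        return sb.toString();
    }

    // Names the target shard explicitly through the "shards" parameter.
    static String forShard(String shardName) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("shards", shardName);
        return build(p);
    }

    // Lets the collection's router pick the shard from the "_route_" parameter.
    static String forRoute(String routeValue) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("_route_", routeValue);
        return build(p);
    }

    public static void main(String[] args) {
        System.out.println(forShard("shard1"));
        System.out.println(forRoute("shard1"));
    }
}
```

With `shards`, the caller decides the target shard directly; with `_route_`, the decision is left to the collection's router, so the value has to be something that router understands.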

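 /** 
   * Decides whether this request is distributed, based on whether ZooKeeper is in use.
   * If it is, finds the shards chosen by the Router and adds the URLs of each
   * shard's replicas, separated by |, to rb's shards and slices.
   */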
  @Override
  public void checkDistributed(ResponseBuilder rb) {
    
    SolrQueryRequest req = rb.req;
    SolrParams params = req.getParams();
    
    rb.isDistrib = params.getBool("distrib", req.getCore().getCoreDescriptor().getCoreContainer().isZooKeeperAware());// Check the distrib parameter first; if it is set, use it, otherwise default to whether ZooKeeper is enabled.
    String shards = params.get(ShardParams.SHARDS);// The shards parameter given in the request.
    
    // for back compat, a shards param with URLs like localhost:8983/solr will mean that this
    // search is distributed.
    boolean hasShardURL = shards != null && shards.indexOf('/') > 0;
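    rb.isDistrib = hasShardURL | rb.isDistrib;// Three inputs decide whether a request is distributed (i.e. forwarded to multiple shards): distrib, whether ZooKeeper is in use, and whether shards contains a URL.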
    
    if (rb.isDistrib) {// The request is distributed.
      
      // since the cost of grabbing cloud state is still up in the air, we grab it only if we need it.
      ClusterState clusterState = null;
      Map<String,Slice> slices = null;
      CoreDescriptor coreDescriptor = req.getCore().getCoreDescriptor();
      CloudDescriptor cloudDescriptor = coreDescriptor.getCloudDescriptor();
      ZkController zkController = coreDescriptor.getCoreContainer().getZkController();
      
      if (shards != null) {// If the request specifies shards, use the given shards.
        List<String> lst = StrUtils.splitSmart(shards, ",", true);// Several shards can be listed, separated by ASCII commas.
        rb.shards = lst.toArray(new String[lst.size()]);
        rb.slices = new String[rb.shards.length];
        
        if (zkController != null) {
          // figure out which shards are slices
          for (int i = 0; i < rb.shards.length; i++) {
            if (rb.shards[i].indexOf('/') < 0) {
              // this is a logical shard
              rb.slices[i] = rb.shards[i];
              rb.shards[i] = null;
            }
          }
        }
      } else if (zkController != null) {// shards was not specified and ZooKeeper is in use.
        
        // we weren't provided with an explicit list of slices to query via "shards", so use the cluster state
        clusterState = zkController.getClusterState();
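        String shardKeys = params.get(ShardParams._ROUTE_);// shardKeys is the _route_ request parameter, which names the shard(s) to route to. Any Router can use this value (the Implicit router, for example, can use a field's name to specify the shard to search).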
        
        // This will be the complete list of slices we need to query for this request.
        slices = new HashMap<>();
        
        // we need to find out what collections this request is for.
        
        // A comma-separated list of specified collections.
        // Eg: "collection1,collection2,collection3"
        String collections = params.get("collection");// Gets the collection parameter; it may name several collections, separated by commas.
        if (collections != null) {
          // If there were one or more collections specified in the query, split
          // each parameter and store as a separate member of a List.
          List<String> collectionList = StrUtils.splitSmart(collections, ",", true);
          // In turn, retrieve the slices that cover each collection from the
          // cloud state and add them to the Map 'slices'.
          for (String collectionName : collectionList) {// Assume there is only one collection.
            // The original code produced <collection-name>_<shard-name> when the collections
            // parameter was specified (see ClientUtils.appendMap)
            // Is this necessary if only one collection is specified?
            // i.e. should we change multiCollection to collectionList.size() > 1?
            addSlices(slices, clusterState, params, collectionName, shardKeys, true);// Finds every shard to query, based on this collection's routing strategy and the request parameters. The implementation involves the DocRouter; see the post at http://suichangkele.iteye.com/blog/2363305.
          }
        } else {
          // just this collection
          String collectionName = cloudDescriptor.getCollectionName();
          addSlices(slices, clusterState, params, collectionName, shardKeys, false);
        }
        
        // Store the logical slices in the ResponseBuilder and create a new
        // String array to hold the physical shards (which will be mapped
        // later).
        rb.slices = slices.keySet().toArray(new String[slices.size()]);
        rb.shards = new String[rb.slices.length];
      }
      // ... (the excerpt stops here; the rest of checkDistributed is not shown)
    }
  }

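After reading this code, the rules SolrCloud uses to route a distributed request are clear: if we specify shards, the specified shards are used; if we do not, the collection's DocRouter decides which shards to route to, based on the _route_ parameter in the request. The DocRouter operations are covered in the post at http://suichangkele.iteye.com/blog/2363305.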
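The decision at the top of checkDistributed can be restated as a small standalone sketch with no Solr dependencies. The class and method names below are hypothetical, `String.split` stands in for Solr's `StrUtils.splitSmart` as a simplification, and the slice classification is shown unconditionally even though the real method only performs it when a ZkController is present.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical restatement of the checks in checkDistributed; not Solr code.
class DistribDecisionSketch {

    // distribParam is null when the request carries no "distrib" parameter,
    // in which case the default is whether the node is ZooKeeper-aware.
    // A shards value containing a URL (a '/' after position 0) forces distribution.
    static boolean isDistributed(Boolean distribParam, boolean zkAware, String shards) {
        boolean isDistrib = (distribParam != null) ? distribParam : zkAware;
        boolean hasShardUrl = shards != null && shards.indexOf('/') > 0;
        return hasShardUrl || isDistrib;
    }

    // Entries without a '/' are treated as logical slice names;
    // entries with a '/' are shard URLs and are skipped here.
    static List<String> logicalSlices(String shards) {
        List<String> out = new ArrayList<>();
        for (String entry : shards.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty() && trimmed.indexOf('/') < 0) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(isDistributed(null, true, null));
        System.out.println(isDistributed(Boolean.FALSE, true, "localhost:8983/solr"));
        System.out.println(logicalSlices("shard1,localhost:8983/solr/core1,shard2"));
    }
}
```

One consequence worth noting: an explicit `distrib=false` is overridden when `shards` contains a URL, because the two conditions are combined with OR.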

2017-03-17 10:13