xangqun

Lucene Study Notes, Part 7: An Analysis of the Lucene Search Process (5) (repost)


2.4. Searching the Query Object

 

 

2.4.2. Creating the Scorer and SumScorer Object Trees

Once the Weight object tree has been built, IndexSearcher.search(Weight, Filter, int) is called. Its code is as follows:

 

//Body of IndexSearcher.search(Weight, Filter, int)
//(a) Create the document collector
TopScoreDocCollector collector = TopScoreDocCollector.create(nDocs, !weight.scoresDocsOutOfOrder());
search(weight, filter, collector);
//(b) Return the search results
return collector.topDocs();

public void search(Weight weight, Filter filter, Collector collector)
    throws IOException {
  if (filter == null) {
    for (int i = 0; i < subReaders.length; i++) {
      collector.setNextReader(subReaders[i], docStarts[i]);
      //(c) Create the Scorer object tree, and the SumScorer tree used to merge the posting lists
      Scorer scorer = weight.scorer(subReaders[i], !collector.acceptsDocsOutOfOrder(), true);
      if (scorer != null) {
        //(d) Merge the posting lists, (e) collect the document IDs
        scorer.score(collector);
      }
    }
  } else {
    for (int i = 0; i < subReaders.length; i++) {
      collector.setNextReader(subReaders[i], docStarts[i]);
      searchWithFilter(subReaders[i], weight, filter, collector);
    }
  }
}

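This section focuses on (c): creating the Scorer object tree and the SumScorer tree used to merge the posting lists. Section 2.4.3 analyzes (d), merging the posting lists. Section 2.4.4 analyzes the creation of the result collector (a), the collection of result documents (e), and the return of the documents (b).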
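
To make steps (c) to (e) concrete, here is a toy model in plain Java. It does not use Lucene's API: `Segment`, `searchAll`, and the sample postings are hypothetical names and data. Only the shape of the loop follows the code above: for each sub-reader, note its document base, skip it if no scorer is available, and otherwise hand every matching document to the collector, which is assumed to turn segment-local IDs into global ones by adding the base.

```java
import java.util.ArrayList;
import java.util.List;

final class SegmentSearchSketch {
    /** Hypothetical stand-in for one sub-reader. */
    static final class Segment {
        final int maxDoc;      // number of documents in this segment
        final int[] matches;   // segment-local matching doc IDs; null models a null Scorer
        Segment(int maxDoc, int[] matches) {
            this.maxDoc = maxDoc;
            this.matches = matches;
        }
    }

    /** Mirrors the filter == null branch of search(Weight, Filter, Collector). */
    static List<Integer> searchAll(List<Segment> segments) {
        List<Integer> collected = new ArrayList<>();
        int docBase = 0; // plays the role of docStarts[i]
        for (Segment segment : segments) {
            if (segment.matches != null) {               // like "scorer != null"
                for (int localDoc : segment.matches) {
                    collected.add(docBase + localDoc);   // rebase to a global doc ID
                }
            }
            docBase += segment.maxDoc;
        }
        return collected;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(4, new int[] {0, 2}));
        segments.add(new Segment(3, null));
        segments.add(new Segment(5, new int[] {1}));
        System.out.println(searchAll(segments)); // [0, 2, 8]
    }
}
```

The second segment contributes nothing, but its size still shifts the base of the third one, which is why local document 1 is reported as 8.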

The code of BooleanQuery$BooleanWeight.scorer(IndexReader, boolean, boolean) is as follows:

 

public Scorer scorer(IndexReader reader, boolean scoreDocsInOrder, boolean topScorer) throws IOException {
  //Scorers for the MUST clauses
  List<Scorer> required = new ArrayList<Scorer>();
  //Scorers for the MUST_NOT clauses
  List<Scorer> prohibited = new ArrayList<Scorer>();
  //Scorers for the SHOULD clauses
  List<Scorer> optional = new ArrayList<Scorer>();
  //Walk every sub-clause, create its sub-Scorer, and add it to the matching list. This is a recursive process.
  Iterator<BooleanClause> cIter = clauses.iterator();
  for (Weight w : weights) {
    BooleanClause c = cIter.next();
    Scorer subScorer = w.scorer(reader, true, false);
    if (subScorer == null) {
      //A MUST clause without any match means the whole query cannot match
      if (c.isRequired()) {
        return null;
      }
    } else if (c.isRequired()) {
      required.add(subScorer);
    } else if (c.isProhibited()) {
      prohibited.add(subScorer);
    } else {
      optional.add(subScorer);
    }
  }
  //This case is described in detail in the section on BooleanScorer and scoreDocsInOrder
  if (!scoreDocsInOrder && topScorer && required.size() == 0 && prohibited.size() < 32) {
    return new BooleanScorer(similarity, minNrShouldMatch, optional, prohibited);
  }
  //Create the Scorer object tree and, along with it, the SumScorer object tree
  return new BooleanScorer2(similarity, minNrShouldMatch, required, prohibited, optional);
}
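
The sorting of sub-scorers into three lists can be tried out without Lucene. In the following sketch, `Clause`, `Occur`, `sort`, and `usesBooleanScorer` are hypothetical names; a clause with `hasMatches == false` stands in for a sub-weight whose scorer() returned null. The predicate at the end simply restates the condition from the code above under which BooleanScorer is chosen instead of BooleanScorer2.

```java
import java.util.ArrayList;
import java.util.List;

final class ClauseSortingSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** Hypothetical clause; hasMatches == false models a null sub-scorer. */
    static final class Clause {
        final String name;
        final Occur occur;
        final boolean hasMatches;
        Clause(String name, Occur occur, boolean hasMatches) {
            this.name = name;
            this.occur = occur;
            this.hasMatches = hasMatches;
        }
    }

    /** Returns [required, prohibited, optional], or null if a MUST clause cannot match. */
    static List<List<String>> sort(List<Clause> clauses) {
        List<String> required = new ArrayList<>();
        List<String> prohibited = new ArrayList<>();
        List<String> optional = new ArrayList<>();
        for (Clause c : clauses) {
            if (!c.hasMatches) {
                if (c.occur == Occur.MUST) {
                    return null; // the whole query cannot match
                }
                // non-matching SHOULD and MUST_NOT clauses are silently dropped
            } else if (c.occur == Occur.MUST) {
                required.add(c.name);
            } else if (c.occur == Occur.MUST_NOT) {
                prohibited.add(c.name);
            } else {
                optional.add(c.name);
            }
        }
        List<List<String>> out = new ArrayList<>();
        out.add(required);
        out.add(prohibited);
        out.add(optional);
        return out;
    }

    /** Restates the condition guarding "new BooleanScorer(...)" in the code above. */
    static boolean usesBooleanScorer(boolean scoreDocsInOrder, boolean topScorer,
                                     int requiredCount, int prohibitedCount) {
        return !scoreDocsInOrder && topScorer && requiredCount == 0 && prohibitedCount < 32;
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause("apple", Occur.MUST, true));
        clauses.add(new Clause("boy", Occur.MUST_NOT, true));
        clauses.add(new Clause("cat", Occur.SHOULD, true));
        clauses.add(new Clause("dog", Occur.SHOULD, false));
        System.out.println(sort(clauses)); // [[apple], [boy], [cat]]
    }
}
```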

For a leaf node that is a TermWeight, the code of TermQuery$TermWeight.scorer(IndexReader, boolean, boolean) is as follows:

 

 

public Scorer scorer(IndexReader reader, boolean scoreDocsInOrder, boolean topScorer) throws IOException {
  //The posting list of this Term
  TermDocs termDocs = reader.termDocs(term);
  if (termDocs == null)
    return null;
  return new TermScorer(this, termDocs, similarity, reader.norms(term.field()));
}

TermScorer(Weight weight, TermDocs td, Similarity similarity, byte[] norms) {
  super(similarity);
  this.weight = weight;
  this.termDocs = td;
  //The normalization factors
  this.norms = norms;
  //The score component computed earlier: queryNorm*idf^2*t.getBoost()
  this.weightValue = weight.getValue();
  //Precompute tf(i) * weightValue for the first SCORE_CACHE_SIZE term frequencies
  for (int i = 0; i < SCORE_CACHE_SIZE; i++)
    scoreCache[i] = getSimilarity().tf(i) * weightValue;
}
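
The score cache built in the constructor can be illustrated on its own. This sketch assumes a cache size of 32 (the debugger output later in this section shows `scoreCache float[32]`) and assumes tf is the square root of the term frequency, which is what DefaultSimilarity uses; a custom Similarity may differ. `rawScore` is a hypothetical helper showing how such a cache would typically be consulted: small frequencies hit the cache, larger ones are computed directly. Norms are left out.

```java
final class ScoreCacheSketch {
    static final int SCORE_CACHE_SIZE = 32;

    /** Assumed tf: sqrt(freq), as in DefaultSimilarity. */
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    /** Same loop as in the TermScorer constructor above. */
    static float[] buildCache(float weightValue) {
        float[] cache = new float[SCORE_CACHE_SIZE];
        for (int i = 0; i < SCORE_CACHE_SIZE; i++) {
            cache[i] = tf(i) * weightValue;
        }
        return cache;
    }

    /** Hypothetical lookup: cached value for small freq, direct computation otherwise. */
    static float rawScore(float[] cache, float weightValue, int freq) {
        return freq < SCORE_CACHE_SIZE ? cache[freq] : tf(freq) * weightValue;
    }

    public static void main(String[] args) {
        float weightValue = 1.5f; // made-up value of queryNorm*idf^2*boost
        float[] cache = buildCache(weightValue);
        System.out.println(rawScore(cache, weightValue, 4));   // 3.0
        System.out.println(rawScore(cache, weightValue, 100)); // 15.0
    }
}
```

The point of the cache is that most terms occur only a few times per document, so the multiplication and square root are paid once per scorer rather than once per document.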

For a leaf node that is a ConstantWeight, ConstantScoreQuery$ConstantWeight.scorer(IndexReader, boolean, boolean) creates a ConstantScorer, whose constructor is as follows:

 

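public ConstantScorer(Similarity similarity, IndexReader reader, Weight w) throws IOException {
  super(similarity);
  //The constant score shared by every matching document
  theScore = w.getValue();
  //Get all matching document IDs as one uniform posting list that takes part in posting-list merging.
  DocIdSet docIdSet = filter.getDocIdSet(reader);
  //Iterator over those document IDs, kept in the docIdSetIterator field
  docIdSetIterator = docIdSet.iterator();
}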
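
The idea behind ConstantScorer, one merged list of document IDs that all receive the same score, can be sketched with `java.util.BitSet` standing in for the DocIdSet. The class and method names here are hypothetical, and the bits set in `main` are made-up data.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

final class ConstantScorerSketch {
    /** Walks the set bits in increasing order and gives every document the same score. */
    static Map<Integer, Float> scoreAll(BitSet matchingDocs, float theScore) {
        Map<Integer, Float> scores = new LinkedHashMap<>();
        for (int doc = matchingDocs.nextSetBit(0); doc >= 0; doc = matchingDocs.nextSetBit(doc + 1)) {
            scores.put(doc, theScore);
        }
        return scores;
    }

    public static void main(String[] args) {
        BitSet docs = new BitSet();
        docs.set(1);
        docs.set(4);
        System.out.println(scoreAll(docs, 0.5f)); // {1=0.5, 4=0.5}
    }
}
```

Because the bit set already yields document IDs in ascending order, it behaves like any other posting list when it is later merged with the lists of sibling clauses.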

For a BooleanWeight, the object finally produced is a BooleanScorer2. Its constructor is as follows:

 

 

public BooleanScorer2(Similarity similarity, int minNrShouldMatch,
    List<Scorer> required, List<Scorer> prohibited, List<Scorer> optional) {
  super(similarity);
  //Collects the statistics needed for the coord term of the scoring formula
  coordinator = new Coordinator();
  this.minNrShouldMatch = minNrShouldMatch;
  //The SHOULD part
  optionalScorers = optional;
  coordinator.maxCoord += optional.size();
  //The MUST part
  requiredScorers = required;
  coordinator.maxCoord += required.size();
  //The MUST_NOT part (does not count towards maxCoord)
  prohibitedScorers = prohibited;
  //Precompute the coord value for every possible case
  coordinator.init();
  //Create the SumScorer in preparation for merging the posting lists
  countingSumScorer = makeCountingSumScorer();
}

Coordinator.init() {
  coordFactors = new float[maxCoord + 1];
  Similarity sim = getSimilarity();
  for (int i = 0; i <= maxCoord; i++) {
    //Relates the number of sub-clauses a document satisfies (i) to the total number of sub-clauses (maxCoord).
    //Naturally, the more sub-clauses a document satisfies, the higher its score.
    coordFactors[i] = sim.coord(i, maxCoord);
  }
}
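
As a small worked example of Coordinator.init(), the following sketch fills the coordFactors table under the assumption that coord(overlap, maxOverlap) is overlap / maxOverlap, which is the DefaultSimilarity behavior; other Similarity implementations may return something else. `CoordSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

final class CoordSketch {
    /** Assumed coord: fraction of sub-clauses matched, as in DefaultSimilarity. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Same loop as Coordinator.init() above. */
    static float[] coordFactors(int maxCoord) {
        float[] factors = new float[maxCoord + 1];
        for (int i = 0; i <= maxCoord; i++) {
            factors[i] = coord(i, maxCoord);
        }
        return factors;
    }

    public static void main(String[] args) {
        // e.g. 1 MUST clause + 3 SHOULD clauses, so maxCoord = 4
        System.out.println(Arrays.toString(coordFactors(4))); // [0.0, 0.25, 0.5, 0.75, 1.0]
    }
}
```

A document that satisfies three of the four counted clauses would have its score multiplied by 0.75. MUST_NOT clauses never appear in this count because the constructor does not add them to maxCoord.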

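Besides the Scorer object tree, a SumScorer object tree is also created. It represents the relationships between the clauses and prepares for merging the posting lists.

Before analyzing BooleanScorer2.makeCountingSumScorer(), let us first look at which relationships can exist between different clauses and how they affect the merging of posting lists.

There are three main kinds of clauses: MUST, SHOULD, and MUST_NOT.

Clauses can be combined in the following main ways: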

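  • Several MUST clauses, e.g. "(+apple +boy +dog)": a ConjunctionScorer (conjunction, i.e. intersection) is created, so the posting lists are intersected.
  • MUST and SHOULD, e.g. "(+apple boy)": a ReqOptSumScorer (required, optional) is created. The posting list of the MUST clause is returned, and a document that also matches the SHOULD clause gets a higher score.
  • MUST and MUST_NOT, e.g. "(+apple -boy)": a ReqExclScorer (required, excluded) is created. The posting list of the MUST clause is returned, minus the documents in the posting list of the MUST_NOT clause.
  • Several SHOULD clauses, e.g. "(apple boy dog)": a DisjunctionSumScorer (disjunction, i.e. union) is created, so the union of the posting lists is taken.
  • SHOULD and MUST_NOT, e.g. "(apple -boy)": the SHOULD clause is treated as MUST, and a ReqExclScorer is created.
  • MUST, SHOULD, and MUST_NOT together: the MUST clause is first combined with the MUST_NOT clause into a ReqExclScorer, the SHOULD clause becomes a SingleMatchScorer by itself, and the two are then combined into a ReqOptSumScorer.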
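
The combinations listed above boil down to four set operations on posting lists. The sketch below shows them on plain sorted lists of document IDs. It is not how Lucene implements these scorers (they iterate lazily and never materialize whole sets); the names and the sample postings are hypothetical, and the additive scoring in `reqOpt` is an assumption about how the optional part raises the score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class PostingMergeSketch {
    /** ConjunctionScorer: documents present in both lists. */
    static List<Integer> conjunction(List<Integer> a, List<Integer> b) {
        TreeSet<Integer> out = new TreeSet<>(a);
        out.retainAll(b);
        return new ArrayList<>(out);
    }

    /** DisjunctionSumScorer: documents present in at least one list. */
    static List<Integer> disjunction(List<Integer> a, List<Integer> b) {
        TreeSet<Integer> out = new TreeSet<>(a);
        out.addAll(b);
        return new ArrayList<>(out);
    }

    /** ReqExclScorer: required documents minus excluded ones. */
    static List<Integer> reqExcl(List<Integer> required, List<Integer> excluded) {
        TreeSet<Integer> out = new TreeSet<>(required);
        out.removeAll(excluded);
        return new ArrayList<>(out);
    }

    /** ReqOptSumScorer: same documents as required; optional matches only add score. */
    static Map<Integer, Float> reqOpt(List<Integer> required, List<Integer> optional,
                                      float reqScore, float optScore) {
        Map<Integer, Float> scores = new LinkedHashMap<>();
        for (int doc : required) {
            scores.put(doc, optional.contains(doc) ? reqScore + optScore : reqScore);
        }
        return scores;
    }

    public static void main(String[] args) {
        List<Integer> apple = Arrays.asList(1, 2, 5); // made-up postings
        List<Integer> boy = Arrays.asList(2, 3);
        List<Integer> dog = Arrays.asList(2, 5, 7);
        System.out.println(conjunction(apple, dog));      // [2, 5]
        System.out.println(disjunction(apple, boy));      // [1, 2, 3, 5]
        System.out.println(reqExcl(apple, boy));          // [1, 5]
        System.out.println(reqOpt(apple, boy, 1f, 0.5f)); // {1=1.0, 2=1.5, 5=1.0}
    }
}
```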

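Next we analyze how the SumScorer is created.

BooleanScorer2.makeCountingSumScorer() distinguishes two cases:

  • When there are MUST clauses, it calls makeCountingSumScorerSomeReq().
  • When there are no MUST clauses, it calls makeCountingSumScorerNoReq().

First, the code of makeCountingSumScorerSomeReq is as follows: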

 

private Scorer makeCountingSumScorerSomeReq() {
  if (optionalScorers.size() == minNrShouldMatch) {
    //If the number of optional clauses exactly equals the minimum number of optional clauses
    //that must match, every optional clause becomes required. So all required and optional
    //clauses together form a ConjunctionScorer (intersection), and addProhibitedScorers then
    //adds the prohibited clauses, producing a ReqExclScorer (required, excluded).
    ArrayList<Scorer> allReq = new ArrayList<Scorer>(requiredScorers);
    allReq.addAll(optionalScorers);
    return addProhibitedScorers(countingConjunctionSumScorer(allReq));
  } else {
    //First, all required clauses form a ConjunctionScorer (intersection),
    //or a SingleMatchScorer if there is only one
    Scorer requiredCountingSumScorer =
          requiredScorers.size() == 1
          ? new SingleMatchScorer(requiredScorers.get(0))
          : countingConjunctionSumScorer(requiredScorers);
    if (minNrShouldMatch > 0) {
      //If a minimum number of optional clauses must match, part of the optional clauses
      //effectively acts as required and affects the merging of posting lists. So the
      //ConjunctionScorer (intersection) built from required and the DisjunctionSumScorer
      //(union) built from optional are combined into one ConjunctionScorer (intersection),
      //and prohibited is then added, producing a ReqExclScorer.
      return addProhibitedScorers(
                    dualConjunctionSumScorer(
                            requiredCountingSumScorer,
                            countingDisjunctionSumScorer(
                                    optionalScorers,
                                    minNrShouldMatch)));
    } else { // minNrShouldMatch == 0
      //If there is no minimum number of optional clauses to match, optional does not affect
      //the merging of posting lists; it only raises the score of documents that contain the
      //optional part. So required and prohibited first form a ReqExclScorer, and optional is
      //then added, producing a ReqOptSumScorer (required, optional).
      return new ReqOptSumScorer(
                    addProhibitedScorers(requiredCountingSumScorer),
                    optionalScorers.size() == 1
                      ? new SingleMatchScorer(optionalScorers.get(0))
                      : countingDisjunctionSumScorer(optionalScorers, 1));
    }
  }
}
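
To see which tree shape each branch of makeCountingSumScorerSomeReq produces, the decision logic can be replayed with clause counts only. In this sketch the result is a descriptive string rather than a real scorer; `shape` and `excl` are hypothetical names. `excl` models addProhibitedScorers, whose body is not shown in this article: it is assumed to return its argument unchanged when there are no prohibited clauses and to wrap it in a ReqExclScorer otherwise.

```java
final class SomeReqShapeSketch {
    /** Assumed behavior of addProhibitedScorers: wrap only if prohibited clauses exist. */
    static String excl(String req, int prohibited) {
        return prohibited == 0 ? req : "ReqExcl(" + req + ", prohibited)";
    }

    /** Replays the branches of makeCountingSumScorerSomeReq on clause counts. */
    static String shape(int required, int optional, int minNrShouldMatch, int prohibited) {
        if (optional == minNrShouldMatch) {
            // every optional clause is effectively required
            return excl("Conjunction(required+optional)", prohibited);
        }
        String req = required == 1 ? "SingleMatch(required)" : "Conjunction(required)";
        if (minNrShouldMatch > 0) {
            String opt = "Disjunction(optional, min=" + minNrShouldMatch + ")";
            return excl("DualConjunction(" + req + ", " + opt + ")", prohibited);
        }
        String opt = optional == 1 ? "SingleMatch(optional)" : "Disjunction(optional, min=1)";
        return "ReqOpt(" + excl(req, prohibited) + ", " + opt + ")";
    }

    public static void main(String[] args) {
        // one MUST, one SHOULD, one MUST_NOT, no minimum
        System.out.println(shape(1, 1, 0, 1));
        // prints: ReqOpt(ReqExcl(SingleMatch(required), prohibited), SingleMatch(optional))
    }
}
```

Note the corner case in the first branch: with no optional clauses and minNrShouldMatch equal to 0, the sizes are equal too, so a lone MUST clause ends up in a one-element conjunction rather than a SingleMatchScorer.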

Next, the code of makeCountingSumScorerNoReq is as follows:

 

private Scorer makeCountingSumScorerNoReq() {
  // minNrShouldMatch optional scorers are required, but at least 1
  int nrOptRequired = (minNrShouldMatch < 1) ? 1 : minNrShouldMatch;
  Scorer requiredCountingSumScorer;
  if (optionalScorers.size() > nrOptRequired)
    //If there are more optional clauses than the minimum number that must match, part of
    //the optional clauses acts as required and affects the merging of posting lists,
    //so a DisjunctionSumScorer is created
    requiredCountingSumScorer = countingDisjunctionSumScorer(optionalScorers, nrOptRequired);
  else if (optionalScorers.size() == 1)
    //If there is only one optional clause, a SingleMatchScorer is returned;
    //there are no posting lists to merge
    requiredCountingSumScorer = new SingleMatchScorer(optionalScorers.get(0));
  else
    //If the number of optional clauses is less than or equal to the minimum number that
    //must match, every optional clause counts as required, so a ConjunctionScorer is created
    requiredCountingSumScorer = countingConjunctionSumScorer(optionalScorers);
  //Add prohibited, producing a ReqExclScorer
  return addProhibitedScorers(requiredCountingSumScorer);
}
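
The same count-based replay works for makeCountingSumScorerNoReq. As before, this is only a sketch with hypothetical names that returns a description string, and the last line assumes that addProhibitedScorers leaves its argument untouched when there are no prohibited clauses.

```java
final class NoReqShapeSketch {
    /** Replays the branches of makeCountingSumScorerNoReq on clause counts. */
    static String shape(int optional, int minNrShouldMatch, int prohibited) {
        // at least one optional clause has to match when nothing is required
        int nrOptRequired = (minNrShouldMatch < 1) ? 1 : minNrShouldMatch;
        String req;
        if (optional > nrOptRequired) {
            req = "Disjunction(optional, min=" + nrOptRequired + ")";
        } else if (optional == 1) {
            req = "SingleMatch(optional)";
        } else {
            req = "Conjunction(optional)";
        }
        // assumed behavior of addProhibitedScorers
        return prohibited == 0 ? req : "ReqExcl(" + req + ", prohibited)";
    }

    public static void main(String[] args) {
        System.out.println(shape(2, 0, 0)); // Disjunction(optional, min=1)
        System.out.println(shape(1, 0, 1)); // ReqExcl(SingleMatch(optional), prohibited)
        System.out.println(shape(2, 2, 0)); // Conjunction(optional)
    }
}
```

The second call corresponds to a query like "(apple -boy)": the single SHOULD clause is the only thing that can produce documents, so it is treated as required and the MUST_NOT clause is subtracted from it.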

After this step, the resulting Scorer object tree is as follows:

 

scorer    BooleanScorer2  (id=50)   
   |   coordinator    BooleanScorer2$Coordinator  (id=53)   
   |   countingSumScorer    ReqOptSumScorer  (id=54)    
   |   minNrShouldMatch    0   
   |---optionalScorers    ArrayList<E>  (id=55)   
   |       |  elementData    Object[10]  (id=69)   
   |       |---[0]    BooleanScorer2  (id=73)   
   |              |  coordinator    BooleanScorer2$Coordinator  (id=74)   
   |              |  countingSumScorer    BooleanScorer2$1  (id=75)    
   |              |  minNrShouldMatch    0   
   |              |---optionalScorers    ArrayList<E>  (id=76)   
   |              |       |  elementData    Object[10]  (id=83)   
   |              |       |---[0]    ConstantScoreQuery$ConstantScorer  (id=86)    
   |              |       |       docIdSetIterator    OpenBitSetIterator  (id=88)   
   |              |       |       similarity    DefaultSimilarity  (id=64)   
   |              |       |       theScore    0.47844642   

   |              |       |       //ConstantScore(contents:cat*)
   |              |       |       this$0    ConstantScoreQuery  (id=90)   
   |              |       |---[1]    TermScorer  (id=87)   
   |              |              doc    -1   
   |              |              doc    0   
   |              |              docs    int[32]  (id=93)   
   |              |              freqs    int[32]  (id=95)   
   |              |              norms    byte[4]  (id=96)   
   |              |              pointer    0   
   |              |              pointerMax    2   
   |              |              scoreCache    float[32]  (id=98)   
   |              |              similarity    DefaultSimilarity  (id=64)   
   |              |              termDocs    SegmentTermDocs  (id=103)   

   |              |              //weight(contents:dog)
   |              |              weight    TermQuery$TermWeight  (id=106)   
   |              |              weightValue    1.1332052    
   |              |       modCount    2   
   |              |       size    2   
   |              |---prohibitedScorers    ArrayList<E>  (id=77)   
   |              |        elementData    Object[10]  (id=84)    
   |              |        size    0   
   |              |---requiredScorers    ArrayList<E>  (id=78)   
   |                       elementData    Object[10]  (id=85)    
   |                       size    0   
   |             similarity    DefaultSimilarity  (id=64)    
   |     size    1   
   |---prohibitedScorers    ArrayList<E>  (id=60)   
   |       |  elementData    Object[10]  (id=71)   
   |       |---[0]    BooleanScorer2  (id=81)   
   |              |  coordinator    BooleanScorer2$Coordinator  (id=114)   
   |              |  countingSumScorer    BooleanScorer2$1  (id=115)    
   |              |  minNrShouldMatch    0   
   |              |---optionalScorers    ArrayList<E>  (id=116)   
   |              |       |  elementData    Object[10]  (id=119)   
   |              |       |---[0]    BooleanScorer2  (id=122)   
   |              |       |       |  coordinator    BooleanScorer2$Coordinator  (id=124)   
   |              |       |       |  countingSumScorer    BooleanScorer2$1  (id=125)    
   |              |       |       |  minNrShouldMatch    0   
   |              |       |       |---optionalScorers    ArrayList<E>  (id=126)   
   |              |       |       |       |  elementData    Object[10]  (id=138)   
   |              |       |       |       |---[0]    TermScorer  (id=156)    
   |              |       |       |       |       docs    int[32]  (id=162)   
   |              |       |       |       |       freqs    int[32]  (id=163)   
   |              |       |       |       |       norms    byte[4]  (id=96)   
   |              |       |       |       |       pointer    0   
   |              |       |       |       |       pointerMax    1   
   |              |       |       |       |       scoreCache    float[32]  (id=164)   
   |              |       |       |       |       similarity    DefaultSimilarity  (id=64)   
   |              |       |       |       |       termDocs    SegmentTermDocs  (id=165) 

   |              |       |       |       |       //weight(contents:eat)  
   |              |       |       |       |       weight    TermQuery$TermWeight  (id=166)   
   |              |       |       |       |       weightValue    2.107161   
   |              |       |       |       |---[1]    TermScorer  (id=157)   
   |              |       |       |              doc    -1   
   |              |       |       |              doc    1   
   |              |       |       |              docs    int[32]  (id=171)   
   |              |       |       |              freqs    int[32]  (id=172)   
   |              |       |       |              norms    byte[4]  (id=96)   
   |              |       |       |              pointer    1   
   |              |       |       |              pointerMax    3   
   |              |       |       |              scoreCache    float[32]  (id=173)   
   |              |       |       |              similarity    DefaultSimilarity  (id=64)   
   |              |       |       |              termDocs    SegmentTermDocs  (id=180)   

   |              |       |       |             //weight(contents:cat^0.33333325)
   |              |       |       |              weight    TermQuery$TermWeight  (id=181)   
   |              |       |       |              weightValue    0.22293752    
   |              |       |       |          size    2   
   |              |       |       |---prohibitedScorers    ArrayList<E>  (id=127)   
   |              |       |       |        elementData    Object[10]  (id=140)   
   |              |       |       |        modCount    0   
   |              |       |       |        size    0   
   |              |       |       |---requiredScorers    ArrayList<E>  (id=128)   
   |              |       |               elementData    Object[10]  (id=142)   
   |              |       |               modCount    0   
   |              |       |               size    0   
   |              |       |      similarity    BooleanQuery$1  (id=129)   
   |              |       |---[1]    TermScorer  (id=123)   
   |              |              doc    -1   
   |              |              doc    3   
   |              |              docs    int[32]  (id=131)   
   |              |              freqs    int[32]  (id=132)   
   |              |              norms    byte[4]  (id=96)   
   |              |              pointer    0   
   |              |              pointerMax    1   
   |              |              scoreCache    float[32]  (id=133)   
   |              |              similarity    DefaultSimilarity  (id=64)   
   |              |              termDocs    SegmentTermDocs  (id=134)   

   |              |             //weight(contents:foods)
   |              |             weight    TermQuery$TermWeight  (id=135)   
   |              |             weightValue    2.107161    
   |              |         size    2   
   |              |---prohibitedScorers    ArrayList<E>  (id=117)   
   |              |       elementData    Object[10]  (id=120)    
   |              |       size    0   
   |              |---requiredScorers    ArrayList<E>  (id=118)   
   |                      elementData    Object[10]  (id=121)    
   |                      size    0   
   |             similarity    DefaultSimilarity  (id=64)    
   |     size    1   
   |---requiredScorers    ArrayList<E>  (id=63)   
           |  elementData    Object[10]  (id=72)   
           |---[0]    BooleanScorer2  (id=82)    
                  |    coordinator    BooleanScorer2$Coordinator  (id=183)   
                  |    countingSumScorer    ReqExclScorer  (id=184)    
                  |    minNrShouldMatch    0   
                  |---optionalScorers    ArrayList<E>  (id=185)   
                  |       elementData    Object[10]  (id=189)    
                  |       size    0   
                  |---prohibitedScorers    ArrayList<E>  (id=186)   
                  |       |  elementData    Object[10]  (id=191)   
                  |       |---[0]    TermScorer  (id=195)    
                  |                docs    int[32]  (id=197)   
                  |                freqs    int[32]  (id=198)   
                  |                norms    byte[4]  (id=96)   
                  |                pointer    0   
                  |                pointerMax    0   
                  |                scoreCache    float[32]  (id=199)   
                  |                similarity    DefaultSimilarity  (id=64)   
                  |                termDocs    SegmentTermDocs  (id=200)   

                  |                //weight(contents:boy)
                  |                weight    TermQuery$TermWeight  (id=201)   
                  |                weightValue    2.107161     
                  |         size    1   
                  |---requiredScorers    ArrayList<E>  (id=187)   
                          |   elementData    Object[10]  (id=193)   
                          |---[0]    ConstantScoreQuery$ConstantScorer  (id=203)    
                                  docIdSetIterator    OpenBitSetIterator  (id=206)   
                                  similarity    DefaultSimilarity  (id=64)   
                                  theScore    0.47844642   

                                  //ConstantScore(contents:apple*)
                                  this$0    ConstantScoreQuery  (id=207)    
                        size    1   
                similarity    DefaultSimilarity  (id=64)    
        size    1   
    similarity    DefaultSimilarity  (id=64)   

 

The resulting SumScorer object tree is as follows:

 

scorer    BooleanScorer2  (id=50)   
  |    coordinator    BooleanScorer2$Coordinator  (id=53)   
  |---countingSumScorer    ReqOptSumScorer  (id=54)    
            |---optScorer    BooleanScorer2$SingleMatchScorer  (id=79)    
            |       |    lastDocScore    NaN   
            |       |    lastScoredDoc    -1   
            |       |---scorer    BooleanScorer2  (id=73)   
            |                |    coordinator    BooleanScorer2$Coordinator  (id=74)   
            |                |---countingSumScorer    BooleanScorer2$1(DisjunctionSumScorer) (id=75)   
            |                          |    currentDoc    -1   
            |                          |    currentScore    NaN   
            |                          |    doc    -1   
            |                          |    lastDocScore    NaN   
            |                          |    lastScoredDoc    -1   
            |                          |    minimumNrMatchers    1   
            |                          |    nrMatchers    -1   
            |                          |    nrScorers    2   
            |                          |    scorerDocQueue    ScorerDocQueue  (id=243)   
            |                          |    similarity    null   
            |                          |---subScorers    ArrayList<E>  (id=76)   
            |                                    |  elementData    Object[10]  (id=83)   
            |                                   |---[0]    ConstantScoreQuery$ConstantScorer  (id=86)   
            |                                    |        doc    -1   
            |                                    |        doc    -1   
            |                                    |        docIdSetIterator    OpenBitSetIterator  (id=88)   
            |                                    |        similarity    DefaultSimilarity  (id=64)   
            |                                    |        theScore    0.47844642   

            |                                    |        //ConstantScore(contents:cat*)
            |                                    |        this$0    ConstantScoreQuery  (id=90)   
            |                                    |---[1]    TermScorer  (id=87)   
            |                                             doc    -1    
            |                                             doc    0   
            |                                             docs    int[32]  (id=93)   
            |                                             freqs    int[32]  (id=95)   
            |                                             norms    byte[4]  (id=96)   
            |                                             pointer    0   
            |                                             pointerMax    2   
            |                                             scoreCache    float[32]  (id=98)   
            |                                             similarity    DefaultSimilarity  (id=64)   
            |                                             termDocs    SegmentTermDocs  (id=103)  

            |                                             //weight(contents:dog) 
            |                                             weight    TermQuery$TermWeight  (id=106)   
            |                                             weightValue    1.1332052    
            |                size    2   
            |            this$0    BooleanScorer2  (id=73)    
            |        minNrShouldMatch    0   
            |        optionalScorers    ArrayList<E>  (id=76)   
            |        prohibitedScorers    ArrayList<E>  (id=77)   
            |        requiredScorers    ArrayList<E>  (id=78)   
            |        similarity    DefaultSimilarity  (id=64)   
            |    similarity    DefaultSimilarity  (id=64)   
            |    this$0    BooleanScorer2  (id=50)   
            |---reqScorer    ReqExclScorer  (id=80)    
                     |---exclDisi    BooleanScorer2  (id=81)    
                     |         |    coordinator    BooleanScorer2$Coordinator  (id=114)   
                     |         |---countingSumScorer    BooleanScorer2$1(DisjunctionSumScorer) (id=115)   
                     |                    |    currentDoc    -1   
                     |                    |    currentScore    NaN   
                     |                    |    doc    -1   
                     |                    |    lastDocScore    NaN   
                     |                    |    lastScoredDoc    -1   
                     |                    |    minimumNrMatchers    1   
                     |                    |    nrMatchers    -1   
                     |                    |    nrScorers    2   
                     |                    |    scorerDocQueue    ScorerDocQueue  (id=260)   
                     |                    |    similarity    null   
                     |                    |---subScorers    ArrayList<E>  (id=116)   
                     |                              |  elementData    Object[10]  (id=119)   
                     |                              |---[0]    BooleanScorer2  (id=122)    
                     |                              |       |    coordinator    BooleanScorer2$Coordinator  (id=124)   
                     |                              |       |---countingSumScorer    BooleanScorer2$1(DisjunctionSumScorer) (id=125)   
                     |                              |                  |    currentDoc    0   
                     |                              |                  |    currentScore    0.11146876   
                     |                              |                  |    doc    -1   
                     |                              |                  |    lastDocScore    NaN   
                     |                              |                  |    lastScoredDoc    -1   
                     |                              |                  |    minimumNrMatchers    1   
                     |                              |                  |    nrMatchers    1   
                     |                              |                  |    nrScorers    2   
                     |                              |                  |    scorerDocQueue    ScorerDocQueue  (id=270)   
                     |                              |                  |    similarity    null   
                     |                              |                  |---subScorers    ArrayList<E>  (id=126)   
                     |                              |                            |    elementData    Object[10]  (id=138)   
                     |                              |                            |---[0]    TermScorer  (id=156)   
                     |                              |                            |           doc    -1   
                     |                              |                            |           doc    2   
                     |                              |                            |           docs    int[32]  (id=162)   
                     |                              |                            |           freqs    int[32]  (id=163)   
                     |                              |                            |           norms    byte[4]  (id=96)   
                     |                              |                            |           pointer    0   
                     |                              |                            |           pointerMax    1   
                     |                              |                            |           scoreCache    float[32]  (id=164)   
                     |                              |                            |           similarity    DefaultSimilarity  (id=64)   
                     |                              |                            |           termDocs    SegmentTermDocs  (id=165) 

                     |                              |                            |           //weight(contents:eat)  
                     |                              |                            |           weight    TermQuery$TermWeight  (id=166)   
                     |                              |                            |           weightValue    2.107161   
                     |                              |                            |---[1]    TermScorer  (id=157)   
                     |                              |                                        doc    -1   
                     |                              |                                        doc    1   
                     |                              |                                        docs    int[32]  (id=171)   
                     |                              |                                        freqs    int[32]  (id=172)   
                     |                              |                                        norms    byte[4]  (id=96)   
                     |                              |                                        pointer    1   
                     |                              |                                        pointerMax    3   
                     |                              |                                        scoreCache    float[32]  (id=173)   
                     |                              |                                        similarity    DefaultSimilarity  (id=64)   
                     |                              |                                        termDocs    SegmentTermDocs  (id=180)   

                     |                              |                                        //weight(contents:cat^0.33333325)
                     |                              |                                       weight    TermQuery$TermWeight  (id=181)   
                     |                              |                                       weightValue    0.22293752    
                     |                              |                                    size    2   
                     |                              |                         this$0    BooleanScorer2  (id=122)   
                     |                              |             doc    -1   
                     |                              |             doc    0   
                     |                              |             minNrShouldMatch    0   
                     |                              |             optionalScorers    ArrayList<E>  (id=126)   
                     |                              |             prohibitedScorers    ArrayList<E>  (id=127)   
                     |                              |             requiredScorers    ArrayList<E>  (id=128)   
                     |                              |             similarity    BooleanQuery$1  (id=129)   
                     |                              |---[1]    TermScorer  (id=123)   
                     |                                            doc    -1    
                     |                                            doc    3   
                     |                                            docs    int[32]  (id=131)   
                     |                                            freqs    int[32]  (id=132)   
                     |                                            norms    byte[4]  (id=96)   
                     |                                            pointer    0   
                     |                                            pointerMax    1   
                     |                                            scoreCache    float[32]  (id=133)   
                     |                                            similarity    DefaultSimilarity  (id=64)   
                     |                                            termDocs    SegmentTermDocs  (id=134)  

                     |                                           //weight(contents:foods) 
                     |                                           weight    TermQuery$TermWeight  (id=135)   
                     |                                           weightValue    2.107161    
                     |                                   size    2   
                     |                         this$0    BooleanScorer2  (id=81)   
                     |               doc    -1   
                     |               doc    -1   
                     |               minNrShouldMatch    0   
                     |               optionalScorers    ArrayList<E>  (id=116)   
                     |               prohibitedScorers    ArrayList<E>  (id=117)   
                     |               requiredScorers    ArrayList<E>  (id=118)   
                     |               similarity    DefaultSimilarity  (id=64)   
                     |---reqScorer    BooleanScorer2$SingleMatchScorer  (id=237)   
                                |    doc    -1    
                                |    lastDocScore    NaN   
                                |    lastScoredDoc    -1   
                                |---scorer    BooleanScorer2  (id=82)   
                                         |    coordinator    BooleanScorer2$Coordinator  (id=183)   
                                         |---countingSumScorer    ReqExclScorer  (id=184)    
                                                    |---exclDisi    TermScorer  (id=195)   
                                                    |        doc    -1   
                                                    |        doc    -1   
                                                    |        docs    int[32]  (id=197)   
                                                    |        freqs    int[32]  (id=198)   
                                                    |        norms    byte[4]  (id=96)   
                                                    |        pointer    0   
                                                    |        pointerMax    0   
                                                    |        scoreCache    float[32]  (id=199)   
                                                    |        similarity    DefaultSimilarity  (id=64)   
                                                    |        termDocs    SegmentTermDocs  (id=200)  

                                                    |        //weight(contents:boy) 
                                                    |        weight    TermQuery$TermWeight  (id=201)   
                                                    |        weightValue    2.107161   
                                                    |---reqScorer    BooleanScorer2$2(ConjunctionScorer)  (id=281)   
                                                             |     coord    1.0   
                                                             |     doc    -1    
                                                             |     lastDoc    -1   
                                                             |     lastDocScore    NaN   
                                                             |     lastScoredDoc    -1   
                                                             |---scorers    Scorer[1]  (id=283)    
                                                                      |---[0]    ConstantScoreQuery$ConstantScorer  (id=203)    
                                                                                doc    -1   
                                                                                doc    -1   
                                                                                docIdSetIterator    OpenBitSetIterator  (id=206)   
                                                                                similarity    DefaultSimilarity  (id=64)   
                                                                                theScore    0.47844642 

                                                                               //ConstantScore(contents:apple*)  
                                                                               this$0    ConstantScoreQuery  (id=207)   
                                                                 similarity    DefaultSimilarity  (id=64)   
                                                                 this$0    BooleanScorer2  (id=82)   
                                                                 val$requiredNrMatchers    1   
                                                           similarity    null     
                                                minNrShouldMatch    0   
                                                optionalScorers    ArrayList<E>  (id=185)   
                                                prohibitedScorers    ArrayList<E>  (id=186)   
                                                requiredScorers    ArrayList<E>  (id=187)   
                                                similarity    DefaultSimilarity  (id=64)   
                                     similarity    DefaultSimilarity  (id=64)   
                                     this$0    BooleanScorer2  (id=50)   
                          similarity    null   
                 similarity    null    
       minNrShouldMatch    0   
       optionalScorers    ArrayList<E>  (id=55)   
       prohibitedScorers    ArrayList<E>  (id=60)   
       requiredScorers    ArrayList<E>  (id=63)   
       similarity    DefaultSimilarity  (id=64)   
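
In the tree above, the inner BooleanScorer2 (id=82) holds a ReqExclScorer whose reqScorer wraps the ConstantScore(contents:apple*) scorer and whose exclDisi is the TermScorer for contents:boy. The sketch below shows the idea behind that node: walk the required posting list and drop any document that also appears in the excluded list. It is a simplified illustration over plain sorted int arrays, not Lucene's actual ReqExclScorer code; the class name, method name, and document IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ReqExclSketch {

    // Returns the doc IDs in `required` that do not appear in `excluded`.
    // Both arrays are assumed to be sorted ascending, like posting lists.
    static List<Integer> reqExcl(int[] required, int[] excluded) {
        List<Integer> hits = new ArrayList<>();
        int e = 0; // cursor into the excluded list; it only moves forward
        for (int doc : required) {
            // Skip excluded docs that are smaller than the current required doc.
            while (e < excluded.length && excluded[e] < doc) {
                e++;
            }
            // Keep the doc only if the excluded list does not contain it.
            if (e == excluded.length || excluded[e] != doc) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] applePrefix = {0, 2, 3, 5}; // hypothetical docs matching contents:apple*
        int[] boy = {2, 5, 7};            // hypothetical docs matching contents:boy
        System.out.println(reqExcl(applePrefix, boy)); // prints [0, 3]
    }
}
```

Because both cursors only move forward, each list is traversed once.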

Reposted from: http://forfuture1978.iteye.com/blog/632840
