
Lucene 3.0.2 Query.setBoost() problem

In Lucene 3.0.2, Field, Document and Query all expose a setBoost method, but why does setting a boost value on the Query make no difference at all in the search results? Could someone more experienced please explain? The code is as follows:
package com.eric.lucene;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ScoreSortTest {
	public static void main(String[] args) throws Exception {
		Directory dir = new RAMDirectory();
		IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), true, IndexWriter.MaxFieldLength.LIMITED);
		
		Document doc1 = new Document();
		Document doc2 = new Document();
		Document doc3 = new Document();
		
		doc1.add(new Field("bookname","thinking in java", Field.Store.YES, Field.Index.ANALYZED));
		doc2.add(new Field("bookname","thinking in java java java", Field.Store.YES, Field.Index.ANALYZED));
		doc3.add(new Field("bookname","thinking in c++", Field.Store.YES, Field.Index.ANALYZED));
		
		writer.addDocument(doc1);
		writer.addDocument(doc2);
		writer.addDocument(doc3);
		
		writer.optimize();
		writer.close();
		
		IndexSearcher searcher = new IndexSearcher(dir);
		Query query = new TermQuery(new Term("bookname","java"));
//		query.setBoost(2);   // this is the line I toggle on/off between the two runs below
		
		TopScoreDocCollector collector = TopScoreDocCollector.create(100, false);
		searcher.search(query, collector);
		
		ScoreDoc[] hits = collector.topDocs().scoreDocs;
		for(int i=0; i<hits.length;i++){
			Document doc = searcher.doc(hits[i].doc);
			System.out.println(doc.getBoost());
			System.out.print(doc.get("bookname") + "\t\t");
			System.out.println(hits[i].score);
			System.out.println(searcher.explain(query, hits[i].doc));
		}
	}
}


Without query.setBoost(2);, the results are as follows:

1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)

1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)

With query.setBoost(2);, the results are as follows:

1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)

1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)
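(My own guess, based on the scoring formula in the Lucene 3.x Similarity javadoc, so please correct me if this is wrong: the score is roughly queryWeight * fieldWeight, where queryWeight = queryBoost * idf * queryNorm and queryNorm = 1 / sqrt(sumOfSquaredWeights). For a query consisting of a single TermQuery clause the boost appears in both queryWeight and queryNorm and cancels out, and the explain output above only prints the fieldWeight part. The boost should only become visible when the query has several clauses whose relative weights it can shift, e.g. something like the untested sketch below, which additionally needs org.apache.lucene.search.BooleanQuery and org.apache.lucene.search.BooleanClause imported:)

		// untested sketch: boost one clause of a BooleanQuery instead of the whole query
		Query javaQuery = new TermQuery(new Term("bookname", "java"));
		javaQuery.setBoost(2);                                   // weight the "java" clause higher
		Query thinkingQuery = new TermQuery(new Term("bookname", "thinking"));

		BooleanQuery bq = new BooleanQuery();
		bq.add(javaQuery, BooleanClause.Occur.SHOULD);           // boosted clause
		bq.add(thinkingQuery, BooleanClause.Occur.SHOULD);       // unboosted clause
		// searching with bq instead of the single TermQuery should make the boost
		// visible in the per-clause weights printed by searcher.explain(bq, hits[i].doc)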



Setting a boost value on a Field or a Document, on the other hand, does change the search results. (Since all the Fields here are identical I did not try the Field boost separately, but it works like the Document boost: both bake the boost value into the index.) The code is as follows:
package com.eric.lucene;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ScoreSortTest {
	public static void main(String[] args) throws Exception {
		Directory dir = new RAMDirectory();
		IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), true, IndexWriter.MaxFieldLength.LIMITED);
		
		Document doc1 = new Document();
		Document doc2 = new Document();
		Document doc3 = new Document();
		
		doc1.add(new Field("bookname","thinking in java", Field.Store.YES, Field.Index.ANALYZED));
		doc1.setBoost(4);   // document boost; this is the line I toggle on/off between the two runs below
		doc2.add(new Field("bookname","thinking in java java java", Field.Store.YES, Field.Index.ANALYZED));
		doc3.add(new Field("bookname","thinking in c++", Field.Store.YES, Field.Index.ANALYZED));
		
		writer.addDocument(doc1);
		writer.addDocument(doc2);
		writer.addDocument(doc3);
		
		writer.optimize();
		writer.close();
		
		IndexSearcher searcher = new IndexSearcher(dir);
		Query query = new TermQuery(new Term("bookname","java"));
		
		TopScoreDocCollector collector = TopScoreDocCollector.create(100, false);
		searcher.search(query, collector);
		
		ScoreDoc[] hits = collector.topDocs().scoreDocs;
		for(int i=0; i<hits.length;i++){
			Document doc = searcher.doc(hits[i].doc);
			System.out.println(doc.getBoost());
			System.out.print(doc.get("bookname") + "\t\t");
			System.out.println(hits[i].score);
			System.out.println(searcher.explain(query, hits[i].doc));
		}
	}
}


Without the doc1.setBoost(4); line, the results are as follows:

1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)

1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)


With the doc1.setBoost(4); line, the results are as follows:

1.0
thinking in java 2.5
2.5 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  2.5 = fieldNorm(field=bookname, doc=0)

1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)
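Comparing the two runs, the Document boost seems to get folded into the field norm that is stored in the index: fieldNorm for doc 0 goes from 0.625 to 0.625 * 4 = 2.5 once doc1.setBoost(4); is in place, while doc.getBoost() still prints 1.0, because (as far as I understand the Document.getBoost() javadoc) the boost value itself is not stored and can no longer be read back once the document has been indexed.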