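In Lucene 3.0.2, Field, Document, and Query all expose a setBoost method. So why does setting a boost on the Query make no difference at all to the search results? Any pointers would be appreciated. The code is as follows: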
package com.eric.lucene;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ScoreSortTest {
    public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                true, IndexWriter.MaxFieldLength.LIMITED);

        Document doc1 = new Document();
        Document doc2 = new Document();
        Document doc3 = new Document();
        doc1.add(new Field("bookname", "thinking in java", Field.Store.YES, Field.Index.ANALYZED));
        doc2.add(new Field("bookname", "thinking in java java java", Field.Store.YES, Field.Index.ANALYZED));
        doc3.add(new Field("bookname", "thinking in c++", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc1);
        writer.addDocument(doc2);
        writer.addDocument(doc3);
        writer.optimize();
        writer.close();

        IndexSearcher searcher = new IndexSearcher(dir);
        Query query = new TermQuery(new Term("bookname", "java"));
        // query.setBoost(2);
        TopScoreDocCollector collector = TopScoreDocCollector.create(100, false);
        searcher.search(query, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document doc = searcher.doc(hits[i].doc);
            System.out.println(doc.getBoost());
            System.out.print(doc.get("bookname") + "\t\t");
            System.out.println(hits[i].score);
            System.out.println(searcher.explain(query, hits[i].doc));
        }
    }
}
Without query.setBoost(2), the output is:
1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)
1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)
With query.setBoost(2), the output is:
1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)
1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)
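One likely explanation for the identical scores is query normalization. As far as I understand the default scoring in Lucene 3.x, each clause's query weight is idf * boost, and every weight is then multiplied by queryNorm = 1 / sqrt(sum of squared weights). With a single TermQuery the boost appears in both the weight and the norm, so it cancels out. The sketch below only reproduces that arithmetic in plain Java. It does not call Lucene, and the class and method names are made up for illustration.

```java
final class QueryNormSketch {

    // Returns the normalized query weight of each clause, assuming the
    // default formula queryNorm = 1 / sqrt(sum of (idf * boost)^2).
    static float[] normalizedWeights(float[] idf, float[] boost) {
        float sumOfSquaredWeights = 0f;
        for (int i = 0; i < idf.length; i++) {
            float w = idf[i] * boost[i];
            sumOfSquaredWeights += w * w;
        }
        float queryNorm = (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
        float[] normalized = new float[idf.length];
        for (int i = 0; i < idf.length; i++) {
            normalized[i] = idf[i] * boost[i] * queryNorm;
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Single term with idf = 1.0, as in the explain output above.
        float unboosted = normalizedWeights(new float[] {1f}, new float[] {1f})[0];
        float boosted = normalizedWeights(new float[] {1f}, new float[] {2f})[0];
        System.out.println("single term, boost 1: " + unboosted);
        System.out.println("single term, boost 2: " + boosted);

        // Two clauses: the boost no longer cancels, it shifts the relative weights.
        float[] two = normalizedWeights(new float[] {1f, 1f}, new float[] {2f, 1f});
        System.out.println("two terms, boosts 2 and 1: " + two[0] + " vs " + two[1]);
    }
}
```

If this reading is right, a query boost only matters relative to other clauses, for example when the boosted TermQuery is one of several clauses in a BooleanQuery. It would also explain why the explain output shows only the fieldWeight part: the query weight is exactly 1.0 in both runs.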
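Boost values set on a Field or a Document, on the other hand, do change the search results. (All documents here use the same field, so I did not try Field.setBoost, but like Document.setBoost it writes the boost value into the index.) The code is as follows: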
package com.eric.lucene;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ScoreSortTest {
    public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                true, IndexWriter.MaxFieldLength.LIMITED);

        Document doc1 = new Document();
        Document doc2 = new Document();
        Document doc3 = new Document();
        doc1.add(new Field("bookname", "thinking in java", Field.Store.YES, Field.Index.ANALYZED));
        doc1.setBoost(4);
        doc2.add(new Field("bookname", "thinking in java java java", Field.Store.YES, Field.Index.ANALYZED));
        doc3.add(new Field("bookname", "thinking in c++", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc1);
        writer.addDocument(doc2);
        writer.addDocument(doc3);
        writer.optimize();
        writer.close();

        IndexSearcher searcher = new IndexSearcher(dir);
        Query query = new TermQuery(new Term("bookname", "java"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(100, false);
        searcher.search(query, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document doc = searcher.doc(hits[i].doc);
            System.out.println(doc.getBoost());
            System.out.print(doc.get("bookname") + "\t\t");
            System.out.println(hits[i].score);
            System.out.println(searcher.explain(query, hits[i].doc));
        }
    }
}
Without the doc1.setBoost(4) line, the output is:
1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)
1.0
thinking in java 0.625
0.625 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.625 = fieldNorm(field=bookname, doc=0)
With the doc1.setBoost(4) line, the output is:
1.0
thinking in java 2.5
2.5 = (MATCH) fieldWeight(bookname:java in 0), product of:
  1.0 = tf(termFreq(bookname:java)=1)
  1.0 = idf(docFreq=2, maxDocs=3)
  2.5 = fieldNorm(field=bookname, doc=0)
1.0
thinking in java java java 0.8660254
0.8660254 = (MATCH) fieldWeight(bookname:java in 1), product of:
  1.7320508 = tf(termFreq(bookname:java)=3)
  1.0 = idf(docFreq=2, maxDocs=3)
  0.5 = fieldNorm(field=bookname, doc=1)
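The fieldNorm values 0.625, 0.5, and 2.5 can be reproduced by hand, which shows where the document boost ends up. The sketch below assumes the default behaviour as I understand it: the stored norm is boost * 1/sqrt(numTerms), squeezed into a single byte with a 3-bit mantissa, and StandardAnalyzer drops the stop word "in", so the two matching documents have 2 and 4 terms. The encode and decode helpers are written from memory of Lucene's SmallFloat class and should be read as an approximation of it rather than a copy. The class and method names are hypothetical.

```java
final class FieldNormSketch {

    // Lossy float -> byte encoding: 3 mantissa bits, zero exponent 15.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallFloat = bits >> (24 - 3);
        int zero = (63 - 15) << 3;
        if (smallFloat < zero) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // underflow
        }
        if (smallFloat >= zero + 0x100) {
            return -1; // overflow: largest representable value
        }
        return (byte) (smallFloat - zero);
    }

    // Inverse of floatToByte315 (up to the precision lost when encoding).
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Norm as it would be read back from the index at search time.
    static float storedFieldNorm(float boost, int numTerms) {
        float raw = boost * (float) (1.0 / Math.sqrt(numTerms));
        return byte315ToFloat(floatToByte315(raw));
    }

    public static void main(String[] args) {
        System.out.println(storedFieldNorm(1f, 2)); // "thinking in java", no boost
        System.out.println(storedFieldNorm(1f, 4)); // "thinking in java java java"
        System.out.println(storedFieldNorm(4f, 2)); // "thinking in java", setBoost(4)
    }
}
```

The three calls yield 0.625, 0.5, and 2.5, matching the explain output. Under these assumptions the boost is folded into the norm byte at index time and is not stored as a separate value, which is consistent with doc.getBoost() printing 1.0 for the retrieved document even though doc1.setBoost(4) was called before indexing.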