3.3-2011-7
Highlights of the Lucene release include:
- The spellchecker module now includes suggest/auto-complete
functionality, with three implementations: Jaspell, Ternary Trie, and
Finite State.
- Support for merging results from multiple shards, for both "normal"
search results (TopDocs.merge) and grouped results using the
grouping module (SearchGroup.merge, TopGroups.merge).
- An optimized implementation of KStem, a less aggressive stemmer
for English.
- Single-pass grouping implementation based on block document indexing.
- Improvements to MMapDirectory (now also the default implementation
returned by FSDirectory.open on 64-bit Linux).
- NRTManager simplifies handling near-real-time search with multiple
search threads, allowing the application to control which indexing
changes must be visible to which search requests.
- TwoPhaseCommitTool facilitates performing a multi-resource
two-phase commit, including IndexWriter.
- The default merge policy, TieredMergePolicy, has a new method
(set/getReclaimDeletesWeight) to control how aggressively it
targets segments with deletions, and is now more aggressive than
before by default.
- PKIndexSplitter tool splits an index by a mid-point term.
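The shard-merging item above can be illustrated with a small sketch. It is plain Java and does not call Lucene: the Hit and Cursor types and the merge method are hypothetical names made up here, and the code only shows the general idea of combining per-shard top hits (each list already sorted by descending score) into one global top-n list. How TopDocs.merge actually does this may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Plain-Java sketch of merging per-shard top hits; not Lucene's TopDocs.merge. */
class ShardMergeSketch {

    /** A hit as reported by one shard (hypothetical type for this sketch). */
    static final class Hit {
        final int shard;   // index of the shard that produced the hit
        final int doc;     // shard-local document ID
        final float score; // relevance score

        Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }

        @Override
        public String toString() {
            return "shard" + shard + ":doc" + doc + "(" + score + ")";
        }
    }

    /** Cursor over one shard's hits, which must already be sorted by descending score. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit current() {
            return hits.get(pos);
        }
    }

    /** Returns the global top-n hits across all shards, best first. */
    static List<Hit> merge(int n, List<List<Hit>> shards) {
        // Order cursors by their current hit: higher score first, ties by lower shard index.
        PriorityQueue<Cursor> queue = new PriorityQueue<>((a, b) -> {
            int byScore = Float.compare(b.current().score, a.current().score);
            return byScore != 0 ? byScore : Integer.compare(a.current().shard, b.current().shard);
        });
        for (List<Hit> shardHits : shards) {
            if (!shardHits.isEmpty()) {
                queue.add(new Cursor(shardHits));
            }
        }
        List<Hit> merged = new ArrayList<>();
        while (merged.size() < n && !queue.isEmpty()) {
            Cursor best = queue.poll();
            merged.add(best.current());
            if (++best.pos < best.hits.size()) {
                queue.add(best); // re-insert the cursor with its next hit
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> shard0 = Arrays.asList(new Hit(0, 7, 3.0f), new Hit(0, 2, 1.0f));
        List<Hit> shard1 = Arrays.asList(new Hit(1, 4, 2.5f), new Hit(1, 9, 2.0f));
        // prints [shard0:doc7(3.0), shard1:doc4(2.5), shard1:doc9(2.0)]
        System.out.println(merge(3, Arrays.asList(shard0, shard1)));
    }
}
```

Each merged hit keeps its shard index, so the caller can still tell which shard a shard-local document ID belongs to.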
3.2-2011-6
- A new grouping module, under lucene/contrib/grouping, enables
search results to be grouped by a single-valued indexed field
(so grouping only arrived in this version).
- A new IndexUpgrader tool fully converts an old index to the
current format.
- A new Directory implementation, NRTCachingDirectory, caches small
segments in RAM, to reduce the I/O load for applications with fast
NRT reopen rates.
- A new Collector implementation, CachingCollector, is able to
gather search hits (document IDs and optionally also scores) and
then replay them. This is useful for Collectors that require two
or more passes to produce results.
- Index a document block using IndexWriter's new addDocuments or
updateDocuments methods. These experimental APIs ensure that the
block of documents will forever remain contiguous in the index,
enabling interesting future features like grouping and joins.
- A new default merge policy, TieredMergePolicy, which is more
efficient due to being able to merge non-contiguous (that is,
non-adjacent) segments. See http://s.apache.org/merging for details.
- NumericField is now returned correctly when you load a stored
document (previously you received a normal Field back, with the
numeric value converted to a string).
- Deleted terms are now applied during flushing to the newly flushed
segment, which is more efficient than having to later initialize a
reader for that segment.
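The collect-then-replay idea behind the CachingCollector item above can be sketched as follows. This is plain Java with no Lucene dependency: HitConsumer and CachingConsumer are hypothetical stand-ins invented for this sketch (Lucene's real Collector interface has more methods), and unlike the real class this version always caches scores.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of caching search hits and replaying them; not Lucene's CachingCollector. */
class CachingCollectorSketch {

    /** Minimal stand-in for something that receives hits (hypothetical interface). */
    interface HitConsumer {
        void collect(int doc, float score);
    }

    /** Records every hit it receives so the same hits can be replayed later. */
    static final class CachingConsumer implements HitConsumer {
        private final List<Integer> docs = new ArrayList<>();
        private final List<Float> scores = new ArrayList<>();

        @Override
        public void collect(int doc, float score) {
            docs.add(doc);
            scores.add(score);
        }

        /** Feeds the cached hits to another consumer, in the original order. */
        void replay(HitConsumer other) {
            for (int i = 0; i < docs.size(); i++) {
                other.collect(docs.get(i), scores.get(i));
            }
        }
    }

    public static void main(String[] args) {
        CachingConsumer cache = new CachingConsumer();
        // Stand-in for running the search once: the hits arrive a single time.
        cache.collect(1, 2.0f);
        cache.collect(5, 4.0f);
        cache.collect(8, 1.0f);

        // Pass 1 over the cached hits: find the maximum score.
        final float[] max = {0f};
        cache.replay((doc, score) -> max[0] = Math.max(max[0], score));

        // Pass 2 over the same hits: print scores normalized by that maximum.
        cache.replay((doc, score) -> System.out.println("doc" + doc + " -> " + (score / max[0])));
    }
}
```

The point of the design is that the search itself runs only once; every later pass reads from memory, at the cost of holding all hits in RAM.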
3.1-2011-3
- ConcurrentMergeScheduler is more careful about setting priority of
merge threads.
- ReusableAnalyzerBase makes it easier to reuse TokenStreams
correctly.
- ConstantScoreQuery now allows directly wrapping a Query.
- IndexWriter is now configured with a new separate builder API,
IndexWriterConfig. You can now control IndexWriter's previously
fixed internal thread limit by calling setMaxThreadStates.
- IndexWriter.getReader is replaced by IndexReader.open(IndexWriter).
- MultiSearcher is deprecated; ParallelMultiSearcher has been
absorbed directly into IndexSearcher.
- New TotalHitCountCollector just counts total number of hits.
- ReaderFinishedListener API enables external caches to evict entries
once a segment is finished.
Grouping was reportedly already implemented in this release, but the
notes still do not mention it...
3.0.3-2010-12
Fixed:
- A memory leak in IndexWriter exacerbated by frequent commits
(which also shows that it is still not very stable).
- NumericRangeQuery / NumericRangeFilter sometimes returning incorrect
results with bounds near Long.MIN_VALUE and Long.MAX_VALUE.
- Various thread safety issues.
3.0.2-2010-6
- Fixed memory leaks in IndexWriter when large documents are indexed.
It also now uses shared memory pools for term vectors and stored fields.
- IndexWriter now releases Fieldables and Readers on close.
- Performance improvements in ParallelMultiSearcher (3.0.2 only).
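Note: the 2.x and 3.x series differ only in the Java version they are compiled for: the former targets JDK 1.4, the latter 5.0.
Recent versions have been released very frequently, so I am not that confident about their stability.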