Lucene: Introduction to Lucene (Part IV)

DavyJones2010


1. The lifecycle of IndexReader and IndexWriter

1) Opening and closing an IndexReader or IndexWriter is expensive.

2) This is especially true for IndexReader, whose open operation takes considerable time.

3) So by convention, a single IndexReader instance is shared across the application (a singleton).

package edu.xmu.lucene.Lucene_ModuleOne;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Searches the index through a single IndexReader that is shared by all
 * searches.
 */
public class App
{

	private Directory dir = null;
	private static IndexReader reader = null;

	public App()
	{
		try
		{
			dir = FSDirectory.open(new File("E:/LuceneIndex"));
			reader = IndexReader.open(dir);
		} catch (IOException e)
		{
			e.printStackTrace();
		}
	}

	public IndexSearcher getSearcher()
	{
		IndexSearcher searcher = new IndexSearcher(reader);
		return searcher;
	}

	/**
	 * Runs a TermQuery for name:Davy and prints the stored fields of the
	 * top 10 hits.
	 */
	public void search()
	{
		IndexSearcher searcher = getSearcher();
		TermQuery query = new TermQuery(new Term("name", "Davy"));
		TopDocs topDocs;
		try
		{
			topDocs = searcher.search(query, 10);
			for (ScoreDoc scoreDoc : topDocs.scoreDocs)
			{
				Document doc = searcher.doc(scoreDoc.doc);
				String score = doc.get("score");
				String date = doc.get("date");
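				// The boost is not stored with the document, so a document
				// loaded from the index may not report its index-time boost.
				float boost = doc.getBoost();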
				System.out.println("Score = " + score + ", Date = " + date
						+ ", Boost = " + boost);
			}
		} catch (IOException e)
		{
			e.printStackTrace();
		} finally
		{
			try
			{
				// We only have to close searcher and don't have to close
				// reader.
				searcher.close();
			} catch (IOException e)
			{
				e.printStackTrace();
			}
		}
	}
}

2. Other questions:

1) Only one reader is created for the whole application.

2) Will that reader detect changes whenever an IndexWriter modifies the index files?

--> No. An update or delete operation does not affect a reader that was opened before the operation was executed.

package edu.xmu.lucene.Lucene_ModuleOne;

import org.junit.Test;

/**
 * Unit test for simple App.
 */
public class AppTest
{
	@Test
	public void testSearch()
	{
		App app = new App();
		for (int i = 0; i < 5; i++)
		{
			app.search();
		}
	}
}

Comments: If we run the test case above, the number of search calls makes no difference. Only one reader is created during the whole run, and no additional index files are created.

3) How can we enable the reader to detect index changes during its whole lifecycle?
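The point-in-time behaviour described in 2) can be modelled without Lucene. The following is only a toy sketch in plain Java: SnapshotIndex, SnapshotReader and SnapshotDemo are hypothetical names invented for illustration and are not part of the Lucene API. A reader captures the committed state at the moment it is opened, so a later commit is visible only to a reader opened afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy stand-in for an index. NOT Lucene code; every name here is hypothetical.
class SnapshotIndex {
    private List<String> committed = Collections.emptyList();

    // Publishes a brand-new immutable list instead of mutating the old one,
    // so readers that already hold the old list are unaffected.
    synchronized void addAndCommit(String doc) {
        List<String> next = new ArrayList<String>(committed);
        next.add(doc);
        committed = Collections.unmodifiableList(next);
    }

    // A reader keeps whatever was committed at the moment it was opened.
    synchronized SnapshotReader openReader() {
        return new SnapshotReader(committed);
    }
}

class SnapshotReader {
    private final List<String> docs;

    SnapshotReader(List<String> docs) {
        this.docs = docs;
    }

    int numDocs() {
        return docs.size();
    }
}

class SnapshotDemo {
    // Returns {docs seen by the old reader, docs seen by the new reader}.
    static int[] run() {
        SnapshotIndex index = new SnapshotIndex();
        index.addAndCommit("Davy");
        SnapshotReader oldReader = index.openReader();
        index.addAndCommit("Jones"); // committed after oldReader was opened
        SnapshotReader newReader = index.openReader();
        return new int[] { oldReader.numDocs(), newReader.numDocs() };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("old reader sees " + counts[0]
                + ", new reader sees " + counts[1]);
    }
}
```

In this model the old reader keeps reporting one document while the newly opened reader reports two, which is the same effect the note above describes for a long-lived IndexReader.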

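	// Also requires: import org.apache.lucene.index.CorruptIndexException;
	public IndexSearcher getSearcher()
	{

		try
		{
			if (reader == null)
			{
				reader = IndexReader.open(dir);
			} else
			{
				// Call openIfChanged only once and keep its result. It
				// returns null when the index has not changed; calling it a
				// second time would open (and leak) another reader.
				IndexReader newReader = IndexReader.openIfChanged(reader);
				if (null != newReader)
				{
					// The previous reader is not closed here; see comment 5)
					// below.
					reader = newReader;
				}
			}
		} catch (CorruptIndexException e)
		{
			e.printStackTrace();
		} catch (IOException e)
		{
			e.printStackTrace();
		}

		IndexSearcher searcher = new IndexSearcher(reader);
		return searcher;
	}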
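The refresh logic can also be sketched independently of Lucene. This is a toy model in plain Java: VersionedReader, ReaderHolder and RefreshDemo are hypothetical names, and the version number passed in stands in for "has the index changed". It shows the check being made once per call, the result being kept, and the stale reader being closed, under the assumption that no search is still running on it.

```java
import java.io.Closeable;

// Toy model only: these classes are hypothetical stand-ins, not Lucene API.
class VersionedReader implements Closeable {
    final long version; // index version this reader was opened on
    private boolean closed = false;

    VersionedReader(long version) {
        this.version = version;
    }

    // Same contract as described for openIfChanged in this post:
    // a new reader if the index has moved on, otherwise null.
    static VersionedReader openIfChanged(VersionedReader old, long currentIndexVersion) {
        return old.version == currentIndexVersion
                ? null
                : new VersionedReader(currentIndexVersion);
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

class ReaderHolder {
    private VersionedReader reader;

    // Returns an up-to-date reader, reopening at most once per call.
    synchronized VersionedReader get(long currentIndexVersion) {
        if (reader == null) {
            reader = new VersionedReader(currentIndexVersion);
            return reader;
        }
        VersionedReader newer = VersionedReader.openIfChanged(reader, currentIndexVersion);
        if (newer != null) {
            VersionedReader stale = reader;
            reader = newer;
            // Assumption: no search is still using the stale reader.
            stale.close();
        }
        return reader;
    }
}

class RefreshDemo {
    public static void main(String[] args) {
        ReaderHolder holder = new ReaderHolder();
        VersionedReader first = holder.get(1);
        VersionedReader same = holder.get(1);   // index unchanged
        VersionedReader second = holder.get(2); // index changed
        System.out.println("reused while unchanged: " + (first == same));
        System.out.println("replaced after change: " + (first != second));
        System.out.println("stale reader closed: " + first.isClosed());
    }
}
```

Closing the stale reader immediately is only safe in a single-threaded setting such as the test in this post. With concurrent searches, the close has to wait until the last search on the old reader has finished.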

Comments:

1) In class IndexReader: static IndexReader openIfChanged(IndexReader oldReader)

-> If the index has changed since the provided reader was opened, open and return a new reader; otherwise return null.

2) By using this method, we can refresh the reader whenever the index has changed.

3) By convention, there is only one IndexReader during the whole lifecycle of the application.

4) In some applications, the writer is required to be a singleton as well.

-> So how can we commit the changes we made to the index if we cannot close the writer?

-> Use writer.commit();

-> If we don't commit, the index files do not change, and the modifications made through the IndexWriter stay pending and are not visible to readers.

5) As we can see, IndexReader.openIfChanged creates a new IndexReader whenever the index has changed.

-> So what about the old reader? openIfChanged does not close it, so we have to close it ourselves once no search is using it any more; otherwise the resources it holds are never released.

6) Deletion can be done in two ways.

-> We can use not only writer.deleteDocuments(new Term(key, value)); but also reader.deleteDocuments(new Term(key, value));

-> Remember that the easiest way for a reader to commit is reader.close();

-> So what is the difference between the two approaches?

-> By default, a reader is read-only. Use IndexReader.open(dir, false) to make it writable.

-> If we use a reader to modify the index, the reader has to acquire the same index write lock that an IndexWriter uses, so it cannot modify an index on which an IndexWriter is currently open.

-> The benefit is that the modification is visible to that reader immediately, so we don't have to call IndexReader.openIfChanged(IndexReader reader);

-> But it is somewhat harder to commit changes through a reader.

-> So by convention, we don't use this approach.
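The role of commit() in point 4) can be illustrated with a small toy model. This is plain Java, not Lucene: BufferedToyWriter and CommitDemo are hypothetical names, and the model only assumes what the notes above say, namely that changes stay pending until they are committed and that the writer can stay open afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a hypothetical stand-in, not Lucene's IndexWriter.
class BufferedToyWriter {
    private final List<String> pending = new ArrayList<String>();
    private final List<String> committed = new ArrayList<String>();

    // Buffers the document; nothing becomes visible yet.
    void addDocument(String doc) {
        pending.add(doc);
    }

    // Makes all buffered documents visible; the writer remains usable.
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // What a freshly opened reader would be able to see.
    int committedCount() {
        return committed.size();
    }
}

class CommitDemo {
    public static void main(String[] args) {
        BufferedToyWriter writer = new BufferedToyWriter();

        writer.addDocument("Davy");
        System.out.println("before commit: " + writer.committedCount());

        writer.commit();
        System.out.println("after commit: " + writer.committedCount());

        // The same writer instance keeps working after a commit,
        // which is what makes a singleton writer practical.
        writer.addDocument("Jones");
        writer.commit();
        System.out.println("after second commit: " + writer.committedCount());
    }
}
```

The design point is that commit, unlike close, publishes the changes without ending the writer's lifecycle.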

2013-05-23 08:47