The meaning of several Lucene classes

wutao8818




		IndexSearcher searcher = new IndexSearcher(path);
		// The field to search and the value to look for
		Term t = new Term(searchType, searchKey);
		// Build a query from the term
		Query q = new TermQuery(t);
		// TermDocs enumerates <document, frequency> pairs for the term
		TermDocs termDocs = searcher.getIndexReader().termDocs(t);

		while (termDocs.next()) {
			// Number of times searchKey occurs in this document
			System.out.println(termDocs.freq());
			// The document that contains searchKey
			System.out.println(searcher.getIndexReader().document(termDocs.doc()));
		}
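
To make the <document, frequency> idea concrete, here is a minimal plain-Java sketch of the pairs such an enumeration walks over. It is a toy model and not Lucene's implementation: the class ToyPostings and its method postingsFor are hypothetical names, document numbers are simply list indexes, and text is split on whitespace, which is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Lucene code): for one term, collect the
// <document number, frequency> pairs that a TermDocs-style loop would visit.
class ToyPostings {

    // Returns docId -> number of occurrences of term, skipping documents
    // that do not contain the term. Document numbers are list indexes.
    static Map<Integer, Integer> postingsFor(String term, List<String> docs) {
        Map<Integer, Integer> postings = new LinkedHashMap<Integer, Integer>();
        for (int docId = 0; docId < docs.size(); docId++) {
            int freq = 0;
            // Assumption: whitespace tokenization, exact case-sensitive match.
            for (String token : docs.get(docId).split("\\s+")) {
                if (token.equals(term)) {
                    freq++;
                }
            }
            if (freq > 0) {
                postings.put(docId, freq);
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<String>();
        docs.add("Dell laptop Dell monitor");
        docs.add("generic keyboard");
        docs.add("Dell mouse");
        // Mirrors the while (termDocs.next()) loop: one line per matching document.
        for (Map.Entry<Integer, Integer> e : postingsFor("Dell", docs).entrySet()) {
            System.out.println("doc=" + e.getKey() + " freq=" + e.getValue());
        }
    }
}
```

In this sketch the second document never appears in the result, just as termDocs.next() only stops at documents that contain the term.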


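Field

Store.COMPRESS // stored in compressed form
Store.YES // stored
Store.NO // not stored

Index.NO // not indexed
Index.TOKENIZED // tokenized by the Analyzer, then indexed (named ANALYZED from Lucene 2.4)
Index.UN_TOKENIZED // indexed as a single term without tokenizing (named NOT_ANALYZED from Lucene 2.4)
Index.NO_NORMS // indexed without an Analyzer, and without storing norms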
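
The difference between the Index options is easiest to see in the terms each one produces. The sketch below is a toy model, not Lucene code: ToyIndexOptions, Mode, and indexedTerms are hypothetical names, and the stand-in "analyzer" just lowercases and splits on whitespace, which is an assumption and much simpler than a real Analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model (hypothetical, not Lucene code) of which terms become searchable
// for one field value under the different Index options.
class ToyIndexOptions {

    enum Mode { NO, TOKENIZED, UN_TOKENIZED }

    static List<String> indexedTerms(String value, Mode mode) {
        switch (mode) {
            case TOKENIZED:
                // Assumption: the "analyzer" lowercases and splits on whitespace.
                return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
            case UN_TOKENIZED:
                // The whole value becomes one term, exactly as given.
                return Collections.singletonList(value);
            default:
                // Mode.NO: nothing is searchable for this field.
                return new ArrayList<String>();
        }
    }

    public static void main(String[] args) {
        String value = "Dell Latitude E6400";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + indexedTerms(value, mode));
        }
    }
}
```

Under these assumptions an exact term lookup for "Dell" only matches when the whole field value is "Dell" and the field is untokenized; once a field is tokenized and lowercased, the term to look up would be "dell".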


IndexWriter(String path,Analyzer a,boolean create);

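IndexWriter has three constructors of this form. The first parameter means much the same thing in each of them: where the index lives. The third parameter is the one to be careful with. Set it to true when the index is created for the first time, and to false for every later update; otherwise the index that was built earlier is deleted!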
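
The effect of the create flag can be sketched without Lucene at all. In the toy model below an "index" is just a directory of files; ToyCreateFlag, open, and demo are hypothetical names, and the file name segment_1 is made up for illustration. It only mimics the visible outcome described above, not how IndexWriter actually replaces its segments.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not Lucene code): an "index" is a directory of
// files, and opening it with create == true empties that directory.
class ToyCreateFlag {

    // Opens the toy index and returns how many files it holds afterwards.
    static int open(Path indexDir, boolean create) {
        try {
            List<Path> existing = new ArrayList<Path>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir)) {
                for (Path file : stream) {
                    existing.add(file);
                }
            }
            if (create) {
                // create == true: throw away whatever was indexed before.
                for (Path file : existing) {
                    Files.delete(file);
                }
                return 0;
            }
            // create == false: existing files are kept so new data is appended.
            return existing.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-file index, reopens it both ways, and returns the file
    // counts seen with create == false and then with create == true.
    static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("toy-index");
            Files.createFile(dir.resolve("segment_1"));
            int keptWhenFalse = open(dir, false);
            int keptWhenTrue = open(dir, true);
            Files.delete(dir); // the directory is empty again at this point
            return new int[] { keptWhenFalse, keptWhenTrue };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("create=false keeps " + counts[0] + " file(s)");
        System.out.println("create=true keeps " + counts[1] + " file(s)");
    }
}
```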



		Directory ram = new RAMDirectory();
		IndexWriter writer = new IndexWriter(ram, new StandardAnalyzer(), true,
				MaxFieldLength.UNLIMITED);

		List<String> valueList = new ArrayList<String>();
		valueList.add("Dell");
		valueList.add("eDll");
		valueList.add("ledl");
		valueList.add("llDe");

		for (int i = 0; i < 100; i++) {
			Document doc = new Document();
			// Pick one of the four values at random
			Field field = new Field("name", valueList.get((int) (Math.random() * 4)),
					Field.Store.NO, Field.Index.NOT_ANALYZED);
			doc.add(field);
			writer.addDocument(doc);
			System.out.println(doc);
		}
		writer.close();

		// Searcherd is the author's own helper class (not part of Lucene, not shown here)
		Searcherd searcherd = new Searcherd();
		searcherd.setDirectory(ram);
		searcherd.search("name", "Dell");


writer.close();

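Closes the index writer: the data held in the I/O buffers is written to disk and the streams are closed. If the writer is never closed, the index directory contains nothing but the segments file. Forgetting to close leaves the index data stranded in the buffers, never written to disk, and may also leave the directory lock unreleased, so that the next attempt to add to the index fails.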
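
The same buffering behaviour can be observed with the standard library alone. The following is a plain-Java analogy, not Lucene code: BufferedCloseDemo and sizesBeforeAndAfterClose are hypothetical names, and it assumes the text is far smaller than the writer's internal buffer, so nothing is flushed before close().

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Plain-Java analogy (not Lucene code): buffered data reaches the file only
// when the writer is flushed or closed.
class BufferedCloseDemo {

    // Returns {file size before close(), file size after close()} in bytes.
    // Assumption: text is much smaller than the writer's internal buffer.
    static long[] sizesBeforeAndAfterClose(String text) {
        try {
            Path file = Files.createTempFile("toy-index", ".dat");
            try {
                BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
                writer.write(text);             // stays in the in-memory buffer
                long before = Files.size(file); // nothing has reached the file yet
                writer.close();                 // flushes the buffer, then closes the stream
                long after = Files.size(file);
                return new long[] { before, after };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterClose("name:Dell");
        System.out.println("before close: " + sizes[0] + " bytes");
        System.out.println("after close:  " + sizes[1] + " bytes");
    }
}
```

A process that exits between write() and close() would leave the file empty, which is the small-scale version of an index directory that holds only the segments file.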

Write lock

Class org.apache.lucene.index.IndexWriter

Every IndexWriter constructor calls this method.

  private void init(Directory d, Analyzer a, final boolean create, boolean closeDir, 
                    IndexDeletionPolicy deletionPolicy, boolean autoCommit, int maxFieldLength,
                    IndexingChain indexingChain, IndexCommit commit)
    throws CorruptIndexException, LockObtainFailedException, IOException {
    this.closeDir = closeDir;
    directory = d;
    analyzer = a;
    setMessageID(defaultInfoStream);
    this.maxFieldLength = maxFieldLength;

    if (indexingChain == null)
      indexingChain = DocumentsWriter.DefaultIndexingChain;

    if (create) {
      // Clear the write lock in case it's leftover:
      directory.clearLock(WRITE_LOCK_NAME);
    }


    Lock writeLock = directory.makeLock(WRITE_LOCK_NAME);

    if (!writeLock.obtain(writeLockTimeout)) // obtain write lock
      throw new LockObtainFailedException("Index locked for write: " + writeLock);
    this.writeLock = writeLock;                   // save it

    try {
      if (create) {
        // Try to read first.  This is to allow create
        // against an index that's currently open for
        // searching.  In this case we write the next
        // segments_N file with no segments:
        boolean doCommit;
        try {
          segmentInfos.read(directory);
          segmentInfos.clear();
          doCommit = false;
        } catch (IOException e) {
          // Likely this means it's a fresh directory
          doCommit = true;
        }

        if (autoCommit || doCommit) {
          // Always commit if autoCommit=true, else only
          // commit if there is no segments file in this dir
          // already.
          segmentInfos.commit(directory);
          synced.addAll(segmentInfos.files(directory, true));
        } else {
          // Record that we have a change (zero out all
          // segments) pending:
          changeCount++;
        }
      } else {
        segmentInfos.read(directory);

        if (commit != null) {
          // Swap out all segments, but, keep metadata in
          // SegmentInfos, like version & generation, to
          // preserve write-once.  This is important if
          // readers are open against the future commit
          // points.
          if (commit.getDirectory() != directory)
            throw new IllegalArgumentException("IndexCommit's directory doesn't match my directory");
          SegmentInfos oldInfos = new SegmentInfos();
          oldInfos.read(directory, commit.getSegmentsFileName());
          segmentInfos.replace(oldInfos);
          changeCount++;
          if (infoStream != null)
            message("init: loaded commit \"" + commit.getSegmentsFileName() + "\"");
        }

        // We assume that this segments_N was previously
        // properly sync'd:
        synced.addAll(segmentInfos.files(directory, true));
      }

      this.autoCommit = autoCommit;
      setRollbackSegmentInfos(segmentInfos);

      docWriter = new DocumentsWriter(directory, this, indexingChain);
      docWriter.setInfoStream(infoStream);
      docWriter.setMaxFieldLength(maxFieldLength);

      // Default deleter (for backwards compatibility) is
      // KeepOnlyLastCommitDeleter:
      deleter = new IndexFileDeleter(directory,
                                     deletionPolicy == null ? new KeepOnlyLastCommitDeletionPolicy() : deletionPolicy,
                                     segmentInfos, infoStream, docWriter);

      if (deleter.startingCommitDeleted)
        // Deletion policy deleted the "head" commit point.
        // We have to mark ourself as changed so that if we
        // are closed w/o any further changes we write a new
        // segments_N file.
        changeCount++;

      pushMaxBufferedDocs();

      if (infoStream != null) {
        message("init: create=" + create);
        messageState();
      }

    } catch (IOException e) {
      this.writeLock.release();
      this.writeLock = null;
      throw e;
    }
  }
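
The locking steps in init (clear a leftover lock when creating, obtain the lock or fail, release it on error) can be illustrated with a lock that is held by atomically creating a file. This is a toy model, not Lucene's lock classes: ToyWriteLock and demo are hypothetical names, there is no timeout, and whether the Directory in use locks this way is an assumption. Only the file name write.lock is borrowed from WRITE_LOCK_NAME.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Toy model (hypothetical, not Lucene code) of a write lock that is held by
// atomically creating a lock file inside the index directory.
class ToyWriteLock {
    private final File lockFile;

    ToyWriteLock(File indexDir) {
        this.lockFile = new File(indexDir, "write.lock");
    }

    // Succeeds only if no lock file existed; createNewFile() is atomic.
    boolean obtain() {
        try {
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes the lock file so that another writer can obtain the lock.
    void release() {
        lockFile.delete();
    }

    // Returns the outcome of three obtain() attempts: a first writer, a second
    // writer while the first still holds the lock, and the second writer again
    // after the first has released it.
    static boolean[] demo() {
        try {
            File dir = Files.createTempDirectory("toy-index").toFile();
            ToyWriteLock first = new ToyWriteLock(dir);
            ToyWriteLock second = new ToyWriteLock(dir);
            boolean firstGotIt = first.obtain();
            boolean secondWhileHeld = second.obtain();
            first.release();
            boolean secondAfterRelease = second.obtain();
            second.release();
            dir.delete();
            return new boolean[] { firstGotIt, secondWhileHeld, secondAfterRelease };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first writer:         " + r[0]);
        System.out.println("second while held:    " + r[1]);
        System.out.println("second after release: " + r[2]);
    }
}
```

The middle attempt is the situation in which init throws LockObtainFailedException, and it is also what a writer that was never closed leaves behind for the next one.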


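Limiting the number of terms per Field

Because all the content is held in memory before it is analyzed and stored, very large inputs can exhaust memory.
Set the limit with IndexWriter.setMaxFieldLength. It takes effect immediately.

writer.addDocument
writer.setMaxFieldLength
writer.addDocument

The new limit already applies on the third line.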
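
The "takes effect immediately" behaviour follows naturally if the limit is read each time a document is added. The sketch below is a toy model, not Lucene code: ToyLimitedWriter and termsOf are hypothetical names, and fields are split on whitespace, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, not Lucene code): a writer that keeps at most
// maxFieldLength terms per field and reads the limit on every addDocument.
class ToyLimitedWriter {
    private int maxFieldLength = Integer.MAX_VALUE;
    private final List<List<String>> indexed = new ArrayList<List<String>>();

    void setMaxFieldLength(int maxFieldLength) {
        this.maxFieldLength = maxFieldLength;
    }

    void addDocument(String fieldValue) {
        // Assumption: whitespace tokenization stands in for the Analyzer.
        List<String> terms = Arrays.asList(fieldValue.trim().split("\\s+"));
        // Terms beyond the current limit are silently dropped.
        int keep = Math.min(terms.size(), maxFieldLength);
        indexed.add(new ArrayList<String>(terms.subList(0, keep)));
    }

    // Terms that were kept for the document with the given number.
    List<String> termsOf(int docId) {
        return indexed.get(docId);
    }

    public static void main(String[] args) {
        ToyLimitedWriter writer = new ToyLimitedWriter();
        writer.addDocument("a b c d");   // indexed with the old limit
        writer.setMaxFieldLength(2);
        writer.addDocument("a b c d");   // already indexed with the new limit
        System.out.println(writer.termsOf(0));
        System.out.println(writer.termsOf(1));
    }
}
```

Documents added before the call are not re-truncated in this sketch; only later additions see the new limit.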

2009-05-05 11:25