Lucene can produce its own operation log; I just discovered this in the source code. Here is a log file I just generated:
```
IFD [Wed Dec 22 22:08:20 CST 2010; main]: setInfoStream deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@15dfd77
IW 0 [Wed Dec 22 22:08:20 CST 2010; main]: setInfoStream: dir=org.apache.lucene.store.SimpleFSDirectory@G:\package\lucene_test_dir lockFactory=org.apache.lucene.store.NativeFSLockFactory@1027b4d mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@c55e36 mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@1ac3c08 ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=10000 index=
maxFieldLength 10000 reached for field contents, ignoring following tokens
maxFieldLength 10000 reached for field contents, ignoring following tokens
... (this warning line repeats many more times; trimmed here) ...
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: optimize: index now
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: flush: now pause all indexing threads
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: flush: segment=_0 docStoreSegment=_0 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=104 numBufDelTerms=0
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: index before flush
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: DW: flush postings as segment _0 numDocs=104
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: DW: oldRAMSize=2619392 newFlushedSize=1740286 docs/MB=62.663 new/old=66.439%
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushedFiles=[_0.nrm, _0.tis, _0.fnm, _0.tii, _0.frq, _0.prx]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: LMP: findMerges: 1 segments
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: LMP: level 6.2247195 to 6.2380013: 1 segments
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now flush at close
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush: now pause all indexing threads
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush: segment=null docStoreSegment=_0 docStoreOffset=104 flushDocs=false flushDeletes=true flushDocStores=true numDocs=0 numBufDelTerms=0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: index before flush _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush shared docStore segment _0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushDocStores segment=_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: closeDocStores segment=_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=104
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushDocStores files=[_0.fdt, _0.fdx]
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now call final commit()
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: startCommit(): start sizeInBytes=0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: startCommit index=_0:C104->_0 changeCount=3
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.nrm
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.tis
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fnm
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.tii
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.frq
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fdx
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.prx
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fdt
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: done all syncs
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: pendingCommit != null
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: wrote segments file "segments_2"
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_2" [1 segments ; isCommit = true]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: deleteCommits: now decRef commit "segments_1"
IFD [Wed Dec 22 22:08:24 CST 2010; main]: delete "segments_1"
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: done
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: at close: _0:C104->_0
```
Next comes my indexing code; most of it is adapted from the demo that ships with Lucene.
The Indexer class builds the index:
```java
package my.firstest.copy;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Indexer {

    private static File INDEX_DIR = new File("G:/package/lucene_test_dir");
    private static final File docDir = new File("G:/package/lucene_test_docs");

    public static void main(String[] args) throws Exception {
        if (!docDir.exists() || !docDir.canRead()) {
            System.out.println("The directory to be indexed does not exist or is not readable!");
            System.exit(1);
        }

        // Clear out any index files left over from a previous run.
        int fileCount = INDEX_DIR.list().length;
        if (fileCount != 0) {
            System.out.println("Old index files exist, begin to delete these files");
            File[] files = INDEX_DIR.listFiles();
            for (int i = 0; i < fileCount; i++) {
                files[i].delete();
                System.out.println("File " + files[i].getAbsolutePath() + " is deleted!");
            }
        }

        Date start = new Date();
        IndexWriter writer = new IndexWriter(FSDirectory.open(INDEX_DIR),
                new StandardAnalyzer(Version.LUCENE_CURRENT),
                true, IndexWriter.MaxFieldLength.LIMITED);
        writer.setUseCompoundFile(false);
        //writer.setMergeFactor(2);

        // Route IndexWriter's internal log output into a file of our own.
        writer.setInfoStream(new PrintStream(new File("G:/package/lucene_test_log/log.txt")));

        System.out.println("MergeFactor -> " + writer.getMergeFactor());
        System.out.println("maxMergeDocs -> " + writer.getMaxMergeDocs());

        indexDocs(writer, docDir);
        writer.optimize();
        writer.close();

        Date end = new Date();
        System.out.println("takes " + (end.getTime() - start.getTime()) + " milliseconds");
    }

    protected static void indexDocs(IndexWriter writer, File file) throws IOException {
        if (file.canRead()) {
            if (file.isDirectory()) {
                String[] files = file.list();
                if (files != null) {
                    for (int i = 0; i < files.length; i++) {
                        indexDocs(writer, new File(file, files[i]));
                    }
                }
            } else {
                System.out.println("adding " + file);
                try {
                    writer.addDocument(FileDocument.Document(file));
                } catch (FileNotFoundException fnfe) {
                    // The file may have disappeared between listing and reading; skip it.
                }
            }
        }
    }
}
```
FileDocument:
```java
package my.firstest.copy;

import java.io.File;
import java.io.FileReader;

import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class FileDocument {

    public static Document Document(File f) throws java.io.FileNotFoundException {
        Document doc = new Document();
        // Store the path and last-modified time as untokenized fields.
        doc.add(new Field("path", f.getPath(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("modified",
                DateTools.timeToString(f.lastModified(), DateTools.Resolution.MINUTE),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        // The file contents are tokenized and indexed, but not stored.
        doc.add(new Field("contents", new FileReader(f)));
        return doc;
    }

    private FileDocument() {
    }
}
```
The key line is writer.setInfoStream(new PrintStream(new File("G:/package/lucene_test_log/log.txt")));
In Lucene's source code, many places are sprinkled with checks like this:
```java
if (infoStream != null) {
    message("init: hit exception on init; releasing write lock");
}
```
And the message method looks like this:
```java
public void message(String message) {
    if (infoStream != null)
        infoStream.println("IW " + messageID + " [" + new Date() + "; "
                + Thread.currentThread().getName() + "]: " + message);
}
```
The infoStream here is a field of IndexWriter:
```java
private PrintStream infoStream = null;
```
If you never set this field, it stays null and no log output is produced.
You can set it with writer.setInfoStream(PrintStream infoStream);
once it is set, the log messages are written automatically to the file you specified.
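
For quick experiments any PrintStream will do, including System.out. Below is a minimal sketch, assuming the same Lucene 3.0-era API used in the Indexer above; the class name and paths are placeholders, not part of the original code. It turns the logging on, indexes one in-memory document, and then closes the PrintStream explicitly, since the writer does not close it for you:

```java
package my.firstest.copy;

import java.io.File;
import java.io.PrintStream;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class InfoStreamDemo {

    public static void main(String[] args) throws Exception {
        // Hypothetical paths; adjust them to your own environment.
        File indexDir = new File("G:/package/lucene_test_dir");
        PrintStream infoStream = new PrintStream(new File("G:/package/lucene_test_log/demo_log.txt"));

        IndexWriter writer = new IndexWriter(FSDirectory.open(indexDir),
                new StandardAnalyzer(Version.LUCENE_CURRENT),
                true, IndexWriter.MaxFieldLength.LIMITED);

        // From this point on, the internal messages (the IW/IFD lines shown
        // in the log above) go to our stream instead of being discarded.
        writer.setInfoStream(infoStream);

        Document doc = new Document();
        doc.add(new Field("contents", "hello lucene info stream",
                Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);

        writer.optimize();
        writer.close();

        // The writer does not own the stream, so flush and close it ourselves.
        infoStream.close();
    }
}
```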