Lucene Beginner Series (Part 2: Building the Index)


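The documents have now been preprocessed, so we can start using Lucene to work with their content. Using Lucene typically involves the following steps:
(1) Build an index for the content to be processed
(2) Construct a query object
(3) Search the index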
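
To make the three steps concrete before turning to Lucene itself, here is a minimal sketch of a toy inverted index written with the JDK only. It is not Lucene's API: the class ToyInvertedIndex and its methods are hypothetical names made up for this illustration, and the tokenization (lowercasing and splitting on non-word characters) is an assumption that only suits English text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index for illustration only; this is not how Lucene is implemented. */
public class ToyInvertedIndex {

    // Maps each term to the sorted ids of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings =
            new HashMap<String, TreeSet<Integer>>();

    // Step 1: index a document by splitting its text into lowercase terms
    public void addDocument(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            TreeSet<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    // Steps 2 and 3: normalize the query term, then look it up in the index
    public List<Integer> search(String queryTerm) {
        TreeSet<Integer> ids = postings.get(queryTerm.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.emptyList();
        }
        return new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(1, "Lucene builds an index");
        index.addDocument(2, "An index makes search fast");
        System.out.println(index.search("index"));  // prints [1, 2]
        System.out.println(index.search("lucene")); // prints [1]
    }
}
```

The point of the sketch is only the division of labor: the index is built once, and each search is then a lookup against it rather than a scan of the original files.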

package com.heming.lucene.process;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import jeasy.analysis.MMAnalyzer;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

/**
 * Creates an index for the documents
 * 
 * @author 何明
 * 
 */
public class IndexProcesser {

	// Location where the generated index files are stored
	private static final String INDEX_STORE_PATH = "d:\\index";

	// Build the index from the .txt files in inputDir
	public void createIndex(String inputDir) {

		try {

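			// Create an IndexWriter that uses MMAnalyzer for word segmentation;
			// the last argument (true) creates a new index, replacing any existing one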
			IndexWriter writer = new IndexWriter(INDEX_STORE_PATH,
					new MMAnalyzer(), true);

			File filesDir = new File(inputDir);

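			// Get the array of all files to be indexed
			File[] files = filesDir.listFiles();

			// Iterate over the array (listFiles() returns null if inputDir is
			// not a readable directory)
			for (int i = 0; files != null && i < files.length; ++i) {

				// Get the file name
				String fileName = files[i].getName();

				// Index only regular files with a .txt extension; endsWith also
				// copes with names that contain no "."
				if (files[i].isFile() && fileName.endsWith(".txt")) {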

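					// Skip files that could not be read (loadFileToString returns null)
					String content = loadFileToString(files[i]);
					if (content == null) {
						continue;
					}

					// Create a new Document
					Document doc = new Document();

					// Create a Field for the file name: stored, and indexed as a
					// single term without analysis
					Field field = new Field("filename", files[i].getName(),
							Field.Store.YES, Field.Index.NOT_ANALYZED);

					doc.add(field);

					// Create a Field for the file content: not stored, but analyzed,
					// so that the analyzer splits the text into searchable terms
					field = new Field("content", content,
							Field.Store.NO, Field.Index.ANALYZED);

					doc.add(field);

					// Add the Document to the IndexWriter
					writer.addDocument(doc);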

				}

			}

			// Close the IndexWriter
			writer.close();

		} catch (Exception e) {

			e.printStackTrace();

		}

	}

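	/**
	 * Reads the file and returns its entire content as a single String
	 * 
	 * @param file the file to read
	 * @return the file content, or null if the file could not be read
	 */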
	private String loadFileToString(File file) {

		try {

			BufferedReader br = new BufferedReader(new FileReader(file));

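			StringBuilder sb = new StringBuilder();

			String line = br.readLine();

			while (null != line) {

				// readLine() drops the line terminator; add it back so that
				// words on adjacent lines are not joined together
				sb.append(line).append('\n');

				line = br.readLine();

			}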

			br.close();

			return sb.toString();

		} catch (Exception e) {

			e.printStackTrace();

		}

		return null;
	}

	public static void main(String[] args) {

		IndexProcesser processor = new IndexProcesser();

		processor.createIndex("d:\\test");

	}

}
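
The listing relies on two small helper steps: picking out .txt files and reading a file into a String. Below is a standalone sketch of both, using only the JDK. The class name TextFiles, the method names, and the case-insensitive extension check are assumptions made for this example and are not part of the original code. Passing the charset explicitly is also my choice here: FileReader, as used in the listing, reads with the platform default encoding, which may not match the encoding of the text files.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Locale;

/** Hypothetical helpers for illustration; names are not from the original listing. */
public class TextFiles {

    // True for names ending in ".txt", ignoring case; safe for names without a "."
    public static boolean isTxtFile(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".txt");
    }

    // Reads the whole file using the given charset, keeping line breaks
    public static String readAll(File file, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charsetName));
        try {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips the terminator, so append one per line
                sb.append(line).append('\n');
            }
            return sb.toString();
        } finally {
            // Close the reader even if reading fails part-way through
            reader.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(isTxtFile("notes.txt")); // prints true
        System.out.println(isTxtFile("README"));    // prints false
    }
}
```

A call such as TextFiles.readAll(new File("d:\\test\\a.txt"), "UTF-8") would then return the file content; the path and the encoding in that call are placeholders.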

The Lucene version I am using here is 2.4, and word segmentation is done with MMAnalyzer.
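
Why segmentation matters can be shown with a rough sketch in plain Java. In Lucene 2.4 a field indexed with Field.Index.NOT_ANALYZED is kept as one single term, while Field.Index.ANALYZED passes the value through the analyzer first. The class below only imitates that difference: its names are hypothetical, and splitting on whitespace is a crude stand-in for a real analyzer. Chinese text has no spaces between words, so whitespace splitting would not work for it and a segmenting analyzer is needed instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Imitation of analyzed versus un-analyzed terms; not Lucene code. */
public class AnalyzedVsNotAnalyzed {

    // Roughly what an un-analyzed field yields: the whole value as one term
    public static List<String> notAnalyzed(String value) {
        return Arrays.asList(value);
    }

    // Crude stand-in for an analyzer: lowercase, then split on whitespace
    public static List<String> analyzed(String value) {
        return Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    public static void main(String[] args) {
        String content = "Lucene builds an index";
        // A single-word lookup fails when the whole sentence is one term
        System.out.println(notAnalyzed(content).contains("index")); // prints false
        // After splitting, the same lookup succeeds
        System.out.println(analyzed(content).contains("index"));    // prints true
    }
}
```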

2009-06-27 17:44

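Comment 1, lerous, 2010-03-31

It would be even better if the matching Lucene package were provided. The versions available for download online may not match.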
