Full-Text Search - Lucene

jin8000608172

Blog category:

General technology

lucene java full-text search

package com.lucene.helloworld;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.junit.Test;

import com.lucene.utils.LuceneUtils;

public class HelloWorld {
	private String filePath="F:\\tasks\\workspace2\\LucenePro\\luceneDatasource\\jincm_中国.txt";
	private String indexPath="F:\\tasks\\workspace2\\LucenePro\\luceneIndex";
	private Analyzer analyzer=new StandardAnalyzer();
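	@Test
	public void createIndex() throws Exception{
		// Convert the source file into a Lucene Document
		Document doc=LuceneUtils.file2Document(filePath);
		// create=true: build a new index, overwriting any existing one at indexPath
		IndexWriter indexWriter=new IndexWriter(indexPath, analyzer,true,MaxFieldLength.LIMITED);
		try {
			indexWriter.addDocument(doc);
		} finally {
			// Always close the writer so changes are flushed and the write lock is released
			indexWriter.close();
		}
	}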

	@Test
	public void search()throws Exception {
		// Parse the query string against both the "name" and "content" fields
		String queryString="jincm";
		String[] fields={"name","content"};
		QueryParser queryParser=new MultiFieldQueryParser(fields, analyzer);
		Query query=queryParser.parse(queryString);
		IndexSearcher indexSearcher=new IndexSearcher(indexPath);
		try {
			// No filter; return at most the top 10000 hits
			Filter filter=null;
			TopDocs topDocs=indexSearcher.search(query,filter,10000);
			System.out.println("Total matching results: [" + topDocs.totalHits + "]");
			for(ScoreDoc scoreDoc:topDocs.scoreDocs){
				// scoreDoc.doc is the internal document id; use it to load the stored fields
				int docIndex=scoreDoc.doc;
				Document doc=indexSearcher.doc(docIndex);
				System.out.println("<------------------------------");
				System.out.println(doc.get("name"));
				System.out.println(doc.get("content"));
				// "size" was stored via NumberTools.longToString, so this prints the encoded form
				System.out.println(doc.get("size"));
				System.out.println(doc.get("path"));
				System.out.println("------------------------------>");
			}
		} finally {
			// Release the index files held by the searcher
			indexSearcher.close();
		}
	}
}
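
To make the index-then-search flow of the two tests above a bit more concrete, here is a toy in-memory inverted index in plain Java. It is only a conceptual sketch: the class `TinyInvertedIndex` and its methods are hypothetical names made up for this illustration, the tokenizing is a naive lowercase split, and nothing here reflects how Lucene actually stores or scores documents.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch only; a hypothetical class, not part of Lucene.
final class TinyInvertedIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<String, Set<Integer>>();

    // Naive "analysis": lowercase the text and split on anything that is not a letter or digit.
    void addDocument(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue; // split() can yield an empty first token
            }
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    // Look up a single term; returns an empty set when nothing matches.
    Set<Integer> search(String term) {
        Set<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.<Integer>emptySet();
        }
        return Collections.unmodifiableSet(ids);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.addDocument(0, "jincm writes about Lucene");
        index.addDocument(1, "Full-text search with Lucene");
        System.out.println(index.search("jincm"));
        System.out.println(index.search("lucene"));
    }
}
```

The point of the sketch is the shape of the work: indexing pays the tokenizing cost once per document, so a search is a map lookup on the term instead of a scan over every file.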

package com.lucene.utils;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.NumberTools;

public class LuceneUtils {
	public static Document file2Document(String filePath){
		File file=new File(filePath);
		Document doc=new Document();
		doc.add(new Field("name", file.getName(), Store.YES, Index.ANALYZED));
		doc.add(new Field("content",readFileContent(file), Store.YES, Index.ANALYZED));
		doc.add(new Field("size", NumberTools.longToString(file.length()), Store.YES, Index.NOT_ANALYZED));
		doc.add(new Field("path", file.getAbsolutePath(), Store.YES, Index.NOT_ANALYZED));
		return doc;
	}
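	public static String readFileContent(File file){
		BufferedReader reader=null;
		try {
			// Note: InputStreamReader uses the platform default charset here
			reader=new BufferedReader(new InputStreamReader(new FileInputStream(file)));
			StringBuilder content=new StringBuilder();
			for(String line=null;(line=reader.readLine())!=null;){
				// readLine() strips the line terminator; add it back so words on adjacent lines stay apart
				content.append(line).append("\n");
			}
			return content.toString();
		} catch (Exception e) {
			// Keep the original exception as the cause so the failure can be diagnosed
			throw new RuntimeException(e);
		} finally {
			if(reader!=null){
				try {
					reader.close();
				} catch (Exception ignored) {
					// Nothing useful can be done if closing fails
				}
			}
		}
	}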
}
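
A note on the "size" field: the utility stores the file length through `NumberTools.longToString` instead of a plain `Long.toString`, so that the indexed strings sort in the same order as the numbers they stand for. The sketch below illustrates that idea with zero-padded decimal strings. It is a simplified stand-in, not the encoding `NumberTools` uses, and the class `SortableLongSketch` and its methods are hypothetical names for this example; it also handles only non-negative values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified illustration only: NOT the encoding Lucene's NumberTools uses.
final class SortableLongSketch {
    // 19 digits is enough to hold Long.MAX_VALUE (9223372036854775807).
    private static final int WIDTH = 19;

    // Encode a non-negative long as a fixed-width, zero-padded decimal string.
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch handles only non-negative values");
        }
        return String.format("%0" + WIDTH + "d", value);
    }

    // Reverse of encode(): parse the padded string back into a long.
    static long decode(String encoded) {
        return Long.parseLong(encoded);
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<String>();
        List<String> padded = new ArrayList<String>();
        for (long size : new long[] {952L, 13L, 1800L}) {
            plain.add(Long.toString(size));
            padded.add(encode(size));
        }
        // Plain strings sort character by character, so "1800" lands before "952".
        Collections.sort(plain);
        // Fixed-width strings sort in the same order as the numbers themselves.
        Collections.sort(padded);
        System.out.println(plain);
        System.out.println(padded);
    }
}
```

The same reasoning is why the search test prints an encoded string for "size": the stored value is the sortable form, and it would need to be decoded (in Lucene's case with `NumberTools.stringToLong`) to show the byte count.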

LucenePro.zip (952.9 KB)

LuceneDemo.zip (1.8 MB)

2013-01-09 15:16