Basic Lucene usage

fushengfei

Blog category: Search engines

lucene Apache

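1. Download: http://archive.apache.org/dist/lucene/java/ (the code below uses the Lucene 2.4-era API, such as the IndexWriter constructor that takes a MaxFieldLength, so a 2.4.x release should work; these constructors were removed in 3.0).

2. Add the required JARs to the project: the Lucene core JAR, the JAR that provides MMAnalyzer (je-analysis), and JUnit 4 for the @Test methods.

3. Build a first Lucene project. It indexes a document and then searches the resulting index.

4. Code: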

HelloWord.java

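import jeasy.analysis.MMAnalyzer;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.junit.Test;

public class HelloWord {
	String filePath = "path to the file on disk";
	String indexPath = "directory where the index is stored";
//	Analyzer analyzer = new StandardAnalyzer(); // default analyzer (tokenizer)
	Analyzer analyzer = new MMAnalyzer(); // analyzer with Chinese word segmentation

	@Test
	public void createIndex() throws Exception { // build the index
		// Convert the file on disk into a Lucene Document
		Document doc = File2DocumentUtils.file2Document(filePath);
		// true = create a new index, overwriting any existing one at indexPath
		IndexWriter indexWriter = new IndexWriter(indexPath, analyzer, true,
				MaxFieldLength.LIMITED);
		try {
			indexWriter.addDocument(doc);
		} finally {
			indexWriter.close(); // always release the index write lock
		}
	}

	@Test
	public void search() throws Exception { // search the index
		String queryString = "room";

		// 1. Parse the search text into a Query over the "name" and "content" fields
		String[] fields = { "name", "content" };
		QueryParser queryParser = new MultiFieldQueryParser(fields, analyzer);
		Query query = queryParser.parse(queryString);

		// 2. Run the query
		IndexSearcher indexSearcher = new IndexSearcher(indexPath);
		try {
			Filter filter = null; // no filtering of the results
			TopDocs topDocs = indexSearcher.search(query, filter, 10000);

			// 3. Print the results
			System.out.println("Total matches: " + topDocs.totalHits);
			for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
				int docSn = scoreDoc.doc; // internal document number
				Document doc = indexSearcher.doc(docSn); // load the document by number
				File2DocumentUtils.printDocumentInfo(doc); // print its fields
			}
		} finally {
			indexSearcher.close();
		}
	}
}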
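
To make the two steps above (index first, then search) more concrete, here is a minimal sketch of the idea behind an inverted index, written in plain Java without Lucene. It is only a conceptual illustration: the names TinyInvertedIndex, addDocument, search and tokenize are hypothetical, the tokenizer is a crude stand-in for an analyzer, and Lucene's real index also handles scoring, multiple fields and on-disk storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index: illustrates the concept only, not Lucene's implementation. */
final class TinyInvertedIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    /** Indexing step: split the text into terms and record the document id under each term. */
    int addDocument(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Search step: look the term up in the index instead of scanning every document. */
    List<Integer> search(String term) {
        List<Integer> result = new ArrayList<>();
        TreeSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        if (hits != null) {
            result.addAll(hits);
        }
        return result;
    }

    String getDocument(int docId) {
        return documents.get(docId);
    }

    // Hypothetical stand-in for an analyzer: lowercase, then split on non-letter/non-digit runs.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.addDocument("A room with a view");
        index.addDocument("The kitchen is next to the living room");
        for (int docId : index.search("room")) {
            System.out.println(docId + ": " + index.getDocument(docId));
        }
    }
}
```

The point of the sketch is that the expensive work (tokenizing every document) happens once at indexing time, so a search only needs a map lookup.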

File2DocumentUtils.java

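import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.NumberTools;

public class File2DocumentUtils {

	/** Builds a Document with the file's name, content, size and path. */
	public static Document file2Document(String filePath) {
		File file = new File(filePath);
		Document doc = new Document();
		doc.add(new Field("name", file.getName(), Store.YES, Index.ANALYZED));
		doc.add(new Field("content", readFileContent(file), Store.YES, Index.ANALYZED));
		// size and path are exact values, so they are indexed as single terms, not tokenized
		doc.add(new Field("size", NumberTools.longToString(file.length()), Store.YES, Index.NOT_ANALYZED));
		doc.add(new Field("path", file.getAbsolutePath(), Store.YES, Index.NOT_ANALYZED));
		return doc;
	}

	/** Reads the whole file as text, using the platform default charset. */
	private static String readFileContent(File file) {
		try (BufferedReader reader = new BufferedReader(
				new InputStreamReader(new FileInputStream(file)))) {
			StringBuilder buf = new StringBuilder();
			String line;
			while ((line = reader.readLine()) != null) {
				buf.append(line).append("\n");
			}
			return buf.toString();
		} catch (IOException e) {
			throw new RuntimeException(e);
		}
	}

	/** Prints the stored fields of a document returned by a search. */
	public static void printDocumentInfo(Document doc) {
		System.out.println("------------------------------");
		System.out.println("name     = " + doc.get("name"));
		System.out.println("content  = " + doc.get("content"));
		System.out.println("size     = " + NumberTools.stringToLong(doc.get("size")));
		System.out.println("path     = " + doc.get("path"));
	}

}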
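
The size field above is stored through NumberTools.longToString rather than as a plain number string. The reason is that index terms are compared as strings, and plain decimal strings do not sort numerically ("9" comes after "10"). The sketch below illustrates the idea with zero-padded decimal strings. It is a simplified assumption for illustration: SortableLongs, encode and decode are hypothetical names, only non-negative values are handled, and NumberTools uses its own format that differs from this one.

```java
import java.util.Locale;

/** Illustrative fixed-width encoding; NOT the format NumberTools actually uses. */
final class SortableLongs {
    // Long.MAX_VALUE has 19 decimal digits, so every non-negative long fits in this width
    private static final int WIDTH = 19;

    /** Pads the value with leading zeros so that string order matches numeric order. */
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch only supports non-negative values");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    /** Reverses encode(); leading zeros are ignored by Long.parseLong. */
    static long decode(String encoded) {
        return Long.parseLong(encoded);
    }

    public static void main(String[] args) {
        // Plain strings sort "9" after "10"; the padded forms sort in numeric order.
        System.out.println("9".compareTo("10") > 0);
        System.out.println(encode(9L).compareTo(encode(10L)) < 0);
        System.out.println(decode(encode(1024L)));
    }
}
```

With an encoding of this kind, range comparisons and sorting on the size field can work on the indexed strings directly, which is the property NumberTools is designed to provide.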

2010-10-15 14:53