Code refactoring: distilling a Lucene example

Author: xtugtf (posts: 3, points: 160, from: Wuhan)
Posted: 2009-02-23, last edited: 2009-02-24

Yesterday I worked through the examples from the book "Lucene in Action" and wrote a short summary of the problems I ran into and the relevant usage. Today I went back over the code carefully and refactored it.

XP asks us to work in the cycle test, code, refactor, test, code, refactor. I still have not adopted that "reverse waterfall" style as my guide. Instead I work in the cycle code, test, refactor, code, test, refactor, and I design classes on the keep-it-simple principle. Below is a running log of today's refactoring.

I. The problems

1) Requirements always change, and designing a system that adapts well to changing requirements is the foundation of software design. The two classes I wrote yesterday, one for building the index and one for searching, each have a single responsibility and so already satisfy the Single Responsibility Principle. But they only index plain-text files. What happens when Word or PDF files have to be indexed as well? That change leaves two options: design new classes from scratch, or add parameter checks to the existing Indexer class so that it handles each file type differently. This clearly does not conform to the Open-Closed Principle (OCP).

2) Reuse, reuse, and more reuse. The goal of refactoring is to absorb changing requirements with the fewest modifications and so get the most reuse out of the code. The two current classes clearly do not adapt well to new user requirements: a developer cannot change the code quickly and redeploy to meet them.

Given these two problems, the code needs to be refactored so that it meets the new requirements (here, the anticipated points of change) and achieves the best possible reuse.

II. Refactoring

For reuse, a good approach is already well known: program to an interface or to an abstract class, so that the code can adapt quickly to changing requirements.
Interface: exposes a uniform, stable, simple set of operations to the outside.
Concrete implementation class: implements those operations for one specific requirement.
The code below illustrates this.

1) The interface

    package com.goodwitkey.seargine.src;

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.search.Hits;

    /**
     * @author Owner
     */
    public interface Ifileseargine {
        // Indexing operations
        public int index(File dataFilesPath, File indexFilePath) throws IOException;

        public void indexDirectory(IndexWriter indexwr, File dataFilesPath) throws IOException;

        public void indexFile(IndexWriter writer, File f) throws IOException;

        // Search operation
        public Hits search(File indexFilePath, String queryStr) throws IOException, ParseException;
    }

2) A concrete implementation for text files

    package com.goodwitkey.seargine.src;

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Date;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    /**
     * @author Owner
     */
    public class ItxtfilesearchgineImp implements Ifileseargine {

        // Quick test: the engine is held through the interface type.
        public static void main(String[] args) throws Exception {
            Ifileseargine itxtsearchgine = new ItxtfilesearchgineImp();
            File dataFilePath = new File("C:\\data");
            File indexFilePath = new File("C:\\indexfiles");
            int indexedfileNum = itxtsearchgine.index(dataFilePath, indexFilePath);
            System.out.println("Number of files indexed: " + indexedfileNum);
            itxtsearchgine.search(indexFilePath, "希拉");
        }

        /*
         * (non-Javadoc)
         *
         * @see com.goodwitkey.seargine.src.Ifileseargine#index(java.io.File,
         *      java.io.File)
         */
        public int index(File dataFilesPath, File indexFilePath) throws IOException {
            if (!dataFilesPath.exists() || !dataFilesPath.isDirectory()) {
                throw new IOException(dataFilesPath + " does not exist or is not a directory");
            }
            System.out.println("*************" + indexFilePath);
            // Lucene 2.4.0 has deprecated this constructor.
            // boolean create: true to create the index or overwrite the existing
            // one; false to append to the existing index.
            // With create == true, make sure the index directory holds no other
            // important files, otherwise they may all be deleted by accident.
            IndexWriter indexwr = new IndexWriter(indexFilePath, new StandardAnalyzer(), true);
            indexwr.setUseCompoundFile(false);
            indexwr.mergeFactor = 2;
            // Build the index
            indexDirectory(indexwr, dataFilesPath);
            System.out.println("*************" + indexFilePath);
            int indexedNum = indexwr.docCount();
            indexwr.optimize();
            indexwr.close();
            return indexedNum;
        }

        /*
         * (non-Javadoc)
         *
         * @see com.goodwitkey.seargine.src.Ifileseargine#indexDirectory(
         *      org.apache.lucene.index.IndexWriter, java.io.File)
         */
        public void indexDirectory(IndexWriter indexwr, File dataFilesPath) throws IOException {
            File[] files = dataFilesPath.listFiles();
            for (int i = 0; i < files.length; i++) {
                File f = files[i];
                System.out.println(f.getName());
                if (f.isDirectory()) {
                    indexDirectory(indexwr, f); // recurse
                } else if (f.getName().endsWith(".txt")) {
                    indexFile(indexwr, f);
                }
            }
        }

        /*
         * (non-Javadoc)
         *
         * @see com.goodwitkey.seargine.src.Ifileseargine#indexFile(
         *      org.apache.lucene.index.IndexWriter, java.io.File)
         */
        public void indexFile(IndexWriter writer, File f) throws IOException {
            if (!f.exists() || !f.canRead()) {
                return;
            }
            System.out.println("it gets the file now");
            Document doc = new Document();
            doc.add(Field.Text("contents", new FileReader(f)));
            doc.add(Field.Keyword("filename", f.getCanonicalPath()));
            writer.addDocument(doc);
            System.out.println(f.toString());
        }

        /*
         * (non-Javadoc)
         *
         * @see com.goodwitkey.seargine.src.Ifileseargine#search(java.io.File,
         *      java.lang.String)
         */
        public Hits search(File indexFilePath, String queryStr) throws IOException, ParseException {
            Directory fsDir = FSDirectory.getDirectory(indexFilePath, false);
            IndexSearcher is = new IndexSearcher(fsDir);
            Query query = QueryParser.parse(queryStr, "contents", new StandardAnalyzer());
            long starttime = new Date().getTime();
            Hits hits = is.search(query);
            long endtime = new Date().getTime();
            System.out.println("Searching for the keyword took " + (endtime - starttime) + "ms");
            for (int i = 0; i < hits.length(); i++) {
                Document doc = hits.doc(i);
                System.out.println(doc.get("filename"));
                System.out.println(doc.toString());
            }
            return hits;
        }
    }

The calling code should also follow the Dependency Inversion Principle (DIP): see main() in the implementation class, which holds the engine through the Ifileseargine interface type. Indexing for Word and PDF files can be added by writing further implementation classes, and with that the task and its goal are met.

Notice: copyright of ITeye articles belongs to the author and is protected by law. Reproduction without the author's written permission is not allowed.
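The post's closing remark, that Word and PDF support can be added by writing more implementation classes, can be pictured with a small sketch. This is not the author's code and it leaves Lucene out entirely. The names TextExtractor, TxtExtractor, PdfExtractor, DirectoryWalker and ExtractorDemo are hypothetical, the UTF-8 encoding is an assumption, and the PDF class is only a placeholder where a real PDF library would be called. The idea is that the directory traversal is written once and each file format adds one new class, so existing classes stay closed to modification.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: one implementation per file format.
interface TextExtractor {
    boolean accepts(File f);
    String extractText(File f) throws IOException;
}

// Plain-text files: read the whole file as UTF-8 (the encoding is an assumption).
class TxtExtractor implements TextExtractor {
    public boolean accepts(File f) {
        return f.getName().endsWith(".txt");
    }

    public String extractText(File f) throws IOException {
        return new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
    }
}

// Placeholder only: a real version would call a PDF library here.
class PdfExtractor implements TextExtractor {
    public boolean accepts(File f) {
        return f.getName().endsWith(".pdf");
    }

    public String extractText(File f) {
        return "[PDF text extraction not implemented: " + f.getName() + "]";
    }
}

// Shared traversal, standing in for the indexDirectory/indexFile pair in the post.
class DirectoryWalker {
    private final List<TextExtractor> extractors;

    DirectoryWalker(List<TextExtractor> extractors) {
        this.extractors = extractors;
    }

    // Returns the extracted text of every supported file under dir, recursively.
    List<String> collect(File dir) throws IOException {
        List<String> texts = new ArrayList<>();
        File[] files = dir.listFiles();
        if (files == null) {
            return texts; // not a directory, or an I/O error occurred
        }
        for (File f : files) {
            if (f.isDirectory()) {
                texts.addAll(collect(f)); // recurse
            } else {
                for (TextExtractor e : extractors) {
                    if (e.accepts(f)) {
                        texts.add(e.extractText(f));
                        break; // first matching extractor wins
                    }
                }
            }
        }
        return texts;
    }
}

class ExtractorDemo {
    public static void main(String[] args) throws IOException {
        // Build a throwaway directory with one .txt and one .pdf file.
        File dir = Files.createTempDirectory("data").toFile();
        Files.write(new File(dir, "a.txt").toPath(),
                "hello lucene".getBytes(StandardCharsets.UTF_8));
        Files.write(new File(dir, "b.pdf").toPath(), new byte[0]);

        List<TextExtractor> extractors = new ArrayList<>();
        extractors.add(new TxtExtractor());
        extractors.add(new PdfExtractor());

        for (String text : new DirectoryWalker(extractors).collect(dir)) {
            System.out.println(text);
        }
    }
}
```

In a Lucene-backed version, the string returned by extractText would presumably be what goes into the "contents" field. Supporting Word files would then mean adding one more extractor class and registering it in the list, with no change to DirectoryWalker.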