Problems encountered when building a Lucene index

sgzlove2007

lucene Java SQL Apache thread

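Recently I did some superficial study of searching with Lucene. I first read through most of the Lucene posts on the forum to get a rough idea of it, and then decided to learn it properly. During my own testing I found that building an index for a large table takes far too long, so I tried to solve this with multiple threads, using the table's total record count to decide how many indexing threads to create. The result was errors such as: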

D:\lucene\index\_a.fnm (The system cannot find the file specified.)

Lock obtain timed out: SimpleFSLock@D:\lucene\index\write.lock

and so on.
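The second message comes from Lucene's index lock: only one writer at a time may hold `write.lock` in the index directory. As a rough illustration (this is not Lucene's source code), the sketch below imitates a lock that is obtained by atomically creating a file and that gives up after a timeout. The class `FileLockSketch` and its method names are made up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class FileLockSketch {
    private final File lockFile;

    public FileLockSketch(File dir) {
        this.lockFile = new File(dir, "write.lock");
    }

    /** Polls until the lock file can be created or the timeout expires. */
    public boolean obtain(long timeoutMillis) throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        // createNewFile() is atomic: it returns false if the file already exists.
        while (!lockFile.createNewFile()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // a real writer would report a lock timeout here
            }
            Thread.sleep(10);
        }
        return true;
    }

    /** Releases the lock by deleting the lock file. */
    public void release() {
        lockFile.delete();
    }

    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("index").toFile();
        FileLockSketch first = new FileLockSketch(dir);
        FileLockSketch second = new FileLockSketch(dir);
        System.out.println(first.obtain(100));  // the first holder gets the lock
        System.out.println(second.obtain(100)); // the second one times out
        first.release();
        System.out.println(second.obtain(100)); // succeeds once the lock is released
        second.release();
    }
}
```

With several threads each trying to open a writer on the same directory, all but one of them end up in the position of `second` here.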

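Is the problem of Lucene index creation taking too long really unsolved? I am a beginner and have not studied the source code, so I am just venting on my own blog. If you happen to come across this post, please do not laugh at my shallowness; it took some courage to write my first blog post.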

My test code:

The main class:

Java code

package luceneTest;

import java.io.File;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import dataConnectionPool.DataConnectionPool;

public class TMain {
    private String find = " select id,name from author";
    private Connection con;
    private Statement stmt;
    private File indexFile;
    private int count;

    public TMain(File indexFile) {
        this.indexFile = indexFile;
        this.count = countD();
    }

    // Counts the rows of the table by scrolling to the last row of the result set.
    public int countD() {
        int count = 0;
        try {
            con = DataConnectionPool.getBasicDataSource().getConnection();
            stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
            ResultSet rs = stmt.executeQuery(find);
            if (rs.last()) {         // move to the last row
                count = rs.getRow(); // its row number is the total record count
            }
            System.out.println("Total records in table: " + count);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return count;
    }

    // Starts one indexing thread per batch of 10 rows.
    public void create() throws InterruptedException {
        int num = count / 10; // integer division: rows after the last full batch of 10 get no thread
        for (int i = 1; i <= num; i++) {
            System.out.println("main start :" + ((i - 1) * 10 + 1));
            CDataIndex cd = new CDataIndex(((i - 1) * 10 + 1), 10, indexFile);
            cd.start(); // the thread is started but never joined
        }
    }

    // Searches the index for markInfo and prints the result.
    public void print(File indexFile, String markInfo) {
        FData fd = new FData(indexFile, markInfo);
        fd.findData();
    }

    public static void main(String[] args) {
        File indexFile = new File("D:/lucene/index");
        String markInfo = "you";
        TMain tm = new TMain(indexFile);
        try {
            tm.create();
            // Runs right away, possibly while the indexing threads are still writing.
            tm.print(indexFile, markInfo);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
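Note that `create()` computes the number of threads as `count/10` with integer division, so any rows left over after the last full batch of 10 are never handed to a thread. A minimal sketch of splitting a table into batches so that the remainder is covered too; `BatchPlan` and `batchStarts` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlan {
    /** Returns the 1-based starting row of every batch, including a final partial batch. */
    public static List<Integer> batchStarts(int totalRows, int batchSize) {
        List<Integer> starts = new ArrayList<>();
        for (int start = 1; start <= totalRows; start += batchSize) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 25 rows in batches of 10: batches begin at rows 1, 11 and 21.
        System.out.println(batchStarts(25, 10));
    }
}
```

The worker for the last batch simply stops when the result set runs out of rows, which the `rs.next()` check in the indexing loop already handles.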

The class it calls to build the index:

Java code

package luceneTest;

import java.io.File;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Date;

import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

import dataConnectionPool.DataConnectionPool;

public class CDataIndex extends Thread {
    private Connection con;
    private String find = " select id,name from author";
    private PreparedStatement ps;
    private int start = 0;
    private int num = 0;
    private File indexFile;
    private boolean mark = false; // selects incremental indexing (false) or a full rebuild (true)

    public CDataIndex(int start, int num, File indexFile) {
        this.start = start;
        this.num = num;
        this.indexFile = indexFile;
    }

    public void run() {
        System.out.println("Indexing started!");
        Date beginDate = new Date();
        try {
            FSDirectory fsd = FSDirectory.getDirectory(indexFile.getAbsolutePath(), mark);
            // Forcibly clears the lock, even when another thread's writer still holds it.
            if (IndexReader.isLocked(fsd)) {
                IndexReader.unlock(fsd);
            }
            // Every thread opens its own writer on the same directory.
            IndexWriter writerF = new IndexWriter(fsd, new CJKAnalyzer());
            // RAMDirectory ramD = new RAMDirectory();
            // IndexWriter writerR = new IndexWriter(ramD,new StandardAnalyzer());
            con = DataConnectionPool.getBasicDataSource().getConnection();
            ps = con.prepareStatement(find, ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
            ResultSet rs = ps.executeQuery();
            System.out.println(" start :" + start);
            rs.absolute(start); // jump to the first row of this batch
            rs.previous();      // step back so that rs.next() lands on that row
            while (rs.next() && num != 0) {
                Document doc = new Document();
                doc.add(new Field("id", rs.getString(1), Field.Store.YES, Field.Index.UN_TOKENIZED));
                doc.add(new Field("content", rs.getString(2), Field.Store.YES, Field.Index.TOKENIZED));
                //writerR.optimize();
                //writerR.addDocument(doc);
                writerF.optimize(); // runs once per document, which is expensive
                writerF.addDocument(doc);
                num--;
            }
            // writerR.optimize();
            // writerR.close();
            writerF.close();
            // writerF.addIndexes(new Directory[]{ramD});
            Date endDate = new Date();
            System.out.println("Time spent indexing (ms): "
                    + (endDate.getTime() - beginDate.getTime()));
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
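Each `CDataIndex` thread opens its own `IndexWriter` on the same directory, so the writers compete for `write.lock`. A common alternative is to open one writer, let all worker threads add documents through it, and close it once after the workers have finished (Lucene documents `IndexWriter` as safe to share between threads). The sketch below shows only that threading shape using the JDK: `SharedWriter` is a hypothetical stand-in for the real writer, and the batch size and thread count are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedWriterSketch {

    /** Hypothetical stand-in for a thread-safe index writer. */
    static class SharedWriter {
        private final List<String> docs = new ArrayList<>();
        private boolean closed = false;

        synchronized void addDocument(String doc) {
            if (closed) {
                throw new IllegalStateException("writer is closed");
            }
            docs.add(doc);
        }

        synchronized void close() {
            closed = true;
        }

        synchronized int docCount() {
            return docs.size();
        }
    }

    /** Indexes rows 1..totalRows in batches, all through one shared writer. */
    public static int indexAll(int totalRows, int batchSize, int threads)
            throws InterruptedException {
        SharedWriter writer = new SharedWriter(); // opened once, shared by all workers
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int start = 1; start <= totalRows; start += batchSize) {
            final int from = start;
            final int to = Math.min(start + batchSize - 1, totalRows);
            pool.submit(() -> {
                for (int row = from; row <= to; row++) {
                    writer.addDocument("row-" + row);
                }
            });
        }
        pool.shutdown();
        // Wait for every batch to finish before closing the writer.
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("indexing did not finish in time");
        }
        writer.close();
        return writer.docCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(indexAll(25, 10, 4)); // number of documents added
    }
}
```

Waiting for the pool to terminate also gives a natural point at which searching can start, since no thread is still writing.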

2007-05-31 14:13
Comments (9)

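#9 taikeqi 2008-05-05

Probably the row did not exist, or was locked for modification, at the moment the index was being built.
The fix: introduce a synchronization mechanism.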

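#8 taikeqi 2008-05-05

I suspect this happens when an index is being built for a database row and that row is locked or does not exist at all (it was deleted at that very moment).
The fix: introduce a synchronization mechanism.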

#7 lklkdawei 2007-09-05

It turned out that I had not called the writer.optimize() method of the IndexWriter object.

#6 lklkdawei 2007-09-05

public void unDelete() throws Exception {
    IndexReader reader = IndexReader.open(path);
    reader.undeleteAll(); // the error occurs here
    reader.close();
}

#5 lklkdawei 2007-09-05

public void unDelete()throws Exception{
IndexReader reader = IndexReader.open(path);
reader.undeleteAll();
reader.close();
}

#4 lklkdawei 2007-09-05

I got this error when deleting from the index:

Lock obtain timed out: SimpleFSLock@D:\lucene\index\write.lock

#3 marky 2007-08-21

Could the girl's photo be made a bit bigger?

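#2 Nothingstop 2007-06-06

Indexing? I don't understand it, so I can't help even though I'd like to, haha! Still, some people are paying attention, aren't they?
It's just that the proportion of people working on indexing is rather small.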

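#1 sgzlove2007 2007-06-02

Nobody is paying attention to this post, how sad. Even my wish of getting it voted hidden did not come true, haha. Recently I heard people online say that there is not much to be done about indexing time, and that indexing can only be run quietly in the background.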
