Index update strategy, part one (off-peak updates)

longzhun

Blog category: Lucene

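Midnight updates, also known as off-peak updates.

Approach: start from where the previous pass ended. We store one database id, namely the largest id seen in each pass, so that the next pass can start from it and index only the records whose id is greater.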

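1. Create a txt file.

2. On the first pass, save the largest id seen to the txt file.

3. On every later pass, read the id from the txt file, and update it with the new largest id once the pass over the database finishes.

4. Schedule the task so that this program runs at 2 a.m. every day.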
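The four steps above can be sketched without Lucene or a database. This is only an illustration under stated assumptions: `IdCheckpoint`, `newerThan` and `runOnce` are hypothetical names that do not appear in the original code, and the list of ids stands in for a query such as "WHERE id > lastId".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps the largest indexed id in a small text file.
class IdCheckpoint {
    private final Path file;

    IdCheckpoint(Path file) {
        this.file = file;
    }

    // Returns the stored id, or 0 when the file is missing or empty (first pass).
    long read() throws IOException {
        if (!Files.exists(file)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }

    // Overwrites the file with the new largest id.
    void write(long id) throws IOException {
        Files.write(file, Long.toString(id).getBytes(StandardCharsets.UTF_8));
    }

    // Stand-in for the database query: keeps only ids greater than lastId.
    static List<Long> newerThan(List<Long> allIds, long lastId) {
        List<Long> result = new ArrayList<>();
        for (long id : allIds) {
            if (id > lastId) {
                result.add(id);
            }
        }
        return result;
    }

    // One pass: read the checkpoint, select newer records, store the new maximum.
    static List<Long> runOnce(IdCheckpoint checkpoint, List<Long> allIds) throws IOException {
        long lastId = checkpoint.read();
        List<Long> pending = newerThan(allIds, lastId);
        long maxId = lastId;
        for (long id : pending) {
            maxId = Math.max(maxId, id); // the real job would index the record here
        }
        if (maxId > lastId) {
            checkpoint.write(maxId);
        }
        return pending;
    }
}
```

Taking the maximum rather than the id of the last record means the sketch does not depend on the records arriving in ascending order.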

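Ways to schedule the task

1. Keep a web page open whose JavaScript checks the time; when it is 2 a.m., the page redirects to our action URL, which starts the task.

2. Use the Quartz support that Spring provides.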
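Both options come down to "wait until the next 2 a.m., then repeat daily". As a rough illustration of that timing only, here is a plain JDK sketch. It is neither of the two approaches listed above: `NightlyScheduler`, `delayUntilNext` and `scheduleNightly` are hypothetical names, and the fixed 24-hour period ignores daylight-saving shifts.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a scheduler, using only java.util.concurrent.
class NightlyScheduler {
    // Time remaining from 'now' until the next occurrence of 'runAt'.
    static Duration delayUntilNext(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed, use tomorrow's
        }
        return Duration.between(now, next);
    }

    // Runs 'job' at the next 02:00 and then every 24 hours.
    static ScheduledExecutorService scheduleNightly(Runnable job) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        long initialDelay = delayUntilNext(LocalDateTime.now(), LocalTime.of(2, 0)).getSeconds();
        executor.scheduleAtFixedRate(job, initialDelay, TimeUnit.DAYS.toSeconds(1), TimeUnit.SECONDS);
        return executor;
    }
}
```

A cron-based trigger such as the Quartz one used in this post expresses the same schedule declaratively and is usually the more practical choice inside a Spring application.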

The code is as follows:

	<!-- Scheduler -->
	<bean id="schedulerFactoryBean" class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
		<property name="triggers">
			<list>
				<ref bean="doTime"/>
			</list>
		</property>
		
		<property name="configLocation" value="classpath:quartz.properties"/>
	</bean>
	
	
	<!-- Trigger time -->
	<bean id="doTime" class="org.springframework.scheduling.quartz.CronTriggerBean">
		<property name="jobDetail">
			<ref bean="ci"/>
		</property>
		
		<property name="cronExpression">
			<!-- Fires at 02:00 every day. While testing, 0/5 * * * * ? fires every 5 seconds. -->
			<value>0 0 2 * * ?</value>
		</property>
	</bean>
	
	<!-- The object and method to run at the scheduled time -->
	<bean id="ci" class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
		<!-- createIndex refers to a bean of the CreateIndex class; its definition is not shown here -->
		<property name="targetObject" ref="createIndex" />
		<property name="targetMethod" value="doJob" />
		<property name="concurrent" value="false" /> <!-- do not let two runs overlap -->
	</bean>

quartz.properties:

org.quartz.scheduler.instanceName = TestScheduler
org.quartz.scheduler.instanceId = AUTO

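import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URLDecoder;
import java.util.Iterator;
import java.util.List;

import net.paoding.analysis.analyzer.PaodingAnalyzer;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
// Article, ArticleManager, FunctionUtil and AnalyzerAction are project classes; import them from your own packages.

public class CreateIndex{
	// Injected manager-layer bean
	private ArticleManager articleManager;

	public void setArticleManager(ArticleManager articleManager) {
		this.articleManager = articleManager;
	}
	// Entry point called by the Quartz job
	public void doJob()throws Exception{
		System.out.println("Task running!");
		this.createIndex();
	}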
	public void createIndex() throws Exception {
		// Create the analyzer; Paoding is a Chinese word segmenter
		Analyzer analyzer = new PaodingAnalyzer();
		// Resolve the index directory from the classpath.
		// substring(6) strips the leading "file:/", so this assumes a Windows-style file URL.
		String indexPath = URLDecoder.decode(AnalyzerAction.class
				.getResource("/date/index/article/").toString(),"UTF-8").substring(6);
		System.out.println(indexPath);
		FSDirectory directory = FSDirectory.getDirectory(indexPath);
		// create == true overwrites the existing index; false appends to it.
		// An incremental update must append, so create only when no index exists yet.
		boolean create = !IndexReader.indexExists(directory);
		IndexWriter writer = new IndexWriter(directory, analyzer, create);

		String articleId = this.readText();
		if(null == articleId || "".equals(articleId.trim())){
			articleId = "0";
		}
		try {
			// articleList is expected to return the articles whose id is greater than
			// the stored one, in ascending id order
			List list = articleManager.articleList(Integer.parseInt(articleId.trim()));

			for (Iterator it = list.iterator(); it.hasNext();) {
				Document doc = new Document();
				Article article = (Article) it.next();
				doc.add(new Field("id", String.valueOf(article.getId()), Field.Store.YES,
						Field.Index.UN_TOKENIZED));
				doc.add(new Field("article_title", article.getArticleTitle(), Field.Store.YES,
						Field.Index.TOKENIZED));
				String content = FunctionUtil.Html2Text(article.getArticleContent());
				doc.add(new Field("article_content", content, Field.Store.YES,
						Field.Index.TOKENIZED));
				articleId = String.valueOf(article.getId());
				writer.addDocument(doc);
			}
			writer.optimize();
		} finally {
			// Always release the index write lock, even if indexing fails
			writer.close();
		}
		
		// Write the id of the last indexed article to the txt file
		this.writerText(articleId);
	}
	
	// Read the id from the txt file
	public String readText(){
		String content = "";
		InputStream in = null;
		try {
			in = AnalyzerAction.class.getResourceAsStream("/date/index/article/" + "articlesId.txt");
			if(in == null){
				// No checkpoint file on the classpath yet
				return content;
			}
			Reader re = new InputStreamReader(in,"UTF-8");
			char[] chs = new char[1024];
			int count;
			
			while((count = re.read(chs)) != -1){
				content += new String(chs,0,count);
			}
			
		} catch (Exception e) {
			e.printStackTrace();
		} finally{
			if(in != null){
				try {
					in.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
		// Trim so that a trailing newline does not break Integer.parseInt
		return content.trim();
		
	}
	// Write the id to the txt file
	public void writerText(String articleId){
		BufferedWriter bw = null;
		try {
			// The file must already exist on the classpath (step 1), otherwise getResource returns null.
			// substring(6) strips the leading "file:/", so this assumes a Windows-style file URL.
			String path = URLDecoder.decode(AnalyzerAction.class
					.getResource("/date/index/article/"+ "articlesId.txt").toString(),"UTF-8").substring(6);
			File file = new File(path);
			bw = new BufferedWriter(new FileWriter(file));
			bw.write(articleId);
			
		} catch (Exception e) {
			e.printStackTrace();
		}finally{
			if(bw != null){
				try {
					bw.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
	}
}
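
The listing turns a classpath URL into a file path with `URLDecoder.decode(...)` and `substring(6)`, which relies on the URL starting with `file:/` followed by a drive letter. A possible alternative, offered only as a sketch, is to convert the URL through `java.net.URI`. `ResourcePaths` and `toPath` are hypothetical names, and the approach assumes the resource is a plain file on disk (not an entry inside a jar) and that the URL is properly percent-encoded.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original code.
class ResourcePaths {
    // Converts a file: URL, such as one returned by Class.getResource, to a filesystem path.
    // Percent-encoded characters (for example %20) are decoded by the URI conversion.
    static Path toPath(URL url) throws URISyntaxException {
        return Paths.get(url.toURI());
    }
}
```

With such a helper, the index directory could be resolved as `ResourcePaths.toPath(AnalyzerAction.class.getResource("/date/index/article/")).toString()` on any operating system.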

2012-02-25 22:54