package cn.ljzblog.ljz.util;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import jeasy.analysis.MMAnalyzer;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import test.LuceneJEAnalyzerText;
import cn.ljzblog.ljz.model.Article;
import cn.ljzblog.ljz.model.LogQueryTemp;
import cn.ljzblog.ljz.model.PicQueryTemp;
/**
 * @theme Wraps Lucene index building and index querying for the blog
 * @author ljz
 * @since 2010/10/10
 */
public class LuceneQuery {
private static String indexPath;// directory the index is written to
private static IndexWriter indexWriter;// shared writer used by the createXxxIndex methods
private IndexSearcher searcher;
private Directory dir;
private String prefixHTML = "<font color='red'>";// highlight prefix
private String suffixHTML = "</font>";// highlight suffix
private Date date;// set when the instance is created, used to time index building
private Date date2;// set when the index writer is closed
public LuceneQuery(String indexPath) {
date = new Date();
LuceneQuery.indexPath = indexPath;// assign the static field directly rather than through "this"
}
/**
 * Builds the article index.
 *
 * @param list articles to index
 */
public void createIndex(List<Article> list) {
createIndexWriter();// create the IndexWriter
for (Article article : list) {
Document document = new Document();// one Document per article
Field field = new Field("articleId", String.valueOf(article
.getArticleId()), Field.Store.YES, Field.Index.NO);
Field field2 = new Field("articleName", article.getArticleName(),
Field.Store.YES, Field.Index.ANALYZED);
Field field3 = new Field("articleContent", article
.getArticleContent(), Field.Store.YES, Field.Index.NO);
document.add(field);
document.add(field2);
document.add(field3);
try {
indexWriter.addDocument(document);
} catch (CorruptIndexException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
closeIndexWriter(indexWriter);
}
/**
 * Builds the index used by the highlighted search.
 *
 */
public void createHightLightIndex(List<LogQueryTemp> list) {
createIndexWriter();// create the IndexWriter
for (LogQueryTemp logQuery : list) {
Document document = new Document();// one Document per log entry
Field field = new Field("articleId", String.valueOf(logQuery
.getArticleId()), Field.Store.YES, Field.Index.NO);
Field field2 = new Field("articleName", logQuery.getArticleName(),
Field.Store.YES, Field.Index.ANALYZED);
Field field3 = new Field("articleKindName", logQuery
.getArticleKindName(), Field.Store.YES, Field.Index.NO);
Field field4 = new Field("writeTime", logQuery.getWriteTime(),
Field.Store.YES, Field.Index.NO);
Field field5 = new Field("articleContent", logQuery
.getArticleContent(), Field.Store.YES, Field.Index.ANALYZED);
document.add(field);
document.add(field2);
document.add(field3);
document.add(field4);
document.add(field5);
try {
indexWriter.addDocument(document);
} catch (CorruptIndexException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
closeIndexWriter(indexWriter);
}
/**
 * Builds the index used by the picture search.
 *
 */
public void createPicIndex(List<PicQueryTemp> list) {
createIndexWriter();// create the IndexWriter
for (PicQueryTemp picQuery : list) {
Document document = new Document();// one Document per picture
Field field = new Field("picId", String.valueOf(
picQuery.getPicId()), Field.Store.YES, Field.Index.NO);
Field field2 = new Field("picGroupId", String.valueOf(picQuery.getPicGroupId()),
Field.Store.YES, Field.Index.NO);
Field field3 = new Field("picName",
picQuery.getPicName(), Field.Store.YES, Field.Index.NO);
Field field4 = new Field("pictureDetail", picQuery.getPictureDetail(),
Field.Store.YES, Field.Index.ANALYZED);
document.add(field);
document.add(field2);
document.add(field3);
document.add(field4);
try {
indexWriter.addDocument(document);
} catch (CorruptIndexException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
closeIndexWriter(indexWriter);
}
/**
 * Optimizes and closes the IndexWriter.
 *
 * @param indexWriter writer to close
 */
public void closeIndexWriter(IndexWriter indexWriter) {
try {
indexWriter.optimize();// merge segments before closing (Lucene 2.x/3.x API)
indexWriter.close();
date2 = new Date();
System.out.println("Building the index took " + (date2.getTime() - date.getTime())
+ " ms");
} catch (CorruptIndexException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
/**
 * Creates the IndexWriter.
 */
public void createIndexWriter() {
try {
boolean flag = true;// true = create a new index, false = append to an existing one
// if an index already exists, append to it instead of recreating it
if(IndexReader.indexExists(indexPath))
{
flag = false;
}
indexWriter = new IndexWriter(indexPath, new StandardAnalyzer(),
flag, IndexWriter.MaxFieldLength.LIMITED);
} catch (CorruptIndexException e) {
e.printStackTrace();
} catch (LockObtainFailedException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
/**
 * Searches the index on the articleName field.
 */
public List<Article> queryIndex(String indexPath, String findContent) {
List<Article> list2 = new ArrayList<Article>();
Date start = new Date();// query start time
System.out.println("Query string: " + findContent);
try {
dir = FSDirectory.getDirectory(indexPath);
IndexSearcher searcher = new IndexSearcher(dir);
QueryParser parser = new QueryParser("articleName",
new StandardAnalyzer());
try {
Query query = parser.parse(findContent);// build the query from the search string
TopDocs topDocs = searcher.search(query, 5);// top 5 hits
ScoreDoc[] hits = topDocs.scoreDocs;
System.out.println("Found " + hits.length + " record(s)");
for (int i = 0; i < hits.length; i++) {
int DocId = hits[i].doc;
Article article = new Article();
Document doc = searcher.doc(DocId);
System.out.println("id:" + doc.get("articleId")
+ ",articleName:" + doc.get("articleName"));
article
.setArticleId(Integer
.parseInt(doc.get("articleId")));
article.setArticleName(doc.get("articleName"));
article.setArticleContent(doc.get("articleContent"));
list2.add(article);
// list2.add(doc.get("sname"));
}
Date date3 = new Date();
System.out.println("The query took "
+ (date3.getTime() - start.getTime()) + " ms");// measured from the start of this query, not from the index build
} catch (ParseException e) {
e.printStackTrace();
}
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("list2的长度为:" + list2.size());
return list2;
}
/**
 * Searches the index with a BooleanQuery built from the analyzed search terms.
 */
public List<Article> queryIndexByBooleanQuery(String indexPath,
String findContent) {
List<Article> list2 = new ArrayList<Article>();
LuceneJEAnalyzerText jeAnalyzer = new LuceneJEAnalyzerText(indexPath);
Date start = new Date();// query start time
System.out.println("Query string: " + findContent);
BooleanQuery booleanQuery = new BooleanQuery();
try {
dir = FSDirectory.getDirectory(indexPath);
searcher = new IndexSearcher(dir);
String[] str = jeAnalyzer.createAnalyzer(findContent).split(",");
for (int i = 0; i < str.length; i++) {
booleanQuery.add(
new TermQuery(new Term("articleName", str[i])),
BooleanClause.Occur.SHOULD);
}
// QueryParser parser = new QueryParser("articleName", new
// StandardAnalyzer());
// Query query = parser.parse(findContent);// build the query from the search string
TopDocs topDocs = searcher.search(booleanQuery, 3);// top 3 hits
ScoreDoc[] hits = topDocs.scoreDocs;
System.out.println("Found " + hits.length + " record(s)");
for (int i = 0; i < hits.length; i++) {
int DocId = hits[i].doc;
Article article = new Article();
Document doc = searcher.doc(DocId);
System.out.println("id:" + doc.get("articleId")
+ ",articleName:" + doc.get("articleName"));
article.setArticleId(Integer.parseInt(doc.get("articleId")));
article.setArticleName(doc.get("articleName"));
// article.setArticleContent(doc.get("articleContent"));
list2.add(article);
// list2.add(doc.get("sname"));
}
if (hits.length > 0) {
list2.remove(0);// drop the first (top-scoring) hit
}
Date date3 = new Date();
System.out.println("The query took "
+ (date3.getTime() - start.getTime()) + " ms");
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("list2的长度为:" + list2.size());
return list2;
}
/**
 * Searches the index and returns the matches with the hit terms highlighted in HTML.
 *
 * @param keyword search string
 * @throws CorruptIndexException
 * @throws IOException
 * @throws ParseException
 */
public List<LogQueryTemp> search(String keyword)
throws CorruptIndexException, IOException, ParseException {
List<LogQueryTemp> listlog = new ArrayList<LogQueryTemp>();
Analyzer analyzer = new MMAnalyzer();// JE Chinese analyzer, also used when extracting highlight fragments
//QueryParser queryParse = new QueryParser(fieldName, analyzer);
//Query query = queryParse.parse(keyword);
LuceneJEAnalyzerText jeAnalyzer = new LuceneJEAnalyzerText(indexPath);
System.out.println("Query string: " + keyword);
BooleanQuery booleanQuery = new BooleanQuery();
try{
dir = FSDirectory.getDirectory(indexPath);
searcher = new IndexSearcher(dir);
String[] str = jeAnalyzer.createAnalyzer(keyword).split(",");
for (int j = 0; j < str.length; j++) {
booleanQuery.add(
new TermQuery(new Term("articleContent", str[j])),
BooleanClause.Occur.SHOULD);
booleanQuery.add(
new TermQuery(new Term("articleName", str[j])),
BooleanClause.Occur.SHOULD);
}
Hits hits = searcher.search(booleanQuery);
for (int i = 0; i < hits.length(); i++) {
LogQueryTemp logQuery = new LogQueryTemp();
Document doc = hits.doc(i);
String text = doc.get("articleContent");
String text2 = doc.get("articleName");
int htmlLength = prefixHTML.length() + suffixHTML.length();// length of the highlight markup, only used by the debug line below
//System.out.println("Total length of the highlight HTML: " + htmlLength);
SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter(
prefixHTML, suffixHTML);
Highlighter highlighter = new Highlighter(simpleHTMLFormatter,
new QueryScorer(booleanQuery));
String highLightText = highlighter.getBestFragment(analyzer,
"articleContent", text);
String highLightText2 = highlighter.getBestFragment(analyzer,
"articleName", text2);
System.out.println("★高亮显示第 " + (i + 1) + " 条检索结果如下所示:");
System.out.println(highLightText);
logQuery.setArticleId(doc.get("articleId"));
logQuery.setArticleName(highLightText2);
logQuery.setArticleKindName(doc.get("articleKindName"));
logQuery.setArticleContent(highLightText);
logQuery.setWriteTime(doc.get("writeTime"));
listlog.add(logQuery);
System.out.println("显示第 " + (i + 1) + " 条检索结果摘要的长度为(含高亮HTML代码):"
+ highLightText.length());
}
searcher.close();
}
catch(IOException ex){
ex.printStackTrace();
}
System.out.println("高亮显示的长度为:"+listlog.size());
return listlog;
}
/**
 * Searches the picture index on the pictureDetail field.
 */
public List<PicQueryTemp> queryPicByLucene(String findContent){
List<PicQueryTemp> list2 = new ArrayList<PicQueryTemp>();
LuceneJEAnalyzerText jeAnalyzer = new LuceneJEAnalyzerText(indexPath);
Date start = new Date();// query start time
System.out.println("Query string: " + findContent);
BooleanQuery booleanQuery = new BooleanQuery();
try {
dir = FSDirectory.getDirectory(indexPath);
searcher = new IndexSearcher(dir);
String[] str = jeAnalyzer.createAnalyzer(findContent).split(",");
for (int i = 0; i < str.length; i++) {
booleanQuery.add(
new TermQuery(new Term("pictureDetail", str[i])),
BooleanClause.Occur.SHOULD);
}
TopDocs topDocs = searcher.search(booleanQuery, 12);// top 12 hits
ScoreDoc[] hits = topDocs.scoreDocs;
System.out.println("Found " + hits.length + " record(s)");
for (int i = 0; i < hits.length; i++) {
int DocId = hits[i].doc;
PicQueryTemp picQuery = new PicQueryTemp();
Document doc = searcher.doc(DocId);
picQuery.setPicGroupId(Integer.parseInt(doc.get("picGroupId")));
picQuery.setPicId(Integer.parseInt(doc.get("picId")));
picQuery.setPicName(doc.get("picName"));
picQuery.setPictureDetail(doc.get("pictureDetail"));
list2.add(picQuery);
// list2.add(doc.get("sname"));
}
Date date3 = new Date();
System.out.println("The query took "
+ (date3.getTime() - start.getTime()) + " ms");
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("list2的长度为:" + list2.size());
return list2;
}
}
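For context, here is a minimal usage sketch of the class above. The index path and the Article data are hypothetical, and it assumes Article is a simple bean with a no-arg constructor plus the getters/setters already used by createIndex and queryIndex.

import java.util.ArrayList;
import java.util.List;
import cn.ljzblog.ljz.model.Article;
import cn.ljzblog.ljz.util.LuceneQuery;

public class LuceneQueryDemo {
    public static void main(String[] args) {
        String indexPath = "/tmp/blog-index";// hypothetical index directory
        List<Article> articles = new ArrayList<Article>();
        Article article = new Article();
        article.setArticleId(1);
        article.setArticleName("lucene index demo");
        article.setArticleContent("how to build and query a lucene index");
        articles.add(article);

        LuceneQuery luceneQuery = new LuceneQuery(indexPath);
        luceneQuery.createIndex(articles);// builds (or appends to) the index and closes the writer
        List<Article> hits = luceneQuery.queryIndex(indexPath, "lucene");// search on articleName
        System.out.println("hits: " + hits.size());
    }
}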
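The class also depends on a test.LuceneJEAnalyzerText helper that is not included in the post. Below is a purely hypothetical sketch of what it might look like, inferred only from the call createAnalyzer(text).split(",") above, and assuming the JE analyzer exposes MMAnalyzer.segment(text, separator) as in je-analysis 1.5.x.

package test;

import java.io.IOException;
import jeasy.analysis.MMAnalyzer;

public class LuceneJEAnalyzerText {
    public LuceneJEAnalyzerText(String indexPath) {
        // the index path is accepted only to match the calls in LuceneQuery; it is not needed for segmentation
    }

    public String createAnalyzer(String text) {
        try {
            // segment(text, separator) tokenizes the text and joins the terms with the separator,
            // producing the comma-separated term list that LuceneQuery splits on
            return new MMAnalyzer().segment(text, ",");
        } catch (IOException e) {
            e.printStackTrace();
            return text;// fall back to the raw text if segmentation fails
        }
    }
}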