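Below is a repost of the Hibernate Search documentation on bridges; some of the examples are incomplete.
One point deserves special attention: every property that carries a bridge annotation must also carry the @Field annotation. Without @Field the property is not indexed at all, so adding the bridge has no effect.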
4.2. Property/Field Bridge
In Lucene all index fields have to be represented as Strings. For this reason all entity properties annotated with @Field have to be indexed in a String form. For most of your properties, Hibernate Search does the translation job for you thanks to a built-in set of bridges. In some cases, though, you need more fine-grained control over the translation process.
Hibernate Search comes bundled with a set of built-in bridges between a Java property type and its full text representation.
null elements are not indexed. Lucene does not support null elements and this does not make much sense either.
Strings are indexed as is.
Numbers are converted to their String representation. Note that numbers cannot be compared by Lucene (i.e. used in range queries) out of the box: they have to be padded.
Note
Using a Range query is debatable and has drawbacks; an alternative approach is to use a Filter query, which restricts the query results to the appropriate range.
Hibernate Search will support a padding mechanism.
Dates are stored as yyyyMMddHHmmssSSS in GMT time (20061107210300012 for Nov 7th of 2006, 4:03 PM and 12 ms EST, which is 21:03 GMT). You shouldn't really bother with the internal format. What is important is that when using a DateRange query, the dates have to be expressed in GMT time.
Usually, storing the date up to the millisecond is not necessary. @DateBridge defines the appropriate resolution you are willing to store in the index (for example @DateBridge(resolution=Resolution.DAY)). The date pattern will then be truncated accordingly.
    @Entity
    @Indexed
    public class Meeting {
        @Field(index=Index.UN_TOKENIZED)
        @DateBridge(resolution=Resolution.MINUTE)
        private Date date;
        ...
Warning
A Date whose resolution is lower than MILLISECOND cannot be a @DocumentId.
URI and URL are converted to their string representation.
Classes are converted to their fully qualified class name. The thread context classloader is used when the class is rehydrated.
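To make the date format described above concrete, here is a minimal standalone sketch that uses only the JDK and does not call Hibernate Search. The class name GmtDateTerms and its methods are invented for illustration, and the shorter patterns only approximate what truncating to Resolution.MINUTE or Resolution.DAY means; the library's actual implementation may differ.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Hibernate Search.
final class GmtDateTerms {

    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // Formats a date in GMT with the given pattern. A new SimpleDateFormat is
    // created on every call because SimpleDateFormat is not thread-safe.
    static String format(Date date, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(GMT);
        return fmt.format(date);
    }

    // Nov 7th 2006, 4:03 PM and 12 ms in EST, modelled as the fixed offset GMT-05:00.
    static Date exampleDate() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT-05:00"));
        cal.clear();
        cal.set(2006, Calendar.NOVEMBER, 7, 16, 3, 0);
        cal.set(Calendar.MILLISECOND, 12);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date date = exampleDate();
        System.out.println(format(date, "yyyyMMddHHmmssSSS")); // 20061107210300012
        System.out.println(format(date, "yyyyMMddHHmm"));      // 200611072103
        System.out.println(format(date, "yyyyMMdd"));          // 20061107
    }
}
```

Because the terms are fixed-width and ordered from the most significant unit to the least, comparing them as strings gives the same order as comparing the dates, provided that both bounds of a range are expressed in GMT as well.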
Sometimes, the built-in bridges of Hibernate Search do not cover some of your property types, or the String representation used by the bridge does not meet your requirements. The following paragraphs describe several solutions to this problem.
The simplest custom solution is to give Hibernate Search an implementation of your expected Object to String bridge. To do so you need to implement the org.hibernate.search.bridge.StringBridge interface. All implementations have to be thread-safe as they are used concurrently.
Example 4.13. Implementing your own StringBridge
    import org.hibernate.search.bridge.StringBridge;

    /**
     * Padding Integer bridge.
     * All numbers will be padded with 0 to match 5 digits
     *
     * @author Emmanuel Bernard
     */
    public class PaddedIntegerBridge implements StringBridge {

        private static final int PADDING = 5;

        public String objectToString(Object object) {
            String rawInteger = ( (Integer) object ).toString();
            if (rawInteger.length() > PADDING)
                throw new IllegalArgumentException( "Try to pad on a number too big" );
            StringBuilder paddedInteger = new StringBuilder( );
            // prepend zeros until the value is PADDING characters long
            for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
                paddedInteger.append('0');
            }
            return paddedInteger.append( rawInteger ).toString();
        }
    }
Then any property or field can use this bridge thanks to the @FieldBridge annotation:

    @FieldBridge(impl = PaddedIntegerBridge.class)
    private Integer length;

Parameters can be passed to the bridge implementation, making it more flexible. To receive them, the bridge implementation implements the ParameterizedBridge interface, and the parameters are passed through the @FieldBridge annotation.
Example 4.14. Passing parameters to your bridge implementation
    public class PaddedIntegerBridge implements StringBridge, ParameterizedBridge {

        public static String PADDING_PROPERTY = "padding";
        private int padding = 5; // default

        public void setParameterValues(Map parameters) {
            // @Parameter values are declared as strings (value="10"),
            // so the value is parsed rather than cast to Integer
            Object padding = parameters.get( PADDING_PROPERTY );
            if (padding != null) this.padding = Integer.parseInt( padding.toString() );
        }

        public String objectToString(Object object) {
            String rawInteger = ( (Integer) object ).toString();
            if (rawInteger.length() > padding)
                throw new IllegalArgumentException( "Try to pad on a number too big" );
            StringBuilder paddedInteger = new StringBuilder( );
            for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
                paddedInteger.append('0');
            }
            return paddedInteger.append( rawInteger ).toString();
        }
    }

    // property
    @FieldBridge(impl = PaddedIntegerBridge.class,
                 params = @Parameter(name="padding", value="10")
                )
    private Integer length;

The ParameterizedBridge interface can be implemented by StringBridge, TwoWayStringBridge and FieldBridge implementations.
All implementations have to be thread-safe, but the parameters are set during initialization and no special care is required at this stage.
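The reason padding matters for range queries is that index terms are compared as strings. The following standalone sketch uses only the JDK to show the difference; the class PaddedNumberOrdering and its methods are hypothetical names chosen for illustration, and it only handles non-negative integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of Hibernate Search.
final class PaddedNumberOrdering {

    // Left-pads a non-negative integer with zeros to a fixed width.
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String raw = Integer.toString(value);
        if (raw.length() > width) {
            throw new IllegalArgumentException("number too big to pad");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    // Returns a lexicographically sorted copy, mimicking string term order.
    static List<String> sortedTerms(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded: string order disagrees with numeric order.
        System.out.println(sortedTerms(Arrays.asList("9", "10", "100")));
        // prints [10, 100, 9]

        // Padded to 5 digits: string order matches numeric order.
        System.out.println(sortedTerms(Arrays.asList(pad(9, 5), pad(10, 5), pad(100, 5))));
        // prints [00009, 00010, 00100]
    }
}
```

The padding width has to cover the largest value you expect to index, and query bounds must be padded the same way as the indexed values.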
If you expect to use your bridge implementation on an id property (i.e. annotated with @DocumentId), you need to use a slightly extended version of StringBridge named TwoWayStringBridge. Hibernate Search needs to read the string representation of the identifier and generate the object out of it. There is no difference in the way the @FieldBridge annotation is used.
Example 4.15. Implementing a TwoWayStringBridge which can for example be used for id properties
    public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {

        public static String PADDING_PROPERTY = "padding";
        private int padding = 5; // default

        public void setParameterValues(Map parameters) {
            // @Parameter values are declared as strings (value="10"),
            // so the value is parsed rather than cast to Integer
            Object padding = parameters.get( PADDING_PROPERTY );
            if (padding != null) this.padding = Integer.parseInt( padding.toString() );
        }

        public String objectToString(Object object) {
            String rawInteger = ( (Integer) object ).toString();
            if (rawInteger.length() > padding)
                throw new IllegalArgumentException( "Try to pad on a number too big" );
            StringBuilder paddedInteger = new StringBuilder( );
            for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
                paddedInteger.append('0');
            }
            return paddedInteger.append( rawInteger ).toString();
        }

        public Object stringToObject(String stringValue) {
            // leading zeros are ignored when parsing
            return Integer.valueOf(stringValue);
        }
    }

    // id property
    @DocumentId
    @FieldBridge(impl = PaddedIntegerBridge.class,
                 params = @Parameter(name="padding", value="10")
                )
    private Integer id;

It is critically important for the two-way process to be idempotent (i.e. object = stringToObject( objectToString( object ) ) ).
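The round-trip requirement can be checked without an index. The sketch below is an assumption-laden stand-in: PaddedIntegerRoundTrip is a hypothetical class that copies the padding logic into plain static methods instead of implementing the Hibernate Search interfaces, so that it runs with the JDK alone. It also rejects negative numbers, because "-5" padded to "00000000-5" could not be parsed back.

```java
// Hypothetical standalone check, not part of Hibernate Search.
final class PaddedIntegerRoundTrip {

    private static final int PADDING = 10;

    // Same zero-padding idea as the bridge, restricted to non-negative values.
    static String objectToString(Object object) {
        int value = (Integer) object;
        if (value < 0) {
            // a padded negative number such as "00000000-5" cannot be parsed back
            throw new IllegalArgumentException("negative numbers are not supported by this sketch");
        }
        String raw = Integer.toString(value);
        if (raw.length() > PADDING) {
            throw new IllegalArgumentException("Try to pad on a number too big");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < PADDING; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    // Parses the padded form back; leading zeros are ignored by Integer.valueOf.
    static Object stringToObject(String stringValue) {
        return Integer.valueOf(stringValue);
    }

    // True when converting to a string and back yields an equal object.
    static boolean roundTrips(Object object) {
        return object.equals(stringToObject(objectToString(object)));
    }

    public static void main(String[] args) {
        int[] samples = {0, 7, 42, 99999, Integer.MAX_VALUE};
        for (int sample : samples) {
            System.out.println(objectToString(sample) + " round-trips: " + roundTrips(sample));
        }
    }
}
```

A check like this is cheap to keep as a unit test next to any two-way bridge you write, since a bridge that fails it would make Hibernate Search rebuild the wrong identifier from the index.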
Some use cases require more than a simple object to string translation when mapping a property to a Lucene index. To give you the greatest possible flexibility you can also implement a bridge as a FieldBridge. This interface gives you a property value and lets you map it the way you want in your Lucene Document. The interface is very similar in its concept to the Hibernate UserTypes.
You can for example store a given property in two different document fields:
Example 4.16. Implementing the FieldBridge interface in order to store a given property in multiple document fields
    /**
     * Store the date in 3 different fields - year, month, day - to ease Range Query per
     * year, month or day (eg get all the elements of December for the last 5 years).
     *
     * @author Emmanuel Bernard
     */
    public class DateSplitBridge implements FieldBridge {
        private final static TimeZone GMT = TimeZone.getTimeZone("GMT");

        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            Date date = (Date) value;
            Calendar cal = GregorianCalendar.getInstance(GMT);
            cal.setTime(date);
            int year = cal.get(Calendar.YEAR);
            int month = cal.get(Calendar.MONTH) + 1;
            int day = cal.get(Calendar.DAY_OF_MONTH);

            // set year
            Field field = new Field(name + ".year", String.valueOf(year),
                    luceneOptions.getStore(), luceneOptions.getIndex(),
                    luceneOptions.getTermVector());
            field.setBoost(luceneOptions.getBoost());
            document.add(field);

            // set month and pad it if needed; the parentheses are required because
            // the conditional operator has lower precedence than +
            field = new Field(name + ".month", (month < 10 ? "0" : "") + String.valueOf(month),
                    luceneOptions.getStore(), luceneOptions.getIndex(),
                    luceneOptions.getTermVector());
            field.setBoost(luceneOptions.getBoost());
            document.add(field);

            // set day and pad it if needed
            field = new Field(name + ".day", (day < 10 ? "0" : "") + String.valueOf(day),
                    luceneOptions.getStore(), luceneOptions.getIndex(),
                    luceneOptions.getTermVector());
            field.setBoost(luceneOptions.getBoost());
            document.add(field);
        }
    }

    // property
    @FieldBridge(impl = DateSplitBridge.class)
    private Date date;
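The splitting and padding logic can be tried out in isolation. The following sketch is only an approximation of the idea: DateSplitSketch is a hypothetical class that returns the would-be field names and values in a Map instead of adding Lucene Field objects to a Document, so it needs nothing beyond the JDK. Note the parentheses in pad2: since the conditional operator binds more loosely than +, writing v < 10 ? "0" : "" + v would yield just "0" for single-digit values.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical standalone sketch, not part of Hibernate Search or Lucene.
final class DateSplitSketch {

    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // Pads a value below 10 with one leading zero.
    static String pad2(int v) {
        return (v < 10 ? "0" : "") + v;
    }

    // Splits a date into year, month and day entries keyed by derived field names.
    static Map<String, String> split(String name, Date date) {
        Calendar cal = new GregorianCalendar(GMT);
        cal.setTime(date);
        int year = cal.get(Calendar.YEAR);
        int month = cal.get(Calendar.MONTH) + 1; // Calendar months are zero-based
        int day = cal.get(Calendar.DAY_OF_MONTH);

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(name + ".year", String.valueOf(year));
        fields.put(name + ".month", pad2(month));
        fields.put(name + ".day", pad2(day));
        return fields;
    }

    public static void main(String[] args) {
        // The epoch, 1970-01-01T00:00:00 GMT
        System.out.println(split("date", new Date(0L)));
        // prints {date.year=1970, date.month=01, date.day=01}
    }
}
```

Keeping month and day at a fixed width of two characters is what allows string range queries such as "all documents from months 03 to 09" to behave as expected.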
4.2.2.3. ClassBridge
It is sometimes useful to combine more than one property of a given entity and index this combination in a specific way into the Lucene index. The @ClassBridge and @ClassBridges annotations can be defined at the class level (as opposed to the property level). In this case the custom field bridge implementation receives the entity instance as the value parameter instead of a particular property. Though not shown in this example, @ClassBridge supports the termVector attribute discussed in Section 4.1.1, “Basic mapping”.
Example 4.17. Implementing a class bridge
    @Entity
    @Indexed
    @ClassBridge(name="branchnetwork",
                 index=Index.TOKENIZED,
                 store=Store.YES,
                 impl = CatFieldsClassBridge.class,
                 params = @Parameter( name="sepChar", value=" " ) )
    public class Department {
        private int id;
        private String network;
        private String branchHead;
        private String branch;
        private Integer maxEmployees;
        ...
    }

    public class CatFieldsClassBridge implements FieldBridge, ParameterizedBridge {
        private String sepChar;

        public void setParameterValues(Map parameters) {
            this.sepChar = (String) parameters.get( "sepChar" );
        }

        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            // In this particular class the name of the new field was passed
            // from the name field of the ClassBridge Annotation. This is not
            // a requirement. It just works that way in this instance. The
            // actual name could be supplied by hard coding it below.
            Department dep = (Department) value;
            String fieldValue1 = dep.getBranch();
            if ( fieldValue1 == null ) {
                fieldValue1 = "";
            }
            String fieldValue2 = dep.getNetwork();
            if ( fieldValue2 == null ) {
                fieldValue2 = "";
            }
            String fieldValue = fieldValue1 + sepChar + fieldValue2;
            Field field = new Field( name, fieldValue, luceneOptions.getStore(),
                    luceneOptions.getIndex(), luceneOptions.getTermVector() );
            field.setBoost( luceneOptions.getBoost() );
            document.add( field );
        }
    }

In this example, the particular CatFieldsClassBridge is applied to the department instance; the field bridge then concatenates both branch and network and indexes the concatenation.