hz_chenwenbiao
hibernate-search-3.3.0.Final: Chinese documentation translation and study notes (reposted)


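At first I was only reading the documentation for myself and had not planned to translate it. The translation starts at chapter 4 and covers nearly all of the main chapters. Each English passage from the reference comes first and is followed by its translation or the translator's notes, in one-to-one correspondence.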

5. Tuning Lucene indexing performance

ch4

4.3. Analysis

4.4. Bridges

4.4.1. Built-in bridges

4.4.2. Custom bridges

4.5. Providing your own id

4.6. Programmatic API

Chapter 5. Querying

5.1. Building queries

5.1.1. Building a Lucene query using the Lucene API

5.1.2. Building a Lucene query with the Hibernate Search query DSL

5.1.3. Building a Hibernate Search query

Chapter 6. Manual index changes

6.1. Adding instances to the index

6.2. Deleting instances from the index

6.3. Rebuilding the whole index

6.3.1. Using flushToIndexes()

6.3.2. Using a MassIndexer

Chapter 7. Index Optimization

7.1. Automatic optimization

7.2. Manual optimization

7.3. Adjusting optimization

1

You can think of those two batch modes (no scope vs. transactional) as the equivalent of the (infamous) autocommit vs. transactional behavior. From a performance perspective, the in-transaction mode is recommended. The scoping choice is made transparently: Hibernate Search detects the presence of a transaction and adjusts the scoping.

In short: the in-transaction mode is recommended.

2

The good news is that Hibernate Search is enabled out of the box when detected on the classpath by Hibernate Core. If, for some reason, you need to disable it, set hibernate.search.autoregister_listeners to false. Note that there is no performance penalty when the listeners are enabled but no entities are annotated as indexed.

 

3

By default, every time an object is inserted, updated or deleted through Hibernate, Hibernate Search updates the corresponding Lucene index. It is sometimes desirable to disable that feature, either because your index is read-only or because index updates are done in batches (see Section 6.3, “Rebuilding the whole index”).

 

To disable event based indexing, set

hibernate.search.indexing_strategy = manual

 

4

 

The different reader strategies are described in Reader strategy. The out-of-the-box strategies are:

* shared: share index readers across several queries. This strategy is the most efficient.

* not-shared: create an index reader for each individual query.

The default reader strategy is shared. This can be adjusted:

hibernate.search.reader.strategy = not-shared

 

5. Tuning Lucene indexing performance

hibernate.search.[default|<indexname>].exclusive_index_use

Set to true when no other process will need to write to the same index. This enables Hibernate Search to work in exclusive mode on the index and improves performance when writing changes to the index. Default value: false (releases locks as soon as possible).

 

When your architecture permits it, always set hibernate.search.default.exclusive_index_use=true as it greatly improves efficiency in index writing.
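The settings mentioned in the notes above are plain key/value properties. As a minimal sketch, they can be collected in a `java.util.Properties` object; the class and method names here are hypothetical, grouping the three settings together is only for illustration, and how the properties reach Hibernate (properties file, XML, or programmatic configuration) is left out.

```java
import java.util.Properties;

class SearchSettingsSketch {

    // Collects the settings discussed above into one Properties object.
    static Properties manualIndexingSettings() {
        Properties settings = new Properties();
        // Turn off event-based indexing; the index is then updated manually or in batches.
        settings.setProperty("hibernate.search.indexing_strategy", "manual");
        // One index reader per query instead of the default shared readers.
        settings.setProperty("hibernate.search.reader.strategy", "not-shared");
        // Only appropriate when no other process writes to the same index.
        settings.setProperty("hibernate.search.default.exclusive_index_use", "true");
        return settings;
    }

    public static void main(String[] args) {
        manualIndexingSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Whether each of these values suits a given deployment depends on the conditions stated in the text, in particular that exclusive index use requires a single writing process.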

 

6. LockFactory configuration

Lucene Directories have default locking strategies which work well for most cases, but for each index managed by Hibernate Search you can specify which LockFactory to use.

 

ch4

Hibernate Search mappings must be configured with annotations; XML configuration is currently not provided.

7. @Indexed

Foremost we must declare a persistent class as indexable. This is done by annotating the class with @Indexed (all entities not annotated with @Indexed will be ignored by the indexing process).

Entities without the @Indexed annotation are ignored, that is, they are not indexed.

You can optionally specify the index attribute of the @Indexed annotation to change the default name of the index. For more information see Section 3.2, “Directory configuration”.

You can use the "index" attribute to change the default index name.

8. @Field

For each property (or attribute) of your entity, you have the ability to describe how it will be indexed. The default (no annotation present) means that the property is ignored by the indexing process. @Field declares a property as indexed and lets you configure several aspects of the indexing process by setting one or more of the following attributes:

You can use @Field to describe each property of an entity class. A property without the @Field annotation is ignored. The following attributes refine @Field further.

name: describes under which name the property should be stored in the Lucene Document. The default value is the property name (following the JavaBeans convention).

name: the name under which the property is stored in the Lucene Document; the property name is used by default.

store: describes whether or not the property is stored in the Lucene index. You can store the value with Store.YES (consuming more space in the index but allowing projection, see Section 5.1.3.5, “Projection”), store it in a compressed way with Store.COMPRESS (this does consume more CPU), or avoid any storage with Store.NO (this is the default value). When a property is stored, you can retrieve its original value from the Lucene Document. This is not related to whether the element is indexed or not.

store: describes whether the entity field is stored in the Lucene index.

Store.YES: stored in the index; needs more storage space but allows projection.

Store.COMPRESS: stored compressed; needs more CPU.

Store.NO: not stored; this is the default.

When an entity field is stored, you can retrieve its original value from the Lucene Document. This is independent of whether the element is indexed.

index: describes how the element is indexed and the type of information stored. The different values are Index.NO (no indexing, i.e. cannot be found by a query), Index.TOKENIZED (use an analyzer to process the property), Index.UN_TOKENIZED (no analyzer pre-processing), Index.NO_NORMS (do not store the normalization data). The default value is TOKENIZED.

index: describes how the entity field is indexed and what information is stored.

Index.NO: not indexed, so it cannot be found by a query.

Index.TOKENIZED: processed by an analyzer and then indexed.

Index.UN_TOKENIZED: no analysis is applied.

Index.NO_NORMS: normalization data is not stored.

Note: text fields are usually tokenized, while date fields are not.

Fields used for sorting must not be tokenized.

termVector: used for similarity searches.

 

4.3. Analysis

The default analyzer class used to index tokenized fields is configurable through the hibernate.search.analyzer property. The default value for this property is org.apache.lucene.analysis.standard.StandardAnalyzer.

Using different analyzers within the same entity class is not recommended.

 

4.4. Bridges

In Lucene all index fields have to be represented as strings. All entity properties annotated with @Field have to be converted to strings to be indexed. The reason we have not mentioned it so far is that, for most of your properties, Hibernate Search does the translation job for you thanks to a set of built-in bridges. However, in some cases you need more fine-grained control over the translation process.

In Lucene every field is converted to a string: all fields annotated with @Field are converted to strings and then indexed. These conversions could be ignored so far because Hibernate Search's built-in bridges were doing the work. Sometimes, however, you need finer-grained control over the conversion.

4.4.1. Built-in bridges

The built-in bridges cover: null, String, Date, numeric types, URL, and Class.

Hibernate Search comes bundled with a set of built-in bridges between a Java property type and its full text representation.

Hibernate Search ships with bridges that convert between Java property types and their corresponding text form.

java.lang.String

Strings are indexed as is.

short, Short, int, Integer, long, Long, float, Float, double, Double, BigInteger, BigDecimal

Numbers are converted into their string representation. Note that numbers cannot be compared by Lucene (i.e. used in range queries) out of the box: they have to be padded.

Using a Range query is debatable and has drawbacks; an alternative approach is to use a Filter query which will filter the result query to the appropriate range.

Hibernate Search will support a padding mechanism.

In short: numeric values become strings, range searches over them are discouraged and have drawbacks, and a filter can be used to handle ranges instead.
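Why padding matters can be seen without Lucene at all: once numbers are strings, they compare character by character. The following is a minimal sketch using only the JDK; the class and method names are hypothetical, and the fixed width of 5 is an arbitrary choice for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PaddingOrderSketch {

    // Left-pads a non-negative number with zeros to a fixed width.
    static String pad(int number, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", number);
    }

    // Converts the numbers to strings (padded or not) and sorts them lexicographically.
    static List<String> sortAsStrings(List<Integer> numbers, boolean padded) {
        List<String> terms = new ArrayList<>();
        for (int n : numbers) {
            terms.add(padded ? pad(n, 5) : Integer.toString(n));
        }
        Collections.sort(terms);
        return terms;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(9, 10, 200);
        System.out.println(sortAsStrings(numbers, false)); // [10, 200, 9]
        System.out.println(sortAsStrings(numbers, true));  // [00009, 00010, 00200]
    }
}
```

Without padding, "9" sorts after "200", so a string range such as 5 to 50 would not select the expected values; with a fixed width the string order matches the numeric order for non-negative numbers.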

 

java.util.Date

Dates are stored as yyyyMMddHHmmssSSS in GMT time (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You shouldn't really bother with the internal format. What is important is that when using a DateRange Query, you should know that the dates have to be expressed in GMT time.

Usually, storing the date up to the millisecond is not necessary. @DateBridge defines the appropriate resolution you are willing to store in the index ( @DateBridge(resolution=Resolution.DAY) ). The date pattern will then be truncated accordingly.

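@Field(index=Index.UN_TOKENIZED)
@DateBridge(resolution=Resolution.MINUTE)
private Date date;

Storing a date down to the millisecond is usually pointless. @DateBridge addresses this: it lets you pick a resolution such as day or minute for date range searches, and the indexed date is truncated accordingly (the unneeded precision is discarded).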
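To make the GMT conversion and the truncation concrete, here is a sketch that uses only the JDK. It is not the bridge's actual implementation: the class and method names are hypothetical, and the shortened patterns used for the minute and day resolutions are an assumption about what each resolution keeps.

```java
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateResolutionSketch {

    // Nov 7th 2006, 4:03 PM and 12 ms at UTC-5 (EST), the instant used in the text.
    static Date sampleDate() {
        return Date.from(
                OffsetDateTime.of(2006, 11, 7, 16, 3, 0, 12_000_000, ZoneOffset.ofHours(-5)).toInstant());
    }

    // Formats a date in GMT with the given pattern; a shorter pattern drops precision.
    static String toIndexString(Date date, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date date = sampleDate();
        System.out.println(toIndexString(date, "yyyyMMddHHmmssSSS")); // 20061107210300012
        System.out.println(toIndexString(date, "yyyyMMddHHmm"));      // 200611072103
        System.out.println(toIndexString(date, "yyyyMMdd"));          // 20061107
    }
}
```

The hour changes from 16 to 21 because the value is expressed in GMT, which is why range queries must supply their bounds in GMT as well.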

java.net.URI, java.net.URL

URI and URL are converted to their string representation

java.lang.Class

Classes are converted to their fully qualified class name. The thread context classloader is used when the class is rehydrated.

4.4.2. Custom bridges

Sometimes, the built-in bridges of Hibernate Search do not cover some of your property types, or the String representation used by the bridge does not meet your requirements. The following paragraphs describe several solutions to this problem.

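Sometimes the built-in bridges cannot convert a field of your entity class, or the conversion does not meet your requirements. The following paragraphs describe several ways to solve this.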

4.4.2.1. StringBridge

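Open question: when an attachment is uploaded, could a field be populated with the text of the attachment so that attachments become searchable?

The simplest custom solution is to give Hibernate Search an implementation of your expected Object-to-String bridge. To do so you need to implement the org.hibernate.search.bridge.StringBridge interface. All implementations have to be thread-safe as they are used concurrently.

The simplest custom solution is to implement the Object-to-String bridge you need, by implementing the "org.hibernate.search.bridge.StringBridge" interface. Every implementation must be thread-safe because bridges are used concurrently.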

Example 4.15. Custom StringBridge implementation

import org.hibernate.search.bridge.StringBridge;

/**
 * Padding Integer bridge.
 * All numbers will be padded with 0 to match 5 digits
 *
 * @author Emmanuel Bernard
 */
public class PaddedIntegerBridge implements StringBridge {

    // Target width of the padded string; a constant, so the bridge holds no mutable state.
    private static final int PADDING = 5;

    public String objectToString(Object object) {
        String rawInteger = ( (Integer) object ).toString();
        if (rawInteger.length() > PADDING) {
            throw new IllegalArgumentException( "Try to pad on a number too big" );
        }
        StringBuilder paddedInteger = new StringBuilder( );
        // Prepend zeros until the string reaches the target width.
        for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
            paddedInteger.append('0');
        }
        return paddedInteger.append( rawInteger ).toString();
    }
}

Given the string bridge defined in Example 4.15, “Custom StringBridge implementation”, any property or field can use this bridge thanks to the @FieldBridge annotation:

The custom string bridge above can be applied to any field through the @FieldBridge annotation, for example:

@FieldBridge(impl = PaddedIntegerBridge.class)
private Integer length;
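To see what such a bridge produces for a few inputs, the padding logic can be exercised outside Hibernate Search. In this sketch the nested `StringBridge` interface is a hypothetical local stand-in for `org.hibernate.search.bridge.StringBridge`, declared only so the example compiles with the JDK alone; the bridge body repeats the logic of Example 4.15.

```java
class PaddedBridgeUsageSketch {

    // Hypothetical stand-in for org.hibernate.search.bridge.StringBridge,
    // declared locally so the sketch compiles without Hibernate Search.
    interface StringBridge {
        String objectToString(Object object);
    }

    // Same padding logic as Example 4.15.
    static class PaddedIntegerBridge implements StringBridge {
        private static final int PADDING = 5;

        @Override
        public String objectToString(Object object) {
            String rawInteger = ((Integer) object).toString();
            if (rawInteger.length() > PADDING) {
                throw new IllegalArgumentException("Try to pad on a number too big");
            }
            StringBuilder paddedInteger = new StringBuilder();
            for (int padIndex = rawInteger.length(); padIndex < PADDING; padIndex++) {
                paddedInteger.append('0');
            }
            return paddedInteger.append(rawInteger).toString();
        }
    }

    public static void main(String[] args) {
        StringBridge bridge = new PaddedIntegerBridge();
        System.out.println(bridge.objectToString(42));    // 00042
        System.out.println(bridge.objectToString(12345)); // 12345
        System.out.println(bridge.objectToString(-42));   // 00-42
        try {
            bridge.objectToString(123456); // six digits exceed the width of 5
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note the third result: the zeros are placed in front of the minus sign, so this padding scheme only gives a meaningful ordering for non-negative values.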

 

4.4.2.1.1. Parameterized bridge

Parameters can also be passed to the bridge implementation making it more flexible. Example 4.16, “Passing parameters to your bridge implementation” implements a ParameterizedBridge interface and parameters are passed through the @FieldBridge annotation.

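Passing parameters makes a bridge more flexible. To do this, implement the "ParameterizedBridge" interface and pass the parameters through the @FieldBridge annotation.

An example follows: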

Example 4.16. Passing parameters to your bridge implementation

import java.util.Map;

import org.hibernate.search.bridge.ParameterizedBridge;
import org.hibernate.search.bridge.StringBridge;

public class PaddedIntegerBridge implements StringBridge, ParameterizedBridge {

    public static final String PADDING_PROPERTY = "padding";
    private int padding = 5; //default

    // Called during initialization with the values given in @FieldBridge(params = ...).
    public void setParameterValues(Map parameters) {
        Object padding = parameters.get( PADDING_PROPERTY );
        // @Parameter values are strings, so parse the value instead of casting it to Integer.
        if (padding != null) this.padding = Integer.parseInt( padding.toString() );
    }

    public String objectToString(Object object) {
        String rawInteger = ( (Integer) object ).toString();
        if (rawInteger.length() > padding) {
            throw new IllegalArgumentException( "Try to pad on a number too big" );
        }
        StringBuilder paddedInteger = new StringBuilder( );
        for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
            paddedInteger.append('0');
        }
        return paddedInteger.append( rawInteger ).toString();
    }
}

//property
@FieldBridge(impl = PaddedIntegerBridge.class,
             params = @Parameter(name="padding", value="10")
            )
private Integer length;

The ParameterizedBridge interface can be implemented by StringBridge, TwoWayStringBridge, and FieldBridge implementations.

All implementations have to be thread-safe, but the parameters are set during initialization and no special care is required at this stage.

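The "ParameterizedBridge" interface can be implemented by StringBridge, TwoWayStringBridge, and FieldBridge implementations. All of them must be thread-safe, but because the parameters are set during initialization, no special care is needed at that stage.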
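One detail worth checking when reading parameters: @Parameter carries its value as a string (value="10"), so a bridge has to parse it rather than cast it. The sketch below shows the difference with a plain map; the class and the `readPadding` helper are hypothetical, and the map stands in for what the bridge receives in setParameterValues.

```java
import java.util.HashMap;
import java.util.Map;

class ParameterParsingSketch {

    // Reads the "padding" parameter, falling back to a default when it is absent.
    static int readPadding(Map<String, String> parameters, int defaultPadding) {
        String padding = parameters.get("padding");
        return padding == null ? defaultPadding : Integer.parseInt(padding);
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        System.out.println(readPadding(parameters, 5)); // 5

        parameters.put("padding", "10"); // as in @Parameter(name="padding", value="10")
        System.out.println(readPadding(parameters, 5)); // 10

        // Casting the raw value instead of parsing it fails at runtime.
        Object raw = parameters.get("padding");
        try {
            int broken = (Integer) raw;
            System.out.println(broken);
        } catch (ClassCastException e) {
            System.out.println("cast failed: value is a " + raw.getClass().getSimpleName());
        }
    }
}
```

Parsing once during initialization also keeps the bridge thread-safe, since the field is not modified afterwards.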

4.4.2.1.2. Type aware bridge