The begining
There is one thing that you must know – Suggest component is not available in Solr version 1.4.1 and below. To start using this component you need to download 3_x or trunk version from Lucene/Solr SVN.
Configuration
Before we get into the index configuration we need to define an search component. So let’s do it:
view sourceprint?
1.
<searchComponent name="suggest" class="solr.SpellCheckComponent">
2.
<lst name="spellchecker">
3.
<str name="name">suggest</str>
4.
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
5.
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
6.
<str name="field">name_autocomplete</str>
7.
</lst>
8.
</searchComponent>
It is worth mentioning that suggest component is based on solr.SpellCheckComponent and that’s why we can use the above configuration. We have three important attributes in the configuration:
name - name of the component.
lookupImpl – an object that will handle the search. At this point we have two possibilities to use – JasperLookup or TSTLookup. This second one characterizes greater efficiency.
field – the field on the basis of which suggestions are generated.
Now let’s add the appropriate handler:
view sourceprint?
01.
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
02.
<lst name="defaults">
03.
<str name="spellcheck">true</str>
04.
<str name="spellcheck.dictionary">suggest</str>
05.
<str name="spellcheck.count">10</str>
06.
</lst>
07.
<arr name="components">
08.
<str>suggest</str>
09.
</arr>
10.
</requestHandler>
Quite simple configuration, which defines a handler with an additional search component and tell Solr that the maximum number of suggestions returned is 10, this it should use dictionary named suggest (which is actually a Suggest component) which is exactly the same as our defined component.
Index
Let us assume that our document consists of three fields: id, name and description. We want to generate suggestions on the field that hold the name of the product. Our index could look like this:
view sourceprint?
1.
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
2.
<field name="name" type="text" indexed="true" stored="true" multiValued="false" />
3.
<field name="name_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false" />
4.
<field name="description" type="text" indexed="true" stored="true" multiValued="false" />
In addition, there is the following copy field definition:
view sourceprint?
1.
<copyField source="name" dest="name_autocomplete" />
Suggesting single words
In order to achieve individual words suggestions text_autocomplete type should be defined as follows:
view sourceprint?
1.
<fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
2.
<analyzer>
3.
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
4.
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
5.
<filter class="solr.LowerCaseFilterFactory"/>
6.
</analyzer>
7.
</fieldType>
Suggesting phrases
To implement the entire phrase suggestions our text_autocomplete type should be defined as follows:
view sourceprint?
1.
<fieldType class="solr.TextField" name="text_auto">
2.
<analyzer>
3.
<tokenizer class="solr.KeywordTokenizerFactory"/>
4.
<filter class="solr.LowerCaseFilterFactory"/>
5.
</analyzer>
6.
</fieldType>
If you want to use phrases you may want to define your own query converter.
Dictionary building
Before we start using the component, we need to build its index. To this send the following command to Solr:
view sourceprint?
1.
/suggest?spellcheck.build=true
Queries
Now we come to use of the component. In order to show how the use the component, I decided suggest whole phrases. The example query could look like that:
view sourceprint?
1.
/suggest?q=har
After running that query I got the following suggestions:
view sourceprint?
01.
<?xml version="1.0" encoding="UTF-8"?>
02.
<response>
03.
<lst name="responseHeader">
04.
<int name="status">0</int>
05.
<int name="QTime">0</int>
06.
</lst>
07.
<lst name="spellcheck">
08.
<lst name="suggestions">
09.
<lst name="dys">
10.
<int name="numFound">4</int>
11.
<int name="startOffset">0</int>
12.
<int name="endOffset">3</int>
13.
<arr name="suggestion">
14.
<str>hard drive</str>
15.
<str>hard drive samsung</str>
16.
<str>hard drive seagate</str>
17.
<str>hard drive toshiba</str>
18.
</arr>
19.
</lst>
20.
</lst>
21.
</lst>
22.
</response>
The end
In the next part of the autocomplete functionality I’ll show how to modify its configuration to use static dictionary into the mechanism and how this can helk you get better suggestions. The last part of the series will be a performance comparison of each method in which I’ll try to diagnose which method is the fastest one in various situations.
分享到:
相关推荐
开发者需要在Solr端配置相应的自动补全逻辑,通常需要使用Solr的suggester组件,它支持多种搜索建议算法,比如Term Query Suggester、Document Lookup Suggester等。 5. 执行效果。在前端页面配置完成后,当用户...
标题中的“solr7.5_ik分词器,suggest配置源文件”指的是在Solr 7.5版本中使用Ik分词器和Suggest组件进行配置和使用的源文件。Ik分词器是针对中文环境优化的,它包括了多种分词策略,如全模式、最细粒度模式等,可以...
这通常通过配置SpellChecker或Suggester组件来实现,它们可以基于用户的输入历史或者字典文件生成推荐的查询建议。 3. **分组查询**:Solr支持基于某个字段的分组查询,可以将搜索结果按照特定字段(如类别、品牌等...
“Suggester”部分讲解了如何使用Solr的建议器功能,该功能可以为用户输入的查询提供实时建议。 “MoreLikeThis”部分介绍了如何查找与给定文档类似的其他文档。 在“Pagination of Results”部分,解释了如何对...
这通常通过使用Suggester组件实现,它可以基于用户的部分输入提供相关建议。 9. **多格式支持**:Solr支持多种数据输入格式,如JSON、XML、CSV等,方便从各种数据源导入数据。 10. **安全与访问控制**:Solr可以...
Solr的建议组件(Suggester)可以结合pinyinAnalyzer,生成基于拼音的自动补全建议,提升用户交互体验。例如,当用户输入“shang”时,系统能快速返回包含“上海”、“商品”等匹配结果。 总结来说,pinyinAnalyzer...
这通常通过创建一个专门的Suggester或使用CompletionSuggester来实现,它们可以在用户输入时即时提供匹配的建议。 十、文本分析 Analyzer在Lucene中起着至关重要的作用,它负责将原始文本转换为可搜索的术语。...