Deploy carrot2-webapp
1. Download the source code
#git clone git://github.com/carrot2/carrot2.git
2. Compile
#cd carrot2
#ant webapp
3. Deploy
#cp tmp/webapp/carrot2-webapp.war /path/to/tomcat/webapps
4. Configure carrot2
#cd /path/to/tomcat/webapps/carrot2-webapp/WEB-INF/suites
#mv suite-webapp.xml suite-webapp.xml.old
#cp source-solr.xml suite-webapp.xml
Then edit suite-webapp.xml so that it reads:
<component-suite>
  <sources>
    <source component-class="org.carrot2.source.solr.SolrDocumentSource" id="solr"
            attribute-sets-resource="source-solr-attributes.xml">
      <label>Solr</label>
      <title>Solr Search Engine</title>
      <icon-path>icons/solr.png</icon-path>
      <mnemonic>s</mnemonic>
      <description>Solr document source queries an instance of Apache Solr search engine.</description>
      <example-queries>
        <example-query>test</example-query>
        <example-query>solr</example-query>
      </example-queries>
    </source>
  </sources>
  <include suite="algorithm-lingo.xml"></include>
</component-suite>
5. Edit source-solr-attributes.xml
<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>
      <label>overridden-attributes</label>
      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://192.168.10.204:8983/inokarticle/clustering"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrSummaryFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrTitleFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>
6. Edit algorithm-lingo-attributes.xml and algorithm-lingo.xml
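Before restarting Tomcat, it can help to confirm that source-solr-attributes.xml is well-formed and carries the intended overrides. The following is a minimal sketch, not part of Carrot2: the helper name read_overrides is hypothetical, and the embedded XML simply repeats the values used in this post.

```python
import xml.etree.ElementTree as ET

# Same content as source-solr-attributes.xml in this post.
SAMPLE = """
<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>
      <label>overridden-attributes</label>
      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://192.168.10.204:8983/inokarticle/clustering"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrSummaryFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrTitleFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>
"""


def read_overrides(xml_text):
    """Return {attribute key: value} for every <attribute> in the file.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    overrides = {}
    for attribute in root.iter("attribute"):
        value = attribute.find("value")
        if value is not None:
            overrides[attribute.get("key")] = value.get("value")
    return overrides


if __name__ == "__main__":
    for key, value in read_overrides(SAMPLE).items():
        print(key, "=", value)
```

To check the deployed file instead of the sample, read its text from WEB-INF/suites and pass it to read_overrides.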
----------------------------------------------------
Integrate with Solr
1. Configure solrconfig.xml
a. Import the related jars
<lib dir="../contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="../dist/" regex="solr-clustering-\d.*\.jar" />
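The two regex attributes decide which files in each directory are put on the classpath. The sketch below shows what the patterns select, assuming each pattern is applied to the bare file name; the file names are hypothetical examples, not a listing of a real Solr distribution.

```python
import re

# Patterns copied from the two <lib> directives above.
CONTRIB_PATTERN = r".*\.jar"
DIST_PATTERN = r"solr-clustering-\d.*\.jar"


def matching_files(pattern, filenames):
    """Return the file names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical directory contents, for illustration only.
    dist = ["solr-clustering-4.8.1.jar", "solr-core-4.8.1.jar", "README.txt"]
    print(matching_files(DIST_PATTERN, dist))
    print(matching_files(CONTRIB_PATTERN, dist))
```

The first pattern picks up every jar in contrib/clustering/lib, while the second keeps only the solr-clustering jar from dist.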
b. Add the clustering search component and the /clustering request handler
<searchComponent name="clustering" enable="true" class="solr.clustering.ClusteringComponent" >
  <lst name="engine">
    <str name="name">lingo</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="carrot.resourcesDir">clustering/carrot2</str>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <str name="PreprocessingPipeline.tokenizerFactory">org.carrot2.text.linguistic.DefaultTokenizerFactory</str>
  </lst>
</searchComponent>
<requestHandler name="/clustering" startup="lazy" enable="true" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">lingo</str>
    <bool name="clustering.results">true</bool>
    <!-- Field name with the logical "title" of a each document (optional) -->
    <str name="carrot.title">content</str>
    <!-- Field name with the logical "URL" of a each document (optional) -->
    <str name="carrot.url">id</str>
    <!-- Field name with the logical "content" of a each document (optional) -->
    <str name="carrot.snippet">content</str>
    <!-- Apply highlighter to the title/ content and use this for clustering. -->
    <bool name="carrot.produceSummary">true</bool>
    <!-- the maximum number of labels per cluster -->
    <int name="carrot.numDescriptions">5</int>
    <!-- produce sub clusters -->
    <bool name="carrot.outputSubClusters">true</bool>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <!-- Configure the remaining request handler parameters. -->
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
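Once the handler is registered, a quick query against it shows whether clusters come back. The following is a minimal sketch under stated assumptions: the handler URL is the serviceUrlBase from source-solr-attributes.xml in this post, the function names are hypothetical, and the response is assumed to carry a top-level "clusters" list whose entries have "labels" and "docs". The sample dictionary is hand-written to that assumed shape and is not real Solr output.

```python
from urllib.parse import urlencode


def build_clustering_url(handler_url, query, rows=10):
    """Return a GET URL for the /clustering handler with JSON output."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return handler_url + "?" + urlencode(params)


def cluster_summary(response):
    """Return (labels, document count) pairs from a parsed JSON response.

    Assumes a top-level "clusters" list with "labels" and "docs" entries;
    adjust the keys if your Solr version structures the output differently.
    """
    return [
        (cluster.get("labels", []), len(cluster.get("docs", [])))
        for cluster in response.get("clusters", [])
    ]


if __name__ == "__main__":
    url = build_clustering_url(
        "http://192.168.10.204:8983/inokarticle/clustering", "solr"
    )
    print(url)
    # Hand-written sample in the assumed shape, not real output.
    sample = {"clusters": [{"labels": ["Search"], "docs": ["1", "2"]}]}
    print(cluster_summary(sample))
```

Fetching the printed URL (for example with urllib.request.urlopen) and passing the decoded JSON to cluster_summary gives a short overview of the labels produced by the lingo engine.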
2. Custom Chinese tokenizer for clustering
a. Modify the related Carrot2 source code and recompile
b. Copy the related jars and lexicon to the Solr webapp lib directory
For details, see Apache SOLR and Carrot2 integration strategies 2
References
http://wiki.apache.org/solr/ClusteringComponent
http://www.cnblogs.com/cy163/archive/2010/05/07/1730172.html
http://carrot2.github.io/solr-integration-strategies/carrot2-3.8.0/index.html
http://download.carrot2.org/head/manual/index.html#section.advanced-topics.building-from-source-code
http://www.cnblogs.com/shm10/p/3700604.html