SOLR Cloud(2)SOLR7 Single Instance
I downloaded solr-7.0.0.tgz from the official website.
I unpacked it and placed it in the working directory.
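A minimal sketch of the unpack step, assuming the tarball is already on disk. The helper name unpack_solr, the parent directory, and the symlink to /opt/solr are my own placeholders for illustration, not part of Solr.

```shell
# Unpack a Solr tarball under a parent directory and print the unpacked path.
# unpack_solr is a hypothetical helper name; adjust the paths to your machine.
unpack_solr() {
  local tarball="$1"     # e.g. solr-7.0.0.tgz
  local parent_dir="$2"  # e.g. $HOME/install
  mkdir -p "$parent_dir" || return 1
  tar -xzf "$tarball" -C "$parent_dir" || return 1
  # Assumes the archive unpacks into a directory named like the tarball minus ".tgz".
  echo "$parent_dir/$(basename "$tarball" .tgz)"
}

# Example (commented out so the sketch has no side effects when sourced):
# solr_dir="$(unpack_solr solr-7.0.0.tgz "$HOME/install")"
# sudo ln -s "$solr_dir" /opt/solr   # assumption: /opt/solr is the working directory
```

Linking the versioned directory to /opt/solr keeps the later paths stable across upgrades, but copying the directory there works just as well.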
The logs are written here:
/opt/solr/server/logs
Here are the commands to restart, start, and stop the service:
>bin/solr restart
>bin/solr start
>bin/solr stop
Some guide information is in the bundled README:
file:///Users/carl/install/solr-7.0.0/README.txt
In our case the core is named job, so I manually created a directory named job:
>mkdir /opt/solr/server/solr/job
I copied all the configuration files from our old project and from the sample configuration:
drwxr-xr-x 3 carl staff 102 Sep 27 14:18 conf
-rw-r--r-- 1 carl staff 126 Sep 27 13:56 core.properties
drwxr-xr-x 5 carl staff 170 Sep 27 13:56 data
-rw-r--r-- 1 carl staff 0 Sep 27 14:17 index_synonyms.txt
-rw-r--r-- 1 carl staff 0 Sep 27 14:16 index_synonyms_case_sensitive.txt
drwxr-xr-x 40 carl staff 1360 Sep 27 13:56 lang
-rw-r--r--@ 1 carl staff 50880 Sep 27 13:55 managed-schema.bak
-rw-r--r--@ 1 carl staff 308 Sep 27 13:55 params.json
-rw-r--r-- 1 carl staff 0 Sep 27 14:18 protected_words.txt
-rw-r--r--@ 1 carl staff 873 Sep 27 13:55 protwords.txt
-rw-r--r-- 1 carl staff 25545 Sep 27 14:26 schema.xml
-rw-r--r--@ 1 carl staff 55062 Sep 27 14:29 solrconfig.xml
-rw-r--r--@ 1 carl staff 781 Sep 27 13:55 stopwords.txt
-rw-r--r--@ 1 carl staff 1124 Sep 27 13:55 synonyms.txt
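The directory setup can be sketched roughly as follows, assuming the sample files come from the _default configset that ships with Solr 7. The helper name prepare_core_dir is hypothetical, and files from the old project still have to be copied in by hand afterwards.

```shell
# Create <solr_home>/<core>/conf and seed it from the _default configset.
# prepare_core_dir is a hypothetical helper; it only copies files.
prepare_core_dir() {
  local solr_home="$1"  # e.g. /opt/solr/server/solr
  local core="$2"       # e.g. job
  local sample="$solr_home/configsets/_default/conf"
  [ -d "$sample" ] || { echo "sample config not found: $sample" >&2; return 1; }
  mkdir -p "$solr_home/$core/conf" || return 1
  # "src/." copies the directory contents, including the lang subdirectory.
  cp -R "$sample/." "$solr_home/$core/conf/"
}

# Example (commented out so nothing is touched when the file is sourced):
# prepare_core_dir /opt/solr/server/solr job
```

The core.properties file and the data directory in the listing do not need to be created by hand; as far as I can tell they appear once the core has been created.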
The sample configuration is here:
>cd /opt/solr/server/solr/configsets/_default/conf
drwxr-xr-x@ 40 carl staff 1360 Sep 8 14:34 lang
-rw-r--r--@ 1 carl staff 50880 Sep 8 14:34 managed-schema
-rw-r--r--@ 1 carl staff 308 Sep 8 14:34 params.json
-rw-r--r--@ 1 carl staff 873 Sep 8 14:34 protwords.txt
-rw-r--r--@ 1 carl staff 54994 Sep 8 14:36 solrconfig.xml
-rw-r--r--@ 1 carl staff 781 Sep 8 14:34 stopwords.txt
-rw-r--r--@ 1 carl staff 1124 Sep 8 14:34 synonyms.txt
I renamed the original managed-schema and copied in the schema.xml from my old project.
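A rough sketch of that swap, with one caveat I am adding myself: as far as I understand, Solr 7 only keeps reading a hand-edited schema.xml when solrconfig.xml declares the ClassicIndexSchemaFactory; with the default managed schema factory it converts schema.xml into a new managed-schema on first load. The helper names below are hypothetical.

```shell
# Swap the shipped managed-schema for a hand-written schema.xml in a conf dir.
# use_classic_schema and declares_classic_factory are hypothetical helper names.
use_classic_schema() {
  local conf_dir="$1"    # e.g. /opt/solr/server/solr/job/conf
  local old_schema="$2"  # schema.xml taken from the old project
  # Keep the shipped schema as a backup, matching managed-schema.bak in the listing.
  if [ -f "$conf_dir/managed-schema" ]; then
    mv "$conf_dir/managed-schema" "$conf_dir/managed-schema.bak" || return 1
  fi
  cp "$old_schema" "$conf_dir/schema.xml"
}

# Naive text check: does solrconfig.xml declare the classic schema factory?
# It does not parse XML, so a commented-out declaration would also match.
declares_classic_factory() {
  grep -Eq '<schemaFactory[^>]*class="ClassicIndexSchemaFactory"' "$1"
}

# Example (commented out):
# use_classic_schema /opt/solr/server/solr/job/conf ~/old-project/conf/schema.xml
# declares_classic_factory /opt/solr/server/solr/job/conf/solrconfig.xml || echo "managed schema in use"
```

The path ~/old-project/conf/schema.xml is only a placeholder for wherever the old schema lives.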
Then I created the core from the admin UI at http://localhost:8983/
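The same step can probably be scripted through the CoreAdmin HTTP API instead of the UI. This is only a sketch: core_create_url is a hypothetical helper that builds the request URL, and it assumes the core directory and its conf directory already exist under the Solr home.

```shell
# Build the CoreAdmin CREATE URL for a core whose directory already exists.
# core_create_url is a hypothetical helper; it only prints the URL.
core_create_url() {
  local base="$1"  # e.g. http://localhost:8983/solr
  local core="$2"  # e.g. job
  echo "${base}/admin/cores?action=CREATE&name=${core}&instanceDir=${core}"
}

core_create_url "http://localhost:8983/solr" "job"

# To actually send the request against a running Solr:
# curl "$(core_create_url http://localhost:8983/solr job)"
```

Errors such as a missing field type come back in the HTTP response body, which is easier to capture in a script than a message on the page.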
The page sometimes kept throwing exceptions; I followed the guides from Google results and cleared them one by one.
Exceptions:
fieldType 'booleans' not found in the schema
Solution:
https://stackoverflow.com/questions/31320696/solr-error-creating-core-fieldtype-x-not-found-in-the-schema
>grep "booleans" *
>vi solrconfig.xml
Just comment out the matching lines in solrconfig.xml.
Restart Solr, and then everything looks good.
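To find every such mismatch in one pass rather than one exception at a time, a naive text scan may help. It assumes the references look like `<str name="fieldType">NAME</str>` in solrconfig.xml, which I believe is how the sample file's schemaless update processor maps value classes to field types. The helper name missing_field_types is hypothetical, and the scan ignores XML comments.

```shell
# List field types that solrconfig.xml refers to but schema.xml does not define.
# missing_field_types is a hypothetical helper based on plain text matching.
missing_field_types() {
  local solrconfig="$1" schema="$2"
  # Collect names referenced as <str name="fieldType">NAME</str> in solrconfig.xml.
  sed -n 's/.*<str name="fieldType">\([^<]*\)<\/str>.*/\1/p' "$solrconfig" | sort -u |
  while read -r type; do
    # A type counts as defined if schema.xml has <fieldType name="NAME" ...>
    # (the lowercase <fieldtype ...> spelling is accepted too).
    if ! grep -Eq "<field[Tt]ype[[:space:]]+name=\"$type\"" "$schema"; then
      echo "$type"
    fi
  done
}

# Example (commented out):
# cd /opt/solr/server/solr/job/conf && missing_field_types solrconfig.xml schema.xml
```

Each name printed is a candidate for commenting out in solrconfig.xml, or for adding to schema.xml if the mapping is actually wanted.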
Sample schema.xml for reference:
<?xml version="1.0" encoding="UTF-8" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!--
This is the Solr schema file. This file should be named "schema.xml" and
should be in the conf directory under the solr home
(i.e. ./solr/conf/schema.xml by default)
or located where the classloader for the Solr webapp can find it.
This example schema is the recommended starting point for users.
It should be kept correct and concise, usable out-of-the-box.
For more information, on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
PERFORMANCE NOTE: this schema includes many optional features and should not
be used for benchmarking. To improve performance one could
- set stored="false" for all fields possible (esp large fields) when you
only need to search on the field but don't need to return the original
value.
- set indexed="false" if you don't need to search on the field, but only
return the field as a result of searching on other indexed fields.
- remove all unneeded copyField statements
- for best index size and searching performance, set "index" to false
for all general text fields, use copyField to copy them to the
catchall "text" field, and use that for searching.
- For maximum indexing performance, use the ConcurrentUpdateSolrServer
java client.
- Remember to run the JVM in server mode, and use a higher logging level
that avoids logging every request
-->
<schema name="sillycat" version="1.5">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
version="x.y" is Solr's version number for the schema syntax and
semantics. It should not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued
by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default
except for text fields.
1.3: removed optional field compress feature
1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser
behavior when a single string produces multiple tokens. Defaults
to off for version >= 1.4
1.5: omitNorms defaults to true for primitive field types
(int, float, boolean, string...)
-->
<!-- Valid attributes for fields:
name: mandatory - the name for the field
type: mandatory - the name of a field type from the
<types> fieldType section
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
docValues: true if this field should have doc values. Doc values are
useful for faceting, grouping, sorting and function queries. Although not
required, doc values will make the index faster to load, more
NRT-friendly and more memory-efficient. They however come with some
limitations: they are currently only supported by StrField, UUIDField
and all Trie*Fields, and depending on the field type, they might
require the field to be single-valued, be required or have a default
value (check the documentation of the field type you're interested in
for more information)
multiValued: true if this field may contain multiple values per document
omitNorms: (expert) set to true to omit the norms associated with
this field (this disables length normalization and index-time
boosting for the field, and saves some memory). Only full-text
fields or fields that need an index-time boost need norms.
Norms are omitted for primitive (non-analyzed) types by default.
termVectors: [false] set to true to store the term vector for a
given field.
When using MoreLikeThis, fields used for similarity should be
stored for best performance.
termPositions: Store position information with the term vector.
This will increase storage costs.
termOffsets: Store offset information with the term vector. This
will increase storage costs.
required: The field is required. It will throw an error if the
value does not exist
default: a value that should be used if no value is specified
when adding a document.
-->
<!-- field names should consist of alphanumeric or underscore characters only and
not start with a digit. This is not currently strictly enforced,
but other field names will not have first class support from all components
and back compatibility is not guaranteed. Names with both leading and
trailing underscores (e.g. _version_) are reserved.
-->
<!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
or Solr won't start. _version_ and update log are required for SolrCloud
-->
<field name="_version_" type="long" indexed="true" stored="true"/>
<!-- points to the root document of a block of nested documents. Required for nested
document support, may be removed otherwise
-->
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="customer_id" type="int" indexed="true" stored="true" required="true"/>
<field name="pool_id" type="int" indexed="true" stored="true" required="true"/>
<field name="source_id" type="int" indexed="true" stored="true" required="true"/>
<field name="campaign_id" type="int" indexed="true" stored="true" required="true"/>
<field name="segment_id" type="int" indexed="true" stored="true" required="false"/>
<field name="job_reference" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="title_text_en_splitting" indexed="true" stored="true" termVectors="true" storeOffsetsWithPositions="true"/>
<field name="description" type="text_en_splitting" indexed="false" stored="false"/>
<field name="description_txt" type="text_en_splitting" indexed="true" stored="true" storeOffsetsWithPositions="true"/>
<field name="url" type="string" indexed="false" stored="true"/>
<field name="company_id" type="int" indexed="true" stored="true"/>
<field name="company" type="text_general" indexed="true" stored="true"/>
<field name="cities" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="city_id" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="state_id" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="zipcode" type="int" indexed="true" stored="true" multiValued="true"/>
<field name="cpc" type="int" indexed="true" stored="true"/>
<field name="reg_cpc" type="int" indexed="true" stored="true"/>
<field name="posted" type="date" indexed="true" stored="true"/>
<field name="created" type="date" indexed="true" stored="true"/>
<field name="experience" type="int" indexed="true" stored="true"/>
<field name="salary" type="int" indexed="true" stored="true"/>
<field name="education" type="int" indexed="true" stored="true"/>
<field name="jobtype" type="int" indexed="true" stored="true"/>
<field name="industry" type="int" indexed="true" stored="true" docValues="true"/>
<field name="industries" type="int" indexed="true" stored="true" multiValued="true" docValues="true"/>
<field name="quality_score" type="float" indexed="true" stored="true"/>
<field name="boost_factor" type="float" indexed="true" stored="true"/>
<!-- top spot -->
<field name="is_ad" type="int" indexed="true" stored="true" required="false"/>
<field name="top_spot_industries" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="top_spot_type" type="int" indexed="true" stored="true" multiValued="false" required="false"/>
<field name="top_spot_preview" type="boolean" indexed="false" stored="true" multiValued="false" required="false"/>
<!-- end of top spot -->
<field name="paused" type="boolean" indexed="true" stored="true"/>
<field name="budget" type="int" indexed="false" stored="true"/>
<field name="email" type="string" indexed="false" stored="true"/>
<field name="phone" type="string" indexed="true" stored="true"/>
<field name="tags" type="string" indexed="false" stored="true" multiValued="true" required="false"/>
<field name="searchtags" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="daily_capped" type="boolean" indexed="true" stored="true"/>
<field name="qq_multiplier" type="float" indexed="false" stored="true" required="false"/>
<field name="sillycat_apply" type="boolean" indexed="false" stored="true" required="false"/>
<!-- Contains lat+lons derived from cities -->
<field name="jlocation" type="location_rpt" indexed="true" stored="true" multiValued="true"/>
<field name="excluded_company" type="boolean" indexed="true" stored="true"/>
<field name="mobile_friendly" type="boolean" indexed="true" stored="true"/>
<field name="quality_sensitive" type="boolean" indexed="true" stored="true"/>
<field name="major_category" type="payloads" indexed="true" stored="true" multiValued="true"/>
<field name="minor_category" type="payloads" indexed="true" stored="true" multiValued="true"/>
<field name="cpc_high" type="boolean" indexed="true" stored="true"/>
<field name="affiliate_group" type="int" indexed="true" stored="true" required="false"/>
<field name="default_q" type="boolean" indexed="true" stored="true" required="false"/>
<field name="company_id_two_buckets" type="long" indexed="true" stored="true" required="false"/>
<!-- catchall field, containing all other searchable text fields (implemented
via copyField further on in this schema -->
<field name="text" type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>
<dynamicField name="*_facet" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Field to use to determine and enforce document uniqueness.
Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>
<!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
parsing a query string that isn't explicit about the field. Machine (non-user)
generated queries are best made explicit, or they can use the "df" request parameter
which takes precedence over this.
Note: Un-commenting defaultSearchField will be insufficient if your request handler
in solrconfig.xml defines "df", which takes precedence. That would need to be removed.
<defaultSearchField>text</defaultSearchField> -->
<!-- DEPRECATED: The defaultOperator (AND|OR) is consulted by various query parsers
when parsing a query string to determine if a clause of the query should be marked as
required or optional, assuming the clause isn't already marked by some operator.
The default is OR, which is generally assumed so it is not a good idea to change it
globally here. The "q.op" request parameter takes precedence over this.
<solrQueryParser defaultOperator="OR"/> -->
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<!-- <defaultSearchField>text</defaultSearchField> -->
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<!-- <solrQueryParser defaultOperator="OR"/> -->
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<!-- Above, multiple source fields are copied to the [text] field.
Another way to map multiple source fields to the same
destination field is to use the dynamic field syntax.
copyField also supports a maxChars to copy setting. -->
<!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->
<!-- copy name to alphaNameSort, a field designed for sorting by name -->
<!-- <copyField source="name" dest="alphaNameSort"/> -->
<copyField source="title" dest="text"/>
<copyField source="company" dest="text"/>
<copyField source="searchtags" dest="text"/>
<!-- Above, multiple source fields are copied to the [text] field.
Another way to map multiple source fields to the same
destination field is to use the dynamic field syntax.
copyField also supports a maxChars to copy setting. -->
<!-- field type definitions. The "name" attribute is
just a label to be used by field definitions. The "class"
attribute and any other attributes determine the real
behavior of the fieldType.
Class names starting with "solr" refer to java classes in a
standard package such as org.apache.solr.analysis
-->
<!-- The StrField type is not analyzed, but indexed/stored verbatim.
It supports doc values but in that case the field needs to be
single-valued and either required or have a default value.
-->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
<!-- sortMissingLast and sortMissingFirst attributes are optional attributes are
currently supported on types that are sorted internally as strings
and on numeric types.
This includes "string","boolean", and, as of 3.5 (and 4.x),
int, float, long, date, double, including the "Trie" variants.
- If sortMissingLast="true", then a sort on this field will cause documents
without the field to come after documents with the field,
regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents
without the field to come before documents with the field,
regardless of the requested sort order.
- If sortMissingLast="false" and sortMissingFirst="false" (the default),
then default lucene sorting will be used which places docs without the
field first in an ascending sort and last in a descending sort.
-->
<!--
Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types.
These fields support doc values, but they require the field to be
single-valued and either be required or have a default value.
-->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime
The trailing "Z" designates UTC time and is mandatory.
Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory.
Expressions can also be used to denote calculations that should be
performed relative to "NOW" to determine the value, ie...
NOW/HOUR
... Round to the start of the current hour
NOW-1DAY
... Exactly 1 day prior to now
NOW/DAY+6MONTHS+3DAYS
... 6 months and 3 days in the future from the start of
the current day
Consult the DateField javadocs for more information.
Note: For faster range queries, consider the tdate type
-->
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<!-- solr.TextField allows the specification of custom text analyzers
specified as a tokenizer and a list of token filters. Different
analyzers may be specified for indexing and querying.
The optional positionIncrementGap puts space between multiple fields of
this type on the same document, with the purpose of preventing false phrase
matching across fields.
For more info on customizing your analyzer chain, please see
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
-->
<!-- A general text field that has reasonable, generic
cross-language defaults: it tokenizes with StandardTokenizer,
removes stop words from case-insensitive "stopwords.txt"
(empty by default), and down cases. At query time only, it
also applies synonyms. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- A text field with defaults appropriate for English, plus
aggressive word-splitting and autophrase features enabled.
Adds WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and
non-alphanumeric chars. This means certain compound word
cases will work, for example query "wi fi" will match
document "WiFi" or "wi-fi".
-->
<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="title_text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
<!-- An alternative geospatial field type new to Solr 4. It supports multiValued and polygon shapes.
For more information about this and other Spatial fields new to Solr 4, see:
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
-->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.000009" distanceUnits="degrees"/>
<!-- lowercases the entire field value, keeping it as a single token. -->
<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<similarity class="solr.SchemaSimilarityFactory"/>
<fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField" >
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
</analyzer>
</fieldtype>
</schema>
References:
https://lucene.apache.org/solr/guide/6_6/using-solrj.html#using-solrj
https://cwiki.apache.org/confluence/display/solr/Solr+JDBC+-+SQuirreL+SQL
http://apache.mirrors.pair.com/lucene/solr/7.0.0/
Solr Cloud
http://sillycat.iteye.com/blog/2394077
SOLR 1 ~ 11
http://sillycat.iteye.com/blog/1526727
http://sillycat.iteye.com/blog/1530915
http://sillycat.iteye.com/blog/1532870
http://sillycat.iteye.com/blog/1532874
http://sillycat.iteye.com/blog/1536361
http://sillycat.iteye.com/blog/2227066
http://sillycat.iteye.com/blog/2227398
http://sillycat.iteye.com/blog/2233155
http://sillycat.iteye.com/blog/2233708
http://sillycat.iteye.com/blog/2233709
http://sillycat.iteye.com/blog/2242558
Sample schema.xml for references:
<?xml version="1.0" encoding="UTF-8" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!--
This is the Solr schema file. This file should be named "schema.xml" and
should be in the conf directory under the solr home
(i.e. ./solr/conf/schema.xml by default)
or located where the classloader for the Solr webapp can find it.
This example schema is the recommended starting point for users.
It should be kept correct and concise, usable out-of-the-box.
For more information, on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
PERFORMANCE NOTE: this schema includes many optional features and should not
be used for benchmarking. To improve performance one could
- set stored="false" for all fields possible (esp large fields) when you
only need to search on the field but don't need to return the original
value.
- set indexed="false" if you don't need to search on the field, but only
return the field as a result of searching on other indexed fields.
- remove all unneeded copyField statements
- for best index size and searching performance, set "index" to false
for all general text fields, use copyField to copy them to the
catchall "text" field, and use that for searching.
- For maximum indexing performance, use the ConcurrentUpdateSolrServer
java client.
- Remember to run the JVM in server mode, and use a higher logging level
that avoids logging every request
-->
<schema name="sillycat" version="1.5">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
version="x.y" is Solr's version number for the schema syntax and
semantics. It should not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued
by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default
except for text fields.
1.3: removed optional field compress feature
1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser
behavior when a single string produces multiple tokens. Defaults
to off for version >= 1.4
1.5: omitNorms defaults to true for primitive field types
(int, float, boolean, string...)
-->
<!-- Valid attributes for fields:
name: mandatory - the name for the field
type: mandatory - the name of a field type from the
<types> fieldType section
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
docValues: true if this field should have doc values. Doc values are
useful for faceting, grouping, sorting and function queries. Although not
required, doc values will make the index faster to load, more
NRT-friendly and more memory-efficient. They however come with some
limitations: they are currently only supported by StrField, UUIDField
and all Trie*Fields, and depending on the field type, they might
require the field to be single-valued, be required or have a default
value (check the documentation of the field type you're interested in
for more information)
multiValued: true if this field may contain multiple values per document
omitNorms: (expert) set to true to omit the norms associated with
this field (this disables length normalization and index-time
boosting for the field, and saves some memory). Only full-text
fields or fields that need an index-time boost need norms.
Norms are omitted for primitive (non-analyzed) types by default.
termVectors: [false] set to true to store the term vector for a
given field.
When using MoreLikeThis, fields used for similarity should be
stored for best performance.
termPositions: Store position information with the term vector.
This will increase storage costs.
termOffsets: Store offset information with the term vector. This
will increase storage costs.
required: The field is required. It will throw an error if the
value does not exist
default: a value that should be used if no value is specified
when adding a document.
-->
<!-- field names should consist of alphanumeric or underscore characters only and
not start with a digit. This is not currently strictly enforced,
but other field names will not have first class support from all components
and back compatibility is not guaranteed. Names with both leading and
trailing underscores (e.g. _version_) are reserved.
-->
<!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
or Solr won't start. _version_ and update log are required for SolrCloud
-->
<field name="_version_" type="long" indexed="true" stored="true"/>
<!-- points to the root document of a block of nested documents. Required for nested
document support, may be removed otherwise
-->
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="customer_id" type="int" indexed="true" stored="true" required="true"/>
<field name="pool_id" type="int" indexed="true" stored="true" required="true"/>
<field name="source_id" type="int" indexed="true" stored="true" required="true"/>
<field name="campaign_id" type="int" indexed="true" stored="true" required="true"/>
<field name="segment_id" type="int" indexed="true" stored="true" required="false"/>
<field name="job_reference" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="title_text_en_splitting" indexed="true" stored="true" termVectors="true" storeOffsetsWithPositions="true"/>
<field name="description" type="text_en_splitting" indexed="false" stored="false"/>
<field name="description_txt" type="text_en_splitting" indexed="true" stored="true" storeOffsetsWithPositions="true"/>
<field name="url" type="string" indexed="false" stored="true"/>
<field name="company_id" type="int" indexed="true" stored="true"/>
<field name="company" type="text_general" indexed="true" stored="true"/>
<field name="cities" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="city_id" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="state_id" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="zipcode" type="int" indexed="true" stored="true" multiValued="true"/>
<field name="cpc" type="int" indexed="true" stored="true"/>
<field name="reg_cpc" type="int" indexed="true" stored="true"/>
<field name="posted" type="date" indexed="true" stored="true"/>
<field name="created" type="date" indexed="true" stored="true"/>
<field name="experience" type="int" indexed="true" stored="true"/>
<field name="salary" type="int" indexed="true" stored="true"/>
<field name="education" type="int" indexed="true" stored="true"/>
<field name="jobtype" type="int" indexed="true" stored="true"/>
<field name="industry" type="int" indexed="true" stored="true" docValues="true"/>
<field name="industries" type="int" indexed="true" stored="true" multiValued="true" docValues="true"/>
<field name="quality_score" type="float" indexed="true" stored="true"/>
<field name="boost_factor" type="float" indexed="true" stored="true"/>
<!-- top spot -->
<field name="is_ad" type="int" indexed="true" stored="true" required="false"/>
<field name="top_spot_industries" type="int" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="top_spot_type" type="int" indexed="true" stored="true" multiValued="false" required="false"/>
<field name="top_spot_preview" type="boolean" indexed="false" stored="true" multiValued="false" required="false"/>
<!-- end of top spot -->
<field name="paused" type="boolean" indexed="true" stored="true"/>
<field name="budget" type="int" indexed="false" stored="true"/>
<field name="email" type="string" indexed="false" stored="true"/>
<field name="phone" type="string" indexed="true" stored="true"/>
<field name="tags" type="string" indexed="false" stored="true" multiValued="true" required="false"/>
<field name="searchtags" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="daily_capped" type="boolean" indexed="true" stored="true"/>
<field name="qq_multiplier" type="float" indexed="false" stored="true" required="false"/>
<field name="sillycat_apply" type="boolean" indexed="false" stored="true" required="false"/>
<!-- Contains lat+lons derived from cities -->
<field name="jlocation" type="location_rpt" indexed="true" stored="true" multiValued="true"/>
<field name="excluded_company" type="boolean" indexed="true" stored="true"/>
<field name="mobile_friendly" type="boolean" indexed="true" stored="true"/>
<field name="quality_sensitive" type="boolean" indexed="true" stored="true"/>
<field name="major_category" type="payloads" indexed="true" stored="true" multiValued="true"/>
<field name="minor_category" type="payloads" indexed="true" stored="true" multiValued="true"/>
<field name="cpc_high" type="boolean" indexed="true" stored="true"/>
<field name="affiliate_group" type="int" indexed="true" stored="true" required="false"/>
<field name="default_q" type="boolean" indexed="true" stored="true" required="false"/>
<field name="company_id_two_buckets" type="long" indexed="true" stored="true" required="false"/>
<!-- catchall field, containing all other searchable text fields (implemented
via copyField further on in this schema) -->
<field name="text" type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>
<dynamicField name="*_facet" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Field to use to determine and enforce document uniqueness.
Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>
<!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
parsing a query string that isn't explicit about the field. Machine (non-user)
generated queries are best made explicit, or they can use the "df" request parameter
which takes precedence over this.
Note: Un-commenting defaultSearchField will be insufficient if your request handler
in solrconfig.xml defines "df", which takes precedence. That would need to be removed.
<defaultSearchField>text</defaultSearchField> -->
<!-- DEPRECATED: The defaultOperator (AND|OR) is consulted by various query parsers
when parsing a query string to determine if a clause of the query should be marked as
required or optional, assuming the clause isn't already marked by some operator.
The default is OR, which is generally assumed so it is not a good idea to change it
globally here. The "q.op" request parameter takes precedence over this.
<solrQueryParser defaultOperator="OR"/> -->
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<!-- <defaultSearchField>text</defaultSearchField> -->
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<!-- <solrQueryParser defaultOperator="OR"/> -->
<!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->
<!-- copy name to alphaNameSort, a field designed for sorting by name -->
<!-- <copyField source="name" dest="alphaNameSort"/> -->
<copyField source="title" dest="text"/>
<copyField source="company" dest="text"/>
<copyField source="searchtags" dest="text"/>
<!-- Above, multiple source fields are copied to the [text] field.
Another way to map multiple source fields to the same
destination field is to use the dynamic field syntax.
copyField also supports a maxChars to copy setting. -->
<!-- field type definitions. The "name" attribute is
just a label to be used by field definitions. The "class"
attribute and any other attributes determine the real
behavior of the fieldType.
Class names starting with "solr" refer to java classes in a
standard package such as org.apache.solr.analysis
-->
<!-- The StrField type is not analyzed, but indexed/stored verbatim.
It supports doc values but in that case the field needs to be
single-valued and either required or have a default value.
-->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
<!-- The sortMissingLast and sortMissingFirst attributes are optional and are
currently supported on types that are sorted internally as strings
and on numeric types.
This includes "string","boolean", and, as of 3.5 (and 4.x),
int, float, long, date, double, including the "Trie" variants.
- If sortMissingLast="true", then a sort on this field will cause documents
without the field to come after documents with the field,
regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents
without the field to come before documents with the field,
regardless of the requested sort order.
- If sortMissingLast="false" and sortMissingFirst="false" (the default),
then default lucene sorting will be used which places docs without the
field first in an ascending sort and last in a descending sort.
-->
<!--
Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types.
These fields support doc values, but they require the field to be
single-valued and either be required or have a default value.
-->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime
The trailing "Z" designates UTC time and is mandatory.
Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory.
Expressions can also be used to denote calculations that should be
performed relative to "NOW" to determine the value, ie...
NOW/HOUR
... Round to the start of the current hour
NOW-1DAY
... Exactly 1 day prior to now
NOW/DAY+6MONTHS+3DAYS
... 6 months and 3 days in the future from the start of
the current day
Consult the DateField javadocs for more information.
Note: For faster range queries, consider the tdate type
-->
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<!-- solr.TextField allows the specification of custom text analyzers
specified as a tokenizer and a list of token filters. Different
analyzers may be specified for indexing and querying.
The optional positionIncrementGap puts space between multiple fields of
this type on the same document, with the purpose of preventing false phrase
matching across fields.
For more info on customizing your analyzer chain, please see
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
-->
<!-- A general text field that has reasonable, generic
cross-language defaults: it tokenizes with StandardTokenizer,
applies synonyms from "index_synonyms_case_sensitive.txt" and
"index_synonyms.txt" at both index and query time, removes stop
words from case-insensitive "stopwords.txt" (empty by default),
and down cases. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- A text field with defaults appropriate for English, plus
aggressive word-splitting and autophrase features enabled.
Adds WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and
non-alphanumeric chars. This means certain compound word
cases will work, for example query "wi fi" will match
document "WiFi" or "wi-fi".
-->
<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="title_text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]+)[,]([0-9]+)" replacement="$1$2"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([,]+)" replacement=" "/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([!?;:*#^\{\}\(\)\[\]_=|\/~&lt;&gt;&quot;&amp;]+)" replacement=" $1 "/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms_case_sensitive.txt" ignoreCase="false" expand="false"/>
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protected_words.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
<!-- An alternative geospatial field type new to Solr 4. It supports multiValued and polygon shapes.
For more information about this and other Spatial fields new to Solr 4, see:
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
-->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.000009" distanceUnits="degrees"/>
<!-- lowercases the entire field value, keeping it as a single token. -->
<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<similarity class="solr.SchemaSimilarityFactory"/>
<fieldType name="payloads" stored="false" indexed="true" class="solr.TextField">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
</analyzer>
</fieldType>
</schema>
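To get a feel for what the three charFilter entries in text_en_splitting (and title_text_en_splitting) do to the raw text before tokenizing, here is a rough Python sketch. It is only an approximation under some assumptions: Solr runs these patterns with java.util.regex, so the replacement `$1$2` is written as `\1\2` here, `str.split()` stands in for WhitespaceTokenizerFactory, and the later filters (synonyms, stop words, WordDelimiterFilter, lowercase, stemming) are not modeled. The function names are made up for this illustration.

```python
import re

# Python stand-ins for the three PatternReplaceCharFilterFactory entries.
# They are applied in the same order as they appear in schema.xml.
CHAR_FILTERS = [
    # join digit groups separated by a comma: "80,000" -> "80000"
    (re.compile(r"([0-9]+)[,]([0-9]+)"), r"\1\2"),
    # any remaining run of commas becomes a single space
    (re.compile(r"([,]+)"), " "),
    # pad runs of punctuation with spaces so they become separate tokens
    (re.compile(r'([!?;:*#^\{\}\(\)\[\]_=|\/~<>"&]+)'), r" \1 "),
]


def apply_char_filters(text):
    """Run the three replacements in sequence and return the rewritten text."""
    for pattern, replacement in CHAR_FILTERS:
        text = pattern.sub(replacement, text)
    return text


def whitespace_tokens(text):
    """Approximate charFilters + WhitespaceTokenizer (no later filters)."""
    return apply_char_filters(text).split()


if __name__ == "__main__":
    print(whitespace_tokens("Java/Scala developer, salary 80,000"))
    # prints ['Java', '/', 'Scala', 'developer', 'salary', '80000']
```

In the real analyzer chain the tokens then go through the synonym, stop word, word delimiter, lowercase and stemming filters, so the final indexed terms will differ from this intermediate list.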
References:
https://lucene.apache.org/solr/guide/6_6/using-solrj.html#using-solrj
https://cwiki.apache.org/confluence/display/solr/Solr+JDBC+-+SQuirreL+SQL
http://apache.mirrors.pair.com/lucene/solr/7.0.0/
Solr Cloud
http://sillycat.iteye.com/blog/2394077
SOLR 1 ~ 11
http://sillycat.iteye.com/blog/1526727
http://sillycat.iteye.com/blog/1530915
http://sillycat.iteye.com/blog/1532870
http://sillycat.iteye.com/blog/1532874
http://sillycat.iteye.com/blog/1536361
http://sillycat.iteye.com/blog/2227066
http://sillycat.iteye.com/blog/2227398
http://sillycat.iteye.com/blog/2233155
http://sillycat.iteye.com/blog/2233708
http://sillycat.iteye.com/blog/2233709
http://sillycat.iteye.com/blog/2242558