In a non-relational database system, joins are not available. Fortunately, Elasticsearch provides several ways to model relations:
Array Type
Read the doc on elasticsearch.org
As its name suggests, it can be an array of native types (string, int, …) but also an array of objects (the basis used for “objects” and “nested”).
Here is a valid indexing example:
{
  "Article": [
    {
      "id": 12,
      "title": "An article title",
      "categories": [1, 3, 5, 7],
      "tag": ["elasticsearch", "symfony", "Obtao"],
      "author": [
        { "firstname": "Francois", "surname": "francoisg", "id": 18 },
        { "firstname": "Gregory", "surname": "gregquat", "id": 2 }
      ]
    },
    {
      "id": 13,
      "title": "A second article title",
      "categories": [1, 7],
      "tag": ["elasticsearch", "symfony", "Obtao"],
      "author": [
        { "firstname": "Gregory", "surname": "gregquat", "id": 2 }
      ]
    }
  ]
}
This example contains several kinds of arrays:
- categories: array of integers
- tag: array of strings
- author: array of objects (inner objects or nested)
We explicitly mention this “simple” type because it can be easier and more maintainable to store a flattened value than the complete object.
Using a non-relational structure should make you think about a specific model for your search engine:
- To filter: if you just want to filter, search or aggregate on the textual value of an object, flatten that value into the parent object.
- To get the list of objects linked to a parent (when you do not need to filter on or index these objects), just store the list of ids and hydrate them with Doctrine and Symfony (in French for the moment).
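The two modeling rules above can be sketched in Python. This is only an illustration under stated assumptions, not the author's code: the `build_search_document` helper, the `author_names` and `category_ids` field names and the category names are hypothetical.

```python
def build_search_document(article):
    """Build a flat search document from a richer article structure."""
    return {
        "id": article["id"],
        "title": article["title"],
        # Rule 1: keep only the textual value we want to filter or aggregate on.
        "author_names": [author["surname"] for author in article["authors"]],
        # Rule 2: keep only the ids; the full objects are loaded from the
        # database (for example through Doctrine) after the search.
        "category_ids": [category["id"] for category in article["categories"]],
    }


# Hypothetical application-side article, richer than what the index needs.
article = {
    "id": 12,
    "title": "An article title",
    "authors": [
        {"id": 18, "firstname": "Francois", "surname": "francoisg"},
        {"id": 2, "firstname": "Gregory", "surname": "gregquat"},
    ],
    "categories": [{"id": 1, "name": "Search"}, {"id": 3, "name": "PHP"}],
}

document = build_search_document(article)
print(document)
```

The resulting document is cheap to index and to update, at the price of one extra database lookup when the full objects are needed.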
Inner objects
Inner objects are simply JSON objects embedded in a parent document, such as the “author” entries in the example above. The mapping for this example could be:
fos_elastica:
    clients:
        default: { host: "%elastic_host%", port: "%elastic_port%" }
    indexes:
        blog:
            types:
                article:
                    mappings:
                        title: ~
                        categories: ~
                        tag: ~
                        author:
                            type: object
                            properties:
                                firstname: ~
                                surname: ~
                                id:
                                    type: integer
You can filter or query on these inner objects. For example:
query: author.firstname=Francois
returns the article with id 12 (and not the one with id 13).
You can read more on the Elasticsearch website
Inner objects are easy to configure. As Elasticsearch documents are “schema-less”, you can index them without specifying any mapping.
The limitation of this method lies in the way Elasticsearch stores your data. Reusing the example above, here is the internal representation of our objects:
[
  {
    "id": 12,
    "title": "An article title",
    "categories": [1, 3, 5, 7],
    "tag": ["elasticsearch", "symfony", "Obtao"],
    "author.firstname": ["Francois", "Gregory"],
    "author.surname": ["francoisg", "gregquat"],
    "author.id": [18, 2]
  },
  {
    "id": 13,
    "title": "A second article title",
    "categories": [1, 7],
    "tag": ["elasticsearch", "symfony", "Obtao"],
    "author.firstname": ["Gregory"],
    "author.surname": ["gregquat"],
    "author.id": [2]
  }
]
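The flattening described above can be reproduced with a few lines of Python. This is a simplified sketch, assuming a hypothetical `flatten_inner_objects` helper: the real index also analyzes the values (for instance lowercasing strings), which is ignored here.

```python
def flatten_inner_objects(document, field):
    """Replace a list of inner objects with one multi-valued field per property."""
    flat = {key: value for key, value in document.items() if key != field}
    for inner in document.get(field, []):
        for key, value in inner.items():
            # Every property becomes "<field>.<property>" and collects all values,
            # so the link between values of the same inner object is lost.
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


article = {
    "id": 12,
    "title": "An article title",
    "author": [
        {"firstname": "Francois", "surname": "francoisg", "id": 18},
        {"firstname": "Gregory", "surname": "gregquat", "id": 2},
    ],
}

flat = flatten_inner_objects(article, "author")
print(flat["author.firstname"])  # ['Francois', 'Gregory']
print(flat["author.surname"])    # ['francoisg', 'gregquat']
```

Once flattened, nothing records that “Francois” goes with “francoisg” rather than with “gregquat”.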
The consequence is that the query:
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "bool": {
          "must": [
            { "term": { "author.firstname": "francois" } },
            { "term": { "author.surname": "gregquat" } }
          ]
        }
      }
    }
  }
}
that is, author.firstname=francois AND author.surname=gregquat, will return the document 12. With inner objects, this query translates to “Which documents have at least one author.surname equal to gregquat and at least one author.firstname equal to francois?”, and the two conditions can be satisfied by two different authors.
To fix this problem, you must use the nested type.
Nested
First important difference: nested must be declared in your mapping.
The mapping looks like the one for an object, only the type changes:
fos_elastica:
    clients:
        default: { host: "%elastic_host%", port: "%elastic_port%" }
    indexes:
        blog:
            types:
                article:
                    mappings:
                        title: ~
                        categories: ~
                        tag: ~
                        author:
                            type: nested
                            properties:
                                firstname: ~
                                surname: ~
                                id:
                                    type: integer
This time, the internal representation will be:
[
  {
    "id": 12,
    "title": "An article title",
    "categories": [1, 3, 5, 7],
    "tag": ["elasticsearch", "symfony", "Obtao"],
    "author": [
      { "firstname": "Francois", "surname": "francoisg", "id": 18 },
      { "firstname": "Gregory", "surname": "gregquat", "id": 2 }
    ]
  },
  {
    "id": 13,
    "title": "A second article title",
    "categories": [1, 7],
    "tag": ["elasticsearch", "symfony", "Obtao"],
    "author": [
      { "firstname": "Gregory", "surname": "gregquat", "id": 2 }
    ]
  }
]
This time, we keep the object structure.
Nested objects have their own filters, which allow you to filter on each nested object individually. Coming back to the example that exposed the limitation of inner objects, we can write this query:
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "nested": {
          "path": "author",
          "filter": {
            "bool": {
              "must": [
                { "term": { "author.firstname": "francois" } },
                { "term": { "author.surname": "gregquat" } }
              ]
            }
          }
        }
      }
    }
  }
}
We can translate it as “Which documents have a single author object whose surname is ‘gregquat’ and whose firstname is ‘francois’?”. This query returns no results.
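The difference between the two matching semantics can be simulated in plain Python. This is a sketch of the logic only, not of how Elasticsearch evaluates filters internally; the two `matches_*` functions are hypothetical, and the values are stored in lowercase to mimic analyzed terms.

```python
def matches_as_inner_object(document, conditions):
    """Inner-object semantics: each condition may be met by a different author."""
    return all(
        any(author.get(key) == value for author in document["author"])
        for key, value in conditions.items()
    )


def matches_as_nested(document, conditions):
    """Nested semantics: one single author must meet every condition."""
    return any(
        all(author.get(key) == value for key, value in conditions.items())
        for author in document["author"]
    )


articles = [
    {"id": 12, "author": [
        {"firstname": "francois", "surname": "francoisg"},
        {"firstname": "gregory", "surname": "gregquat"},
    ]},
    {"id": 13, "author": [
        {"firstname": "gregory", "surname": "gregquat"},
    ]},
]

conditions = {"firstname": "francois", "surname": "gregquat"}

inner_hits = [a["id"] for a in articles if matches_as_inner_object(a, conditions)]
nested_hits = [a["id"] for a in articles if matches_as_nested(a, conditions)]

print(inner_hits)   # [12]
print(nested_hits)  # []
```

The only change between the two functions is the order of `any` and `all`, which is exactly what the nested type buys you.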
One problem remains, and it is costly when working with big objects: when you want to change a single value of a nested object, you have to reindex the whole parent document (including all its nested objects).
If the objects are heavy and often updated, the impact on performance can be significant.
To fix this problem, you can use parent/child associations.
Parent/Child
Parent/child associations are very similar to OneToMany relationships (one parent, several children).
The relationship remains hierarchical: a child type is associated with only one parent type, and it is impossible to create a ManyToMany relationship.
We are going to link our article to a category:
fos_elastica:
    clients:
        default: { host: "%elastic_host%", port: "%elastic_port%" }
    indexes:
        blog:
            types:
                category:
                    mappings:
                        id: ~
                        name: ~
                        description: ~
                article:
                    mappings:
                        title: ~
                        tag: ~
                        author: ~
                    _routing:
                        required: true
                        path: category
                    _parent:
                        type: "category"
                        identifier: "id"      # optional, as id is the default value
                        property: "category"  # optional, as the default value is the type value
When indexing an article, a reference to its category (category.id) is indexed as well.
Categories and articles can therefore be indexed separately while keeping the references between them.
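The `_routing` entry in the mapping exists because a parent and its children have to live on the same shard. The sketch below illustrates the idea only: the shard count is an assumption, the `shard_for` helper is hypothetical, and CRC32 is a stand-in for the hash function Elasticsearch really uses.

```python
import zlib

NUMBER_OF_SHARDS = 5  # assumption for the illustration


def shard_for(routing_value):
    """Pick a shard from a routing value (CRC32 stands in for the real hash)."""
    return zlib.crc32(str(routing_value).encode("utf-8")) % NUMBER_OF_SHARDS


category = {"id": 3, "name": "Search"}  # hypothetical category
articles = [
    {"id": 12, "title": "An article title", "category": 3},
    {"id": 13, "title": "A second article title", "category": 3},
]

# A category is routed by its own id; an article is routed by its parent's id
# (the "category" property), not by its own id.
category_shard = shard_for(category["id"])
article_shards = [shard_for(article["category"]) for article in articles]

print(category_shard, article_shards)
```

Because every article is routed with the id of its category, all the children end up on the shard of their parent, whatever their own ids are.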
As with nested objects, there are filters and queries that let you search on parents or children:
- Has Parent Filter / Has Parent Query: filters or queries on parent fields and returns the child objects. In our case, we could filter the articles whose parent category contains “symfony” in its description.
- Has Child Filter / Has Child Query: filters or queries on child fields and returns the parent objects. In our case, we could filter the categories for which “francoisg” has written an article.
{
  "query": {
    "has_child": {
      "type": "article",
      "query": {
        "filtered": {
          "query": { "match_all": {} },
          "filter": {
            "term": { "tag": "symfony" }
          }
        }
      }
    }
  }
}
This query will return the categories that have at least one article tagged with “symfony”.
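The semantics of the two families of queries can be simulated in Python. This is a sketch of the logic, not of the Elasticsearch implementation; the `has_child` and `has_parent` functions below are hypothetical helpers and the data is made up for the illustration.

```python
categories = [
    {"id": 1, "name": "Search"},
    {"id": 7, "name": "Misc"},
]
articles = [
    {"id": 12, "category": 1, "tag": ["elasticsearch", "symfony"]},
    {"id": 13, "category": 7, "tag": ["elasticsearch"]},
]


def has_child(parents, children, predicate):
    """Return the parents that have at least one child matching the predicate."""
    parent_ids = {child["category"] for child in children if predicate(child)}
    return [parent for parent in parents if parent["id"] in parent_ids]


def has_parent(parents, children, predicate):
    """Return the children whose parent matches the predicate."""
    parent_ids = {parent["id"] for parent in parents if predicate(parent)}
    return [child for child in children if child["category"] in parent_ids]


symfony_categories = has_child(categories, articles, lambda a: "symfony" in a["tag"])
misc_articles = has_parent(categories, articles, lambda c: c["name"] == "Misc")

print([c["id"] for c in symfony_categories])  # [1]
print([a["id"] for a in misc_articles])       # [13]
```

In both cases the condition is evaluated on one side of the relation and the documents of the other side are returned.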
The queries are written here in JSON, but they can easily be translated into PHP with the Elastica library.
These articles are also worth reading:
- http://euphonious-intuition.com/2013/02/managing-relations-in-elasticsearch/
- http://www.spacevatican.org/2012/6/3/fun-with-elasticsearch-s-children-and-nested-documents/
http://obtao.com/blog/2014/04/elasticsearch-advanced-search-and-nested-objects/