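I. Elasticsearch: Summary and Introduction
II. Installation and Running
- JRE 1.6 or later must be installed on the system
- Download the latest version of ES from http://www.elasticsearch.org/overview/elkdownloads/. ES runs on both Windows and Linux
- Unzip the archive and run elasticsearch.bat in the bin directory to start ES (on Linux, run bin/elasticsearch instead).
III. Running Elasticsearch in Pseudo-Distributed Mode
1. Make two copies of elasticsearch-1.4.0 (the whole directory) and rename them elasticsearch-1.4.0_2 and elasticsearch-1.4.0_3
2. Add two lines to elasticsearch-1.4.0/config/elasticsearch.yml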
cluster.name: "tom_cluster"
node.name: "tom_node_1"
3. Add four lines to elasticsearch-1.4.0_2/config/elasticsearch.yml
cluster.name: tom_cluster
node.name: "tom_node_2"
transport.tcp.port: 9302
http.port: 9202
4. Add four lines to elasticsearch-1.4.0_3/config/elasticsearch.yml
cluster.name: tom_cluster
node.name: "tom_node_3"
transport.tcp.port: 9303
http.port: 9203
5. Run bin/elasticsearch.bat in each of the three directories to start the cluster
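Once the three nodes are up, it can help to confirm that they actually joined one cluster. The sketch below is a minimal check, assuming the first node listens on the default HTTP port 9200 and using the `_cluster/health` endpoint, whose response includes `cluster_name` and `number_of_nodes`. The function names are hypothetical helpers, not part of Elasticsearch.

```python
import json
from urllib.request import urlopen

EXPECTED_NODES = 3  # the three directories started in step 5


def health_url(http_port=9200):
    # Any node can answer; 9200 is assumed to be the first node's HTTP port.
    return "http://localhost:%d/_cluster/health" % http_port


def cluster_formed(health, expected_nodes=EXPECTED_NODES, cluster_name="tom_cluster"):
    """Return True when a health payload reports the expected cluster and node count."""
    return (health.get("cluster_name") == cluster_name
            and health.get("number_of_nodes") == expected_nodes)


def fetch_health(http_port=9200):
    # Performs a real HTTP GET, so it needs a running node.
    with urlopen(health_url(http_port), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the cluster running: print(cluster_formed(fetch_health()))
```

If the check fails, the usual suspects are a mismatched cluster.name or a port clash between the copies.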
IV. Installing Elasticsearch Plugins
Elasticsearch provides an extensible plugin mechanism; http://www.searchtech.pro/elasticsearch-plugins provides a detailed list.
1. In the bin directory, run plugin.bat -install mobz/elasticsearch-head
2. Start elasticsearch and open http://localhost:9200/_plugin/head
V. Understanding Two Elasticsearch Terms: Shards and Replicas
Shards & Replicas
An index can potentially store a large amount of data that can exceed the hardware limits of a single node. For example, a single index of a billion documents taking up 1TB of disk space may not fit on the disk of a single node or may be too slow to serve search requests from a single node alone.
To solve this problem, Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. When you create an index, you can simply define the number of shards that you want. Each shard is in itself a fully-functional and independent "index" that can be hosted on any node in the cluster.
Sharding is important for two primary reasons:
- It allows you to horizontally split/scale your content volume
- It allows you to distribute and parallelize operations across shards (potentially on multiple nodes), thus increasing performance/throughput
The mechanics of how a shard is distributed and how its documents are aggregated back into search requests are completely managed by Elasticsearch and are transparent to you as the user.
In a network/cloud environment where failures can be expected anytime, it is very useful and highly recommended to have a failover mechanism in case a shard/node somehow goes offline or disappears for whatever reason. To this end, Elasticsearch allows you to make one or more copies of your index’s shards into what are called replica shards, or replicas for short.
Replication is important for two primary reasons:
- It provides high availability in case a shard/node fails. For this reason, it is important to note that a replica shard is never allocated on the same node as the original/primary shard that it was copied from.
- It allows you to scale out your search volume/throughput since searches can be executed on all replicas in parallel.
To summarize, each index can be split into multiple shards. An index can also be replicated zero (meaning no replicas) or more times. Once replicated, each index will have primary shards (the original shards that were replicated from) and replica shards (the copies of the primary shards). The number of shards and replicas can be defined per index at the time the index is created. After the index is created, you may change the number of replicas dynamically at any time, but you cannot change the number of shards after the fact.
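As a minimal sketch of what "defined per index at the time the index is created" looks like over REST, the helpers below build (but do not send) two requests: one that creates an index with explicit `number_of_shards` and `number_of_replicas` settings, and one that later changes only the replica count through the `_settings` endpoint. The index name `index2` and the helper names are hypothetical; the base URL assumes a local node on port 9200.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed local node


def _put(url, body):
    # Build a PUT request with a JSON body; nothing is sent here.
    return Request(url,
                   data=json.dumps(body).encode("utf-8"),
                   headers={"Content-Type": "application/json"},
                   method="PUT")


def create_index_request(index, shards=5, replicas=1):
    # The primary shard count is fixed once the index has been created.
    body = {"settings": {"number_of_shards": shards,
                         "number_of_replicas": replicas}}
    return _put("%s/%s" % (BASE_URL, index), body)


def update_replicas_request(index, replicas):
    # The replica count can be changed on an existing index.
    body = {"index": {"number_of_replicas": replicas}}
    return _put("%s/%s/_settings" % (BASE_URL, index), body)


req = create_index_request("index2", shards=3, replicas=2)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` would send it to a running node.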
By default, each index in Elasticsearch is allocated 5 primary shards and 1 replica which means that if you have at least two nodes in your cluster, your index will have 5 primary shards and another 5 replica shards (1 complete replica) for a total of 10 shards per index.
Does this mean that a node holds 5 primary shards by default, and that one node is one replica, or in other words that 1 replica = 5 shards?
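One way to read the quoted defaults: the replica count applies to each primary shard of an index, not to a node. With 5 primaries and 1 replica, every primary gets one copy, giving 5 replica shards and 10 shards in total. Because two copies of the same shard are never placed on the same node, a single-node cluster cannot assign any of those replicas. The sketch below only restates that arithmetic; the function names are hypothetical.

```python
def shard_counts(primaries=5, replicas=1):
    """Return (primary shards, replica shards, total shards) for one index."""
    replica_shards = primaries * replicas  # each primary gets `replicas` copies
    return primaries, replica_shards, primaries + replica_shards


def assignable_replicas_per_primary(replicas, nodes):
    # Copies of the same shard never share a node, so with `nodes` nodes
    # at most nodes - 1 replica copies of each primary can be placed.
    return max(0, min(replicas, nodes - 1))


print(shard_counts())                         # defaults quoted above
print(assignable_replicas_per_primary(1, 1))  # one node: replicas stay unassigned
print(assignable_replicas_per_primary(1, 3))  # three nodes: the replica is placed
```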
After starting elasticsearch, add a document through a REST client:
- url: http://localhost:9200/index1/col1/1
- method: POST
- body
{"a":1, "b":2}
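The same request can be scripted instead of entered in a REST client. This is a minimal sketch using only the Python standard library; `index_document_request` and `send` are hypothetical helper names, and the Content-Type header is added as a precaution rather than something the original request specified.

```python
import json
from urllib.request import Request, urlopen


def index_document_request(doc_id, document,
                           base_url="http://localhost:9200",
                           index="index1", doc_type="col1"):
    # Mirrors the url/method/body listed above; builds the request without sending it.
    url = "%s/%s/%s/%s" % (base_url, index, doc_type, doc_id)
    return Request(url,
                   data=json.dumps(document).encode("utf-8"),
                   headers={"Content-Type": "application/json"},
                   method="POST")


def send(request):
    # Performs the real HTTP call, so it needs a running node.
    with urlopen(request, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = index_document_request(1, {"a": 1, "b": 2})
print(req.get_method(), req.full_url)
# With elasticsearch running: print(send(req))
```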
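When the request completes, open http://localhost:9200/_plugin/head to check the current node status. The node shows green shards numbered 0-4, and only one of them holds data (the document that was just indexed).
Run the operation above 10 times in total, that is, also request http://localhost:9200/index1/col1/2 through http://localhost:9200/index1/col1/10. When these complete, open http://localhost:9200/_plugin/head again. All five shards now hold data: three shards have 2 documents each, one shard has 1 document, and one shard has 3 documents, for 10 documents in total. So within a node, data is stored and updated at the granularity of a shard; in other words, the documents of one index are spread across different shards.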
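The uneven spread follows from how a document is routed: Elasticsearch picks the shard as a hash of the routing value (by default the document ID) modulo the number of primary shards. The toy sketch below uses `zlib.crc32` purely as a stand-in hash, which is an assumption for illustration and not the hash Elasticsearch uses, so its per-shard counts will not match the 2/2/2/1/3 split observed above.

```python
import zlib
from collections import Counter


def pick_shard(routing, num_primary_shards=5):
    # shard = hash(routing) % number_of_primary_shards
    # crc32 is a stand-in; Elasticsearch uses a different hash function.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def distribution(doc_ids, num_primary_shards=5):
    """Count how many of the given document IDs land on each shard."""
    return Counter(pick_shard(str(doc_id), num_primary_shards) for doc_id in doc_ids)


counts = distribution(range(1, 11))  # document IDs 1..10, as in the experiment
for shard in range(5):
    print("shard %d: %d docs" % (shard, counts.get(shard, 0)))
```

The modulo also suggests why the number of primary shards is fixed after index creation: changing it would send existing IDs to different shards.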
For more on the concepts of shards and replicas, see: http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/