BlackWing

Solr Master-Slave Index Replication

    Blog category:
  • solr
Excerpted from the official site:
How does the slave replicate?

The master is totally unaware of the slaves. The slave keeps polling the master (at the interval set by the 'pollInterval' parameter) to check the current index version of the master. If the slave finds that the master has a newer version of the index, it initiates a replication process. The steps are as follows:
  • The slave issues a 'filelist' command to get the list of files. This command returns the names of the files along with some metadata (size, lastmodified, and alias, if any).
  • The slave checks whether it already has any of those files in its local index, then downloads the missing files (the command name is 'filecontent'). The download uses a custom format (akin to HTTP chunked encoding) to fetch the full content or a part of each file. If the connection breaks partway through, the download resumes from the point where it failed. At any point, it tries 5 times before giving up on the replication altogether.
  • The files are downloaded into a temp dir, so a crash of the slave or the master during the download does not corrupt anything. It just aborts the current replication.
  • After the download completes, all the new files are moved to the slave's live index directory, and the files' timestamps will match the timestamps on the master.
  • The slave's ReplicationHandler issues a 'commit' command on the slave, and the new index is loaded.
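
The file comparison and the resumable download described in the steps above can be sketched roughly as follows. This is a minimal simulation, not Solr's actual code: `FileMeta`, `files_to_download`, `download_with_resume` and the `fetch_chunk` callable are hypothetical names introduced for illustration, and the reading of "tries 5 times" as five attempts at the same offset is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Metadata the 'filelist' step is described as returning (names are illustrative)."""
    name: str
    size: int
    lastmodified: int


def files_to_download(master_files, local_files):
    """Return the master files whose names are missing from the local index."""
    local_names = {f.name for f in local_files}
    return [f for f in master_files if f.name not in local_names]


def download_with_resume(fetch_chunk, total_size, max_attempts=5):
    """Collect total_size bytes, resuming from the current offset after a failure.

    fetch_chunk(offset) is a hypothetical callable that returns some bytes
    starting at offset, or raises ConnectionError when the connection breaks.
    Assumption: the limit of 5 applies to consecutive attempts at one offset.
    """
    data = bytearray()
    attempts = 0
    while len(data) < total_size:
        attempts += 1
        try:
            chunk = fetch_chunk(len(data))
        except ConnectionError:
            chunk = b""
        if chunk:
            data.extend(chunk)
            attempts = 0  # progress was made, so start counting afresh
        elif attempts >= max_attempts:
            raise ConnectionError(
                f"gave up at offset {len(data)} after {max_attempts} attempts"
            )
    return bytes(data)
```

In a real slave the downloaded bytes would go to a temp dir first and be moved into the live index directory only after every file has arrived, which is what keeps a crash from corrupting the index.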

How are configuration files replicated?

  • The files to be replicated must be listed explicitly using the 'confFiles' parameter.
  • Only files in the 'conf' dir of the solr instance are replicated.
  • The files are replicated only along with a fresh index. That means even if a file is changed on the master, it is replicated only after a new commit/optimize on the master.
  • Unlike the index files, where the timestamp is good enough to tell whether they are identical, conf files are compared by checksum. The schema.xml files (on master and slave) are the same if their checksums match.
  • Conf files are also downloaded to a temp dir before they are moved over the original files. The old files are renamed and kept in the same directory. ReplicationHandler does not automatically clean up these old files.
  • If a replication involved downloading at least one conf file, a core reload is issued instead of a 'commit' command.
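
The checksum comparison and the choice between a 'commit' and a core reload can be illustrated with a short sketch. The function names and the dictionary inputs are hypothetical, and `zlib.adler32` is only a stand-in for whatever checksum the ReplicationHandler actually computes.

```python
import zlib


def checksum(content: bytes) -> int:
    """Stand-in checksum; the algorithm is an assumption for this sketch."""
    return zlib.adler32(content)


def changed_conf_files(conf_files, master_checksums, local_conf):
    """Return the listed conf files whose local checksum differs from the master's.

    conf_files:       names listed in 'confFiles'
    master_checksums: name -> checksum reported by the master
    local_conf:       name -> file content (bytes) on the slave
    """
    changed = []
    for name in conf_files:
        local = local_conf.get(name)
        if local is None or checksum(local) != master_checksums[name]:
            changed.append(name)
    return changed


def post_replication_action(downloaded_conf_files):
    """A core reload replaces the 'commit' if any conf file was downloaded."""
    return "reload_core" if downloaded_conf_files else "commit"
```

For example, if only schema.xml differs between master and slave, `changed_conf_files` returns `["schema.xml"]` and `post_replication_action` returns `"reload_core"`.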

What if I add documents to the slave, or if the slave index gets corrupted?

If docs are added to the slave, the slave is no longer in sync with the master. However, it does nothing to get back in sync until the master has a newer index. When a commit happens on the master, the master's index version becomes different from that of the slave. The slave fetches the list of files and finds that some of the files (with the same name) exist in the local index with a different size/timestamp. This means that the master and slave have incompatible indexes. The slave then copies all the files from the master (there may be scope to optimize this, but this is a rare case and may not be worth it) to a new index dir and asks the core to load the fresh index from the new directory.
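
The decision described above (an incremental download versus a full copy into a new index dir) can be sketched as follows. `plan_replication` and its dictionary format are hypothetical and only mirror the rule stated in the text: a file with the same name but a different size/timestamp marks the indexes as incompatible.

```python
def plan_replication(master_files, local_files):
    """Decide how the slave should fetch the master's index.

    Both arguments map a file name to a (size, lastmodified) tuple.
    """
    incompatible = any(
        name in local_files and local_files[name] != meta
        for name, meta in master_files.items()
    )
    if incompatible:
        # Same name, different size/timestamp: copy everything to a new index dir.
        return {"mode": "full_copy", "files": sorted(master_files)}
    # Otherwise only the files missing locally need to be downloaded.
    missing = sorted(name for name in master_files if name not in local_files)
    return {"mode": "incremental", "files": missing}
```

With `master_files = {"_1.fdt": (10, 100), "_2.fdt": (20, 200)}` and a slave whose `_1.fdt` has a different size, the plan is a full copy of both files; if the slave's `_1.fdt` matches, only `_2.fdt` is fetched.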