
Implementing High Availability in MySQL


MySQL provides built-in data replication for maintaining identical copies of its data on one or more backup servers, thus providing a simple High Availability mechanism. In addition, the Open Source community has several projects that implement failover techniques, one of them being Heartbeat.

 

This article will show you how to implement a clustered, highly available and inexpensive solution based on GNU/Linux, combining MySQL as the database engine and Heartbeat as the failover mechanism. The configuration will consist of a 2-node active/passive cluster.

I assume you have MySQL up and running on both nodes and that you are working with MySQL 4.0.13 or above. If not, please refer to the MySQL manual here and download a recent copy here.

How replication works in MySQL

Replication in MySQL is very simple: one machine acts as the master server and one or more machines act as backup servers (the replica servers). The master server records all changes made to its databases in binary log files, so the backup server(s) can read these files and apply the changes to their own copy of the data.

In more detail, the binary log file records all the changes (UPDATE, DELETE, INSERT...) made to the master's databases since the first time replication was configured and started. The master also creates and maintains an index file to keep track of the binary logs created. Upon connecting, the slave server(s) obtains new updates from the binary log and applies them to its copy of the data.
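To make the bookkeeping concrete, here is a toy shell sketch of the log-plus-index idea. This is not MySQL's internals and every file name is invented; it only illustrates how a master that appends changes to numbered logs, and records each log in an index, lets a replica find and replay everything.

```shell
# Toy model of binary-log bookkeeping; paths and names are invented.
logdir=$(mktemp -d)
# The "master" appends each change to the current binary log...
echo "INSERT INTO t VALUES (1);" >> "$logdir/dbserv1-bin.001"
echo "UPDATE t SET a = 2;"       >> "$logdir/dbserv1-bin.001"
# ...and records that log file in the index.
echo "$logdir/dbserv1-bin.001" > "$logdir/dbserv1-bin.index"
# A "slave" reads the index to find the logs, then replays their contents.
while read -r logfile; do cat "$logfile"; done < "$logdir/dbserv1-bin.index"
```

When a log reaches `max_binlog_size`, MySQL starts a new numbered log and adds it to the index, which is why the slave walks the index rather than a single file.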

Note: As MySQL suggests, visit their website often to check for the latest changes and improvements to the database replication implementation.

How Heartbeat works

Heartbeat is a piece of software that provides High Availability features such as monitoring the availability of the machines in the cluster, transferring the virtual IPs (more on this later) in case of failure, and starting and stopping services.

The Heartbeat software running on the slave server periodically checks the health of the master server by listening for its heartbeats, sent via a null modem cable and/or a crossover Ethernet cable. Note that in the best-case scenario the slave's main task is nothing but monitoring the health of its master. In case of a crash the slave will stop receiving heartbeats from the master and will then take over the virtual IPs and the services offered by the master.
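The timing rule behind this can be illustrated with a tiny shell sketch. This is purely illustrative, not Heartbeat's real implementation; `deadtime` mirrors the ha.cf directive configured later in this article, and the values here are made up.

```shell
# Toy illustration of the deadtime rule: a node is declared dead once no
# heartbeat has been received for $deadtime seconds. All values invented.
deadtime=3
last_beat=$(date +%s)   # pretend a heartbeat from the master just arrived
sleep 1                 # one second passes with no further heartbeats
now=$(date +%s)
if [ $((now - last_beat)) -ge "$deadtime" ]; then
    echo "master dead: taking over VIP and services"
else
    echo "master alive"
fi
```

Real Heartbeat sends its own keepalive packets over the serial and Ethernet links and runs resource takeover scripts when the deadtime expires; the sketch only shows the decision.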

The overall picture

The next figure shows our cluster.

As previously stated, our configuration will consist of a 2-node active/passive cluster: dbserv1, the master server, and dbserv2, the slave server. Both machines are linked via serial COM port /dev/ttyS0 (null modem cable) and a crossover Ethernet cable (eth0), through which they send their heartbeats to each other.

The 192.168.1.103 address on eth1:0 is the floating IP address, the virtual IP. This is the service IP the master listens on, and it will be transferred to the slave in case of a failure on the master. Requests from the application servers are made through the virtual IP.

Both servers have another IP address that can be used to administer the machines: 192.168.1.101 and 192.168.1.102. Bear in mind that the virtual IP (192.168.1.103) is set up by Heartbeat, meaning that if Heartbeat is not up and running on the active server there will be no access to the virtual service.

Setting up replication

  1. Create a replication user on the master:
         mysql -u root -p

    At the MySQL prompt, type:

         GRANT REPLICATION SLAVE ON *.* TO replica@"%" IDENTIFIED BY 'replica_passwd';
    
  2. Stop MySQL on both the master server and the slave server. Take a snapshot of your databases from the master.
         /etc/init.d/mysql stop
         tar cvzf mysqldb.tgz /path/to/your/databases
    

    In my configuration I would...

         /etc/init.d/mysql stop
         tar cvzf mysqldb.tgz /var/mysql-data/*
    
  3. Copy the data to the slave
         scp /path/to/mysqldb.tgz admin@dbserv2:/path/to/your/databases
    

    If you are using InnoDB tables, copy your tablespace file(s) and associated log files to the slave. In my case, the tablespace is called ibdata and the log files are the ib_* files. So:

         scp /var/mysql-data/ibdata admin@dbserv2:/var/mysql-data
         scp /var/log/mysql/ib_* admin@dbserv2:/var/log/mysql
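Before (or right after) copying the snapshot, it is worth listing the archive's contents so a truncated tarball does not silently become your slave's dataset. A quick sketch using throwaway paths (substitute your real snapshot locations):

```shell
# Throwaway demo paths; substitute your real data directory and archive.
mkdir -p /tmp/demo-mysql-data
echo "dummy tablespace" > /tmp/demo-mysql-data/ibdata
tar czf /tmp/mysqldb.tgz -C /tmp demo-mysql-data
# List the archive's contents; a corrupt archive makes tar exit non-zero.
tar tzf /tmp/mysqldb.tgz
```

Running the same `tar tzf` on the slave after the scp confirms the copy arrived intact.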
    
  4. Activate the binary log and assign a unique ID to the master:
         vi /etc/my.cnf
    

    Then add/change the following:

         [mysqld]
         .....
         # Enable binary logs. Path to bin log is optional
         log-bin=/var/log/mysql/dbserv1
         # If the binary log exceeds 10M, rotate the logs
         max_binlog_size=10M
         # Set master server ID
         server-id=1
         .....
    

    Now you can start mysqld on the master. Watch the logs to see if there are problems.

         /etc/init.d/mysql start
    
  5. Log in on the slave.
         vi /etc/my.cnf
    

    Then add/change the following:

         [mysqld]
         .....
         # Unique ID for this slave
         server-id=2
         # This is eth0. Take a look at figure 1
         master-host=192.168.100.1
         master-user=replica
         master-password=replica_passwd
         # Port the master server is listening on
         master-port=3306
         # Seconds to wait before retrying a connection to the master (default 60)
         #master-connect-retry=60
         .....
    
  6. Uncompress the databases
         cd /path/to/your/databases
         tar xvzf mysqldb.tgz
    
         chown -R mysql.mysql /path/to/your/databases
    

    Make sure your tablespace file(s) and associated files are in place (/path/to/your/databases in our example).

  7. Start mysqld on the slave. Watch the logs to see if there are problems.
         /etc/init.d/mysql start
    
  8. Check if replication is working. For example, log in on the master, create a database and see if it is replicated on the slave:
         mysql -u root -p
    
         create database replica_test;
         show databases;
    
         +----------------+
         | Database       |
         +----------------+
         | replica_test   |
         | mysql          |
         | test           |
         | tmp            |
         +----------------+
    

    Log in on the slave server and make sure the database replica_test is created:

         mysql -u root -p
         show databases;
    
         +----------------+
         | Database       |
         +----------------+
         | replica_test   |
         | mysql          |
         | test           |
         | tmp            |
         +----------------+
    

    If you have problems, please refer to the MySQL manual here.
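    The manual check above can be scripted into a basic consistency test. The sketch below compares canned database lists; in practice each variable would come from something like `mysql -h <host> -u root -p -B -N -e 'SHOW DATABASES'` run against the master and the slave (command form is an assumption about your client setup).

```shell
# Canned database lists standing in for real SHOW DATABASES output.
master_dbs="mysql
replica_test
test
tmp"
slave_dbs="mysql
replica_test
test
tmp"
# If the lists differ, the new database did not replicate.
if [ "$master_dbs" = "$slave_dbs" ]; then
    echo "database lists match"
else
    echo "database lists differ"
fi
```

Comparing database names only proves the DDL replicated; for row-level checks you would compare table checksums as well.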

Installing and setting up Heartbeat

Download a recent copy of Heartbeat from here and then, as usual:

     ./configure
     make
     make install

or:

     rpm -Uhv heartbeat-1.0.4-1.i386.rpm

if you downloaded the RPM based package.

Configuring heartbeat

There are three files involved in the configuration of heartbeat:

  • ha.cf: the main configuration file that describes the machines involved and how they behave.
  • haresources: this configuration file specifies the virtual IP (VIP) and the services handled by Heartbeat.
  • authkeys: specifies authentication keys for the servers.

Sample /etc/ha.d/ha.cf

     # Time between heartbeats in seconds
     keepalive 1
     # Node is pronounced dead after 15 seconds
     deadtime 15
     # Prevents the master node from re-acquiring cluster resources after a failover
     nice_failback on
     # Device for serial heartbeat
     serial  /dev/ttyS0
     # Speed at which to run the serial line (bps)
     baud    19200
     # Port for udp (default)
     udpport 694
     # Use a udp heartbeat over the eth0 interface
     udp     eth0

     debugfile /var/log/ha/ha.debug
     logfile /var/log/ha/ha.log

     # First node of the cluster (must match the output of uname -n)
     node    dbserv1
     # Second node of the cluster (must match the output of uname -n)
     node    dbserv2

Sample /etc/ha.d/haresources

     dbserv1 IPaddr::192.168.1.103/24/eth1

This tells Heartbeat to set up 192.168.1.103 as the virtual IP (VIP). See the figure above.

Sample /etc/ha.d/authkeys

     auth 1
     1 crc
     2 sha1 HI!
     3 md5 Hello!

This file determines the authentication keys and must be mode 600. As I assume our network is relatively secure, I configure crc as the authentication method; md5 and sha1 are also available.
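If the heartbeat links ever cross a network you do not fully trust, switch to sha1 with a random key instead of crc. A possible way to generate one (assumes /dev/urandom and od, both standard on Linux; the `auth 2` numbering is just an example):

```shell
# Generate a 128-bit random key, hex-encoded, for an authkeys entry like:
#   auth 2
#   2 sha1 <key>
key=$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "auth 2"
echo "2 sha1 $key"
```

Remember that both nodes need identical authkeys files, still with mode 600.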

Now start Heartbeat on dbserv1 and then on dbserv2, and watch the logs. Then stop Heartbeat on the first node and see what happens on the second node. Start Heartbeat again on the first node, stop it on the second, and check the logs. If all is okay, you have a 2-node cluster up and running.

What we have

At this point we have a 2-node cluster with a certain degree of availability and fault tolerance. Although this may be a valid solution for non-critical environments, in really critical environments this configuration should be improved.

Advantages

  • The cluster is fault tolerant
  • The cluster is relatively secure
  • There is no single point of failure (comments?)
  • Automatic fail over mechanism
  • Proven and solid OpenSource software for production environment (my experience)
  • Simple and easy to install and configure
  • Easy to administer
  • Inexpensive

Disadvantages

Our cluster presents at least one serious problem in critical environments (i.e. 99.99% availability). As you know, when the master node fails, the standby node takes over the service and the virtual IP address. In this scenario, when the master comes back online it will act as the standby node (remember nice_failback on in /etc/ha.d/ha.cf?). As our configuration has not implemented a two-way replication mechanism, the current master is not generating binary logs and the current slave is not configured to act as one. There are means to avoid this disadvantage, but that is your homework ;-). Let me know your progress.

As usual, comments are very welcome.

References

Original source: http://www.karkomaonline.com/article.php/2004012416185184/print

 

-------------------

Some useful commands:

1. Show the MySQL process list:

 

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 2
   User: root
   Host: localhost:32931
     db: NULL
Command: Binlog Dump
   Time: 94
  State: Has sent all binlog to slave; waiting for binlog to
         be updated
   Info: NULL

And on the slave:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 10
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Has read all relay log; waiting for the slave I/O
         thread to update it
   Info: NULL

2. Show status:

mysql> SHOW MASTER STATUS;
mysql> SHOW SLAVE STATUS;
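These two statements can be wrapped into a basic replication health check. The sketch below parses canned \G-style output; in real use you would pipe `mysql -e 'SHOW SLAVE STATUS\G'` into the same grep (the Slave_IO_Running/Slave_SQL_Running column names are from SHOW SLAVE STATUS output).

```shell
# Canned output standing in for:  mysql -e 'SHOW SLAVE STATUS\G'
status="Slave_IO_Running: Yes
Slave_SQL_Running: Yes"
# Both threads must report Yes for replication to be healthy.
running=$(printf '%s\n' "$status" | grep -cE '(Slave_IO_Running|Slave_SQL_Running): Yes')
if [ "$running" -eq 2 ]; then
    echo "replication healthy"
else
    echo "replication broken"
fi
```

Run from cron on the slave, a check like this catches a stopped SQL thread long before users notice stale data.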

3. Status codes:

The following list shows the most common states you see in the State column for a slave server I/O thread. Beginning with MySQL 4.1.1, this state also appears in the Slave_IO_State column displayed by the SHOW SLAVE STATUS statement. This means that you can get a good view of what is happening by using only SHOW SLAVE STATUS.

  • Connecting to master

    The thread is attempting to connect to the master.

  • Checking master version

    A state that occurs very briefly, immediately after the connection to the master is established.

  • Registering slave on master

    A state that occurs very briefly immediately after the connection to the master is established.

  • Requesting binlog dump

    A state that occurs very briefly, immediately after the connection to the master is established. The thread sends to the master a request for the contents of its binary logs, starting from the requested binary log filename and position.

  • Waiting to reconnect after a failed binlog dump request

    If the binary log dump request failed (due to disconnection), the thread goes into this state while it sleeps, then tries to reconnect periodically. The interval between retries can be specified using the --master-connect-retry option.

  • Reconnecting after a failed binlog dump request

    The thread is trying to reconnect to the master.

  • Waiting for master to send event

    The thread has connected to the master and is waiting for binary log events to arrive. This can last for a long time if the master is idle. If the wait lasts for slave_read_timeout seconds, a timeout occurs. At that point, the thread considers the connection to be broken and makes an attempt to reconnect.

  • Queueing master event to the relay log

    The thread has read an event and is copying it to the relay log so that the SQL thread can process it.

  • Waiting to reconnect after a failed master event read

    An error occurred while reading (due to disconnection). The thread is sleeping for master-connect-retry seconds before attempting to reconnect.

  • Reconnecting after a failed master event read

    The thread is trying to reconnect to the master. When connection is established again, the state becomes Waiting for master to send event.

  • Waiting for the slave SQL thread to free enough relay log space

    You are using a non-zero relay_log_space_limit value, and the relay logs have grown until their combined size exceeds this value. The I/O thread is waiting until the SQL thread frees enough space by processing relay log contents so that it can delete some relay log files.

  • Waiting for slave mutex on exit

    A state that occurs briefly as the thread is stopping.

 

 

 

 
