`

mysql slave 备库延迟是怎么得到的

 
阅读更多

    在mysql的备库的监控中有一项很重要的指标:Seconds_Behind_Master,这个值是怎么得到的呢?下面从5.1.58的代码中分析一下:

    mysql的replication中有2个比较重要的class:Master_info(rpl_mi.h), Relay_log_info(rpl_rli.h),他们分别对应于master,info文件和slave.info文件;很显然,Master_info是io_thread需要的,Relay_log_info是sql_thread需要的。Master_info中有一个变量 clock_diff_with_master,这个值记录着mysql的主库和备库的时间差,可以理解为主备的主机时间差。clock_diff_with_master变量的定义如下:

  /*
     The difference in seconds between the clock of the master and the clock of
     the slave (second - first). It must be signed as it may be <0 or >0.
     clock_diff_with_master is computed when the I/O thread starts; for this the
     I/O thread does a SELECT UNIX_TIMESTAMP() on the master.
     "how late the slave is compared to the master" is computed like this:
     clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master

  */
  long clock_diff_with_master;

     这个变量的注释直接提到了Seconds_Behind_Master的计算方法:clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master。clock_of_slave是slave的当前时间--执行show slave status的当前时间。

     先看一下clock_diff_with_master的计算:(slave.cc)。执行”start slave;“/“start slave io_thread;”后,会执行start_slave_threads来启动io thread,io thread启动后首先做的就是获取主库的mysql版本和主库的当前时间(mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()"))),获取到主库的当前时间后,用备库的当前时间减去主库的时间,得到clock_diff_with_master。

int start_slave_threads(bool need_slave_mutex, bool wait_for_start,
                        Master_info* mi, const char* master_info_fname,
                        const char* slave_info_fname, int thread_mask)
{
...
  if (thread_mask & SLAVE_IO)
    error=start_slave_thread(handle_slave_io,lock_io,lock_cond_io,
                             cond_io,
                             &mi->slave_running, &mi->slave_run_id,
                             mi, 1); //high priority, to read the most possible
...
}
...
/**
  Slave IO thread entry point.

  @param arg Pointer to Master_info struct that holds information for
  the IO thread.

  @return Always 0.
*/
pthread_handler_t handle_slave_io(void *arg)
{
...
  thd_proc_info(thd, "Checking master version");
  ret= get_master_version_and_clock(mysql, mi);
...
}
...
static int get_master_version_and_clock(MYSQL* mysql, Master_info* mi)
{
...
  master_res= NULL;
  if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
      (master_res= mysql_store_result(mysql)) &&
      (master_row= mysql_fetch_row(master_res)))
  {
    mi->clock_diff_with_master=
      (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
  }
  else if (is_network_error(mysql_errno(mysql)))
  {
    mi->report(WARNING_LEVEL, mysql_errno(mysql),
               "Get master clock failed with error: %s", mysql_error(mysql));
    goto network_err;
  }
  else 
  {
    mi->clock_diff_with_master= 0; /* The "most sensible" value */
    sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, "
                      "do not trust column Seconds_Behind_Master of SHOW "
                      "SLAVE STATUS. Error: %s (%d)",
                      mysql_error(mysql), mysql_errno(mysql));
  }
...
}

     last_timestamp_executed_by_SQL_thread则是sql slave thread最近一次执行的event的时间戳(rpl.rli.cc中的last_event_start_time变量)。而这个变量的更新则是在stmt_done()函数中完成的(rpl.rli.cc),相关定义如下:

  /*
    Used by row-based replication to detect that it should not stop at
    this event, but give it a chance to send more events. The time
    where the last event inside a group started is stored here. If the
    variable is zero, we are not in a group (but may be in a
    transaction).
   */
  time_t last_event_start_time;

  /**
    Helper function to do after statement completion.

    This function is called from an event to complete the group by
    either stepping the group position, if the "statement" is not
    inside a transaction; or increase the event position, if the
    "statement" is inside a transaction.

    @param event_log_pos
    Master log position of the event. The position is recorded in the
    relay log info and used to produce information for <code>SHOW
    SLAVE STATUS</code>.

    @param event_creation_time
    Timestamp for the creation of the event on the master side. The
    time stamp is recorded in the relay log info and used to compute
    the <code>Seconds_behind_master</code> field.
  */
  void stmt_done(my_off_t event_log_pos,
                 time_t event_creation_time);

     具体的逻辑则是,sql thread启动后,读取relaylog(netxt_event()),apply & update pos(apply_event_and_update_pos()),update_pos的时候判断是否执行到了事务的结束位置,如果执行到了,则调用stmt_done(),stmt_done()会将last_master_timestamp更新为最近一次event的创建时间(event_creation_time)。因此如果在主备基本无延迟的时候,主库执行了一个大事务,你会发现备库延迟突然很大,然后又没了,"延迟跳跃"。

/**
  Slave SQL thread entry point.

  @param arg Pointer to Relay_log_info object that holds information
  for the SQL thread.

  @return Always 0.
*/
pthread_handler_t handle_slave_sql(void *arg)
{
...
    if (exec_relay_log_event(thd,rli))
...
}
...
static int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{
...
  Log_event * ev = next_event(rli);
...
    exec_res= apply_event_and_update_pos(ev, thd, rli);
...
}
...
int apply_event_and_update_pos(Log_event* ev, THD* thd, Relay_log_info* rli)
{
...
    exec_res= ev->apply_event(rli);
...
    int error= ev->update_pos(rli);
...
}
....log_event.cc
int
Rows_log_event::do_update_pos(Relay_log_info *rli)
{
...
  if (get_flags(STMT_END_F))
  {
    /*
      Indicate that a statement is finished.
      Step the group log position if we are not in a transaction,
      otherwise increase the event log position.
    */
    rli->stmt_done(log_pos, when);
   }
...
}
...rpl_rli.cc
void Relay_log_info::stmt_done(my_off_t event_master_log_pos,
                                  time_t event_creation_time)
{
...
  if ((sql_thd->options & OPTION_BEGIN) && opt_using_transactions)
    inc_event_relay_log_pos();
  else
  {
    inc_group_relay_log_pos(event_master_log_pos);
    flush_relay_log_info(this);
    /*
      Note that Rotate_log_event::do_apply_event() does not call this
      function, so there is no chance that a fake rotate event resets
      last_master_timestamp.  Note that we update without mutex
      (probably ok - except in some very rare cases, only consequence
      is that value may take some time to display in
      Seconds_Behind_Master - not critical).
    */
#ifndef DBUG_OFF
    if (!(event_creation_time == 0 && debug_not_change_ts_if_art_event > 0))
#else
      if (event_creation_time != 0)
#endif
        last_master_timestamp= event_creation_time;
  }
}

    另外一种可能会出现延迟跳跃的情况是:主库的网络有问题,io thread拉binlog的速度很慢,备库实际有很大的延迟,而sql thread执行的时候永远是hot log,这个时候会出现,备库一直有很大的延迟,突然一下延迟没了(=0),接下去延迟又变得很大。这种情况的原因为:sql thread读取relay log是在next_event(slave.cc)中完成的,而next_event中会判断当前的读取的relay log是否是“热的”--正在被io thread使用,如果是"热的",并且读取到当前relay log的最后一个event 或io thread没有再更新relay log的时候,会将last_master_time_stamp置为0,并new一个 Rotate_log_event。而如果在这个时候执行"show slave status",延迟就是0了。这个地方需要补充一点的是“show slave status”和前面提到的逻辑,唯一有争用的锁是&rli->data_lock,但是next_event在恢复last_master_time_stamp前就释放了这个锁:

...slave.cc...
static Log_event* next_event(Relay_log_info* rli)
{
...
  safe_mutex_assert_owner(&rli->data_lock);
...
    if (!cur_log->error) /* EOF */
    {
      /*
        On a hot log, EOF means that there are no more updates to
        process and we must block until I/O thread adds some and
        signals us to continue
      */
      if (hot_log)
      {
        time_t save_timestamp= rli->last_master_timestamp;
        rli->last_master_timestamp= 0;
...
        /*
          We can, and should release data_lock while we are waiting for
          update. If we do not, show slave status will block
        */
        pthread_mutex_unlock(&rli->data_lock);
...
        rli->last_master_timestamp= save_timestamp;
...
       }
...
     }
...
}
....
/**
  Execute a SHOW SLAVE STATUS statement.

  @param thd Pointer to THD object for the client thread executing the
  statement.

  @param mi Pointer to Master_info object for the IO thread.

  @retval FALSE success
  @retval TRUE failure
*/
bool show_master_info(THD* thd, Master_info* mi)
{
...
    pthread_mutex_lock(&mi->run_lock);
    protocol->store(mi->io_thd ? mi->io_thd->proc_info : "", &my_charset_bin);
    pthread_mutex_unlock(&mi->run_lock);

    pthread_mutex_lock(&mi->data_lock);
    pthread_mutex_lock(&mi->rli.data_lock);
    pthread_mutex_lock(&mi->err_lock);
    pthread_mutex_lock(&mi->rli.err_lock);
...
    /*
      Seconds_Behind_Master: if SQL thread is running and I/O thread is
      connected, we can compute it otherwise show NULL (i.e. unknown).
    */
    if ((mi->slave_running == MYSQL_SLAVE_RUN_CONNECT) &&
        mi->rli.slave_running)
    {
      long time_diff= ((long)(time(0) - mi->rli.last_master_timestamp)
                       - mi->clock_diff_with_master);
      /*
        Apparently on some systems time_diff can be <0. Here are possible
        reasons related to MySQL:
        - the master is itself a slave of another master whose time is ahead.
        - somebody used an explicit SET TIMESTAMP on the master.
        Possible reason related to granularity-to-second of time functions
        (nothing to do with MySQL), which can explain a value of -1:
        assume the master's and slave's time are perfectly synchronized, and
        that at slave's connection time, when the master's timestamp is read,
        it is at the very end of second 1, and (a very short time later) when
        the slave's timestamp is read it is at the very beginning of second
        2. Then the recorded value for master is 1 and the recorded value for
        slave is 2. At SHOW SLAVE STATUS time, assume that the difference
        between timestamp of slave and rli->last_master_timestamp is 0
        (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result.
        This confuses users, so we don't go below 0: hence the max().

        last_master_timestamp == 0 (an "impossible" timestamp 1970) is a
        special marker to say "consider we have caught up".
      */
      protocol->store((longlong)(mi->rli.last_master_timestamp ?
                                 max(0, time_diff) : 0));
    }
    else
    {
      protocol->store_null();
    }
...
}

     到这里Seconds_Behind_Master的计算就结束了。总结起来就是Seconds_Behind_Master的计算有3部分组成,备库当前时间,sql thread最近一次执行的event的时间,主备系统的时间差。备库出现延迟跳跃,可能是大事务引起,也可能是网络问题。(ps:这2种情况,都被我遇到过:))

 

链接:

http://www.mysqlperformanceblog.com/2013/09/16/possible-reasons-when-mysql-replication-lag-is-flapping-between-0-and-xxxxx/

分享到:
评论

相关推荐

    记一次MySQL Slave库恢复实战记录

    从库(Slave)则负责读取主库的二进制日志,并在自己的数据库中重放这些变更,以保持数据的一致性。在从库的日常运行过程中,可能会遇到各种问题,导致复制进程中断。本文所涉及的是一次实际的从库恢复案例,为读者...

    Mysql-master-slave.rar_Master/Slave_slave mysql

    MySQL的主从复制是数据库高可用性和数据冗余的一种常见实现方式,它允许数据从一个服务器(主服务器)同步到一个或多个其他服务器(从服务器)。这种架构在处理大量读取操作、提供故障切换能力以及分布式系统中尤其...

    MySQL从库延迟故障.pdf

    MySQL从库延迟故障.pdf 本文档详细介绍了MySQL从库延迟故障的故障诊断和处理过程。该故障是指在MySQL主从复制架构中,从库出现延迟故障,导致查询操作延迟。该故障对数据的查询操作产生了不利的影响。 知识点1: ...

    26.备库为什么会延迟好几个小时?1

    【MySQL 备库延迟分析与并行复制机制】 在MySQL的主从复制中,备库延迟通常是由于备库处理主库日志的速度慢于主库生成日志的速度导致的。这种情况在高压力环境下尤为明显,可能导致备库长时间无法追上主库的进度。...

    mysql同步问题之Slave延迟很大优化方法

    一般而言,slave相对master延迟较大,其根本原因就是slave上的复制线程没办法真正做到并发。简单说,在master上是并发模式(以InnoDB引擎为主)完成事务提交的,而在slave上,复制线程只有一个sql thread用于binlog的...

    MySQL slave_net_timeout参数解决的一个集群问题案例

    总结来说,`slave_net_timeout`参数是MySQL主从复制中的关键配置,用于控制从库在没有接收到主库binlog事件后的重试间隔。正确设置此参数可以确保主从复制的高效和稳定,避免因网络问题导致的连接异常和数据同步延迟...

    windows下mysql主备双向复制与mycat负载均衡

    在搭建和使用MySQL主备复制以及Mycat的过程中,可能遇到同步延迟、数据不一致、故障转移等问题。此时需要检查配置文件、日志文件,分析网络状况和服务器性能,以排除故障。 以上知识点详细阐述了在Windows环境下,...

    mysql slave IO

    MySQL的Slave IO是数据库复制过程中的一个重要组成部分,它主要用于实现主从服务器之间的数据同步。在MySQL的主从架构中,主服务器负责处理客户端的写操作,而从服务器则通过复制主服务器的数据来保持与主服务器的...

    MySQL主从复制、搭建、状态检查、中断排查及备库重做 实战手册.pdf

    本文档将详细介绍MySQL主从复制的相关知识点,包括其原理、搭建步骤、状态检查、同步中断排查以及备库重做的方法。 1. MySQL主从复制概述 MySQL主从复制是基于二进制日志(binary log)的异步复制过程,主服务器...

    mysql数据库 slave复制异常问题解决办法.docx

    MySQL数据库的主从复制是数据库高可用性和负载均衡的一种常见实现方式,它允许数据从一个主服务器(Master)实时同步到一个或多个从服务器(Slave)。然而,在实际操作中,主从复制可能会遇到各种异常情况,比如在本...

    MySQL中slave监控的延迟情况分析

    在MySQL复制环境中,我们通常只根据 Seconds_Behind_Master 的值来判断SLAVE的延迟。这么做大部分情况下尚可接受,但并不够准确,而应该考虑更多因素。 首先,我们先看下SLAVE的状态: 代码如下:yejr@imysql.com [...

    25.MySQL是怎么保证高可用的?1

    1. **主备延迟**:主备延迟是指备库执行完一个事务的时间(T3)与主库完成该事务并写入binlog的时间(T1)之间的差值,通常通过`show slave status`命令中的`seconds_behind_master`参数来度量,精度为秒。 2. **同步...

    新版 MySQL DBA 高级视频 基于MySQL 5.7 MySQL 8.0版本.rar

    │ 6_MySQL复制状态和延迟复制.mp4 │ 7_MySQL半同步复制.mp4 │ 作业.docx │ ├─新版MySQL DBA综合实战班 第10天 │ │ 1_课后作业讲解.mp4 │ │ 2_MySQL传统复制手动切换和GTID复制原理及切换.mp4 │ │ 3_...

    MySQL中从库延迟状况排查的一则案例

    总结来说,MySQL从库的延迟排查涉及到对复制机制的理解,对慢查询日志、二进制日志的分析,以及对InnoDB存储引擎状态的深入洞察。通过一系列的工具和命令,我们能够逐步定位到问题所在,进而进行有针对性的优化和...

Global site tag (gtag.js) - Google Analytics