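This article explains how InnoDB's log works and how it cooperates with data writes to provide a highly available storage service. It is fairly basic and the English is not too obscure, so I have not translated it; it is reposted here mainly for reference.
Original source: http://www.devarticles.com/c/a/MySQL/How-Logs-Work-On-MySQL-With-InnoDB-Tables/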
When you change data with UPDATE, INSERT or DELETE queries you're changing the data in two places: the log buffer and the data buffers. Buffers are fixed-length, typically a multiple of 512 bytes. And they are in memory – InnoDB won't write them to disk... yet.
LOG BUFFER           DATA BUFFER
=================    ===============
= Log Record #1 =    = Page Header =
= Log Record #2 =    = Data Row    =
= Log Record #3 =    = Data Row    =
= Log Record #4 =    = Data Row    =
=================    ===============
For example, after "INSERT INTO Jobs VALUES (1,2,3)" the log buffer will have one new log record -- call it Log Record #5 -- containing a row identifier and the new contents. Meanwhile, the data buffer will have one new row, but it will also have a mark in the page header saying "the latest log record for this page is Log Record #5". In this example "#5" is the Log Sequence Number (LSN), and it's vital for scheduling later operations.
Some details about the data-change:
(a) An INSERT log record contains only new data, which is enough so that the procedure can be repeated on the same page if necessary. This is called a "redo" entry.
(b) The LSN isn't a field in the log record; instead, it's an absolute address made up of a file and a byte offset.
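The bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative model, not InnoDB code: the names `LogRecord`, `Page`, `LogBuffer` and `insert_row` are made up for the example, and the LSN is simplified to a record's position in the buffer rather than a real file-and-byte-offset address.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    row_id: int          # which row the change applies to
    new_values: tuple    # redo information: the new contents only


@dataclass
class Page:
    page_lsn: int = 0    # LSN of the latest log record that touched this page
    rows: dict = field(default_factory=dict)


class LogBuffer:
    def __init__(self):
        self.records = []

    def append(self, record):
        """Append a record and return its LSN (here: its 1-based position)."""
        self.records.append(record)
        return len(self.records)


def insert_row(log, page, row_id, values):
    lsn = log.append(LogRecord(row_id, values))  # 1. change the log buffer
    page.rows[row_id] = values                   # 2. change the data buffer
    page.page_lsn = lsn                          # 3. stamp the page header
    return lsn


log = LogBuffer()
for i in range(4):  # pretend Log Records #1..#4 already exist
    log.append(LogRecord(row_id=100 + i, new_values=(0, 0, 0)))
page = Page(page_lsn=4)

lsn = insert_row(log, page, row_id=1, values=(1, 2, 3))
print(lsn, page.page_lsn)
```

Note that the LSN is not stored inside `LogRecord`; it falls out of where the record sits, which mirrors point (b).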
After InnoDB has changed the log buffer and the data buffers, it's all over but the disk writing. But that's where things get complex. There are several threads monitoring buffer activity, and three situations -- overflow, checkpoint, and commit -- that result in disk writes.
What Happens With Overflows?
Overflow is rare because InnoDB takes pro-active measures to prevent buffers from filling up (see "what happens with checkpoints" below). Still, let's discuss the two possible cases.
One: if the log buffer gets full, InnoDB writes the buffer at the "end" of the log. I've put the word "end" inside quotes because a log file, or more precisely a group of log files, looks like a snake swallowing its tail. If there's only room for four log records and we're writing #5, then it will have to go at the start of the file.
LOG FILE(S) BEFORE WRITING LOG RECORD #5
=================
= Log Record #1 =
= Log Record #2 =
= Log Record #3 =
= Log Record #4 =
=================
LOG FILE(S) AFTER WRITING LOG RECORD #5
=================
= Log Record #5 =
= Log Record #2 =
= Log Record #3 =
= Log Record #4 =
=================
There's no such thing as a log that grows forever. Even though InnoDB uses some compression tricks, the log files would get too big to fit on any disk drive. So InnoDB writes "in a circle" and that means it must overwrite old log records. This circular logging policy has implications which we'll come back to later.
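A minimal sketch of the "snake swallowing its tail" layout, assuming a log with room for exactly four records as in the diagrams. The class `CircularLog` and its methods are hypothetical names for this example; real log files are addressed by byte offset, not by record slot.

```python
class CircularLog:
    """Fixed-capacity log that wraps around and overwrites the oldest slot."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.next_lsn = 1  # LSNs keep growing; only the slots are reused

    def write(self, payload):
        lsn = self.next_lsn
        # Record N lands in slot (N - 1) mod capacity, so #5 reuses the slot of #1.
        self.slots[(lsn - 1) % self.capacity] = (lsn, payload)
        self.next_lsn += 1
        return lsn

    def oldest_lsn(self):
        """Smallest LSN still present in the log, or None if it is empty."""
        present = [slot[0] for slot in self.slots if slot is not None]
        return min(present) if present else None


log = CircularLog(capacity=4)
for n in range(1, 6):
    log.write(f"Log Record #{n}")

layout = [slot[1] for slot in log.slots]
print(layout)            # slot order after record #5 has been written
print(log.oldest_lsn())  # record #1 is gone
```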
Two: if the data buffers get full, InnoDB writes the least recently used buffer to the database -- but not too quickly! This is where the page header's LSN becomes useful. First, InnoDB checks whether that LSN is greater than the LSN of the last log record written to the log file. If it's greater, then InnoDB must write out log records first, until the log catches up with the data, and only then can it write the data. In other words, a data page cannot be written until the corresponding log record has been written. That is the "Write-Ahead Logging" principle which is common to all important DBMSs except InterBase.
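The Write-Ahead Logging check can be modelled as follows. This is a toy simulation under my own assumptions: the class `Wal` and its fields are hypothetical, and "disk writes" are just strings appended to a list so the ordering is visible.

```python
class Wal:
    def __init__(self):
        self.buffered_lsn = 0  # last LSN in the in-memory log buffer
        self.flushed_lsn = 0   # last LSN already written to the log file
        self.events = []       # order of simulated disk writes

    def append(self):
        """Add one log record to the buffer and return its LSN."""
        self.buffered_lsn += 1
        return self.buffered_lsn

    def flush_log_up_to(self, lsn):
        if lsn > self.flushed_lsn:
            self.events.append(f"log up to {lsn}")
            self.flushed_lsn = lsn

    def write_page(self, page_id, page_lsn):
        # WAL rule: the log must catch up with the page before the page is written.
        if page_lsn > self.flushed_lsn:
            self.flush_log_up_to(page_lsn)
        self.events.append(f"page {page_id}")


wal = Wal()
for _ in range(3):
    wal.append()

wal.write_page("A", page_lsn=3)  # log is behind, so it is written first
wal.write_page("B", page_lsn=2)  # log already covers LSN 2, page goes straight out
print(wal.events)
```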
What Happens With Checkpoints?
I said that InnoDB takes some pro-active measures against overflows, and the most important of these measures is checkpointing. Checkpointing is done by a separate thread, or a combination of threads, distinct from the thread that changes the buffers. At fixed intervals the checkpointer will wake, look for buffer changes, and ensure that writes happen.
By my count, most DBMSs would write everything at this time so there are no buffers left which are changed but unwritten. To use the usual jargon, the DBMS would flush all "dirty" buffers with a "Sharp Checkpoint". But InnoDB only ensures that (a) log and data buffers aren't getting fuller than a fixed threshold point, (b) log writing stays ahead of data page writing, (c) there is no data buffer whose page-header LSN corresponds to a log record that's about to be overwritten. In the jargon, this means InnoDB is a "Fuzzy Checkpoint" aficionado.
At checkpoint time it's possible to write another log record which says, in effect: at this moment a checkpoint occurred, so it's certain that the database is up to date except for a few dirty pages, and here is a list of the dirty pages. This information can be useful during a recovery, so I'll mention it again later.
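Condition (c) of the fuzzy checkpoint can be written as a small selection function. This is a sketch following the article's page-header rule, with record-count LSNs and a circular log as in the earlier diagrams; `pages_needing_flush` and its `margin` parameter are hypothetical names, not InnoDB settings.

```python
def pages_needing_flush(dirty_pages, newest_lsn, capacity, margin):
    """Return ids of dirty pages whose page-header LSN will be overwritten soon.

    dirty_pages: dict mapping page id -> page-header LSN
    newest_lsn:  LSN of the newest record in the circular log
    capacity:    number of records the log files can hold
    margin:      how many future log writes the checkpointer stays ahead of
    """
    # Writing record (newest_lsn + k) reuses the slot of record (newest_lsn + k - capacity),
    # so the next `margin` writes destroy every record with LSN <= threshold.
    threshold = newest_lsn + margin - capacity
    return sorted(pid for pid, lsn in dirty_pages.items() if lsn <= threshold)


dirty = {"A": 2, "B": 7, "C": 3, "D": 9}
urgent = pages_needing_flush(dirty, newest_lsn=9, capacity=8, margin=2)
print(urgent)
```

Only the urgent pages are written; pages B and D stay dirty, which is what makes the checkpoint "fuzzy" rather than "sharp".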
What Happens With Commits?
At COMMIT time, InnoDB will NOT write all dirty data pages to disk. I emphasize that because it's easy to think that committing changes means writing everything on a persistent medium. Well, the InnoDB folks are smarter than that. They realize that only the log records need writing. The writing of dirty data pages can happen at overflow or checkpoint time, because their contents are redundant. If the log survives a crash, remaking of data pages is possible using the information in the log records.
So InnoDB should only write the log. Or to be exact, InnoDB should write log records, until it has written all log records which apply to the transaction that's committing. Since all log writing is serial, that means InnoDB must write log records for other transactions too, but that's okay.
Here I must get critical, because that's not what InnoDB does, necessarily. If the innodb_flush_log_at_trx_commit switch in MySQL's my.cnf is zero, then InnoDB will avoid log writing at commit time. This means that a successful COMMIT won't "guarantee" that all data changes have become persistent, which is what the ANSI/ISO standard demands. The persistence guarantee applies only for checkpoints.
Anyway, you can set innodb_flush_log_at_trx_commit to one. In that case all's well: InnoDB will write the log at commit time, and it will also flush.
I'd better explain what flushing is, eh? Usually it's enough to merely write, but all modern operating systems will cache writes for efficiency reasons. To get the "guarantee" InnoDB must insist to the operating system "I really mean write, I want that disk write head on a platter, don't return till the physical operation is complete." This means that on a Windows system InnoDB calls the Windows-API function FlushFileBuffers, a call meaning "flush the cache." Here InnoDB parts company with Microsoft: SQL Server 2000 would use a "write through" option during the write, rather than flushing after the write.
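The difference between "write" and "flush" can be seen from any program, not just a DBMS. Below is a small Python sketch using the standard `os.write` and `os.fsync` calls; it is an analogy, not InnoDB's code, and the helper name `append_durably` and the file name `demo_log` are made up for the example.

```python
import os
import tempfile


def append_durably(path, data, flush=True):
    """Append bytes to a file; with flush=True, also ask the OS to sync them to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = os.write(fd, data)  # may only reach the operating system's cache
        if flush:
            os.fsync(fd)              # do not return until the OS has pushed it out
        return written
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "demo_log")
    n = append_durably(log_path, b"commit record\n")
    size = os.path.getsize(log_path)

print(n, size)
```

Calling it with `flush=False` corresponds to the "merely write" case: the call returns sooner, but a power failure can still lose the bytes.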
Recovery
We come now to the point that makes all the logging trouble worthwhile: if a crash happens, you can recover your data.
For a crash that didn't melt the disk drive, recovery is automatic. InnoDB reads the last checkpoint log record, sees if the "dirty pages" were written before the crash, and (if they weren't) reads the log records which affected the page and applies them. This is called "rolling forward" and it's easy for two reasons: (1) because the LSN is there so InnoDB just has to compare numbers to get into synch, (2) because I left out a few details.
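Rolling forward boils down to comparing LSNs. The sketch below assumes a simplified record layout of `(lsn, page_id, row_id, new_values)` and pages stored as plain dictionaries; both are my own illustration, and it leaves out the same details the paragraph above admits to leaving out.

```python
def roll_forward(pages, log_records):
    """Replay redo records onto pages that missed them.

    pages:       dict page_id -> {"lsn": int, "rows": dict}
    log_records: iterable of (lsn, page_id, row_id, new_values) in LSN order
    Returns the list of LSNs that were actually applied.
    """
    applied = []
    for lsn, page_id, row_id, new_values in log_records:
        page = pages[page_id]
        if lsn > page["lsn"]:  # the page on disk is older than this record
            page["rows"][row_id] = new_values
            page["lsn"] = lsn
            applied.append(lsn)
    return applied


pages = {
    "P1": {"lsn": 5, "rows": {1: (1, 2, 3)}},  # written before the crash
    "P2": {"lsn": 3, "rows": {}},              # dirty page that never reached disk
}
log_records = [
    (4, "P2", 7, (7, 7, 7)),
    (5, "P1", 1, (1, 2, 3)),
    (6, "P2", 8, (8, 8, 8)),
]

applied = roll_forward(pages, log_records)
print(applied, pages["P2"]["lsn"])
```

Record 5 is skipped because page P1 already carries LSN 5, so replaying the log twice would change nothing.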
Fine. Now, what about a crash that did melt the disk drive? Then the recovery scenario depends on your preparation.
Scenario one: the log is gone. Well, you should have prepared by having a log copy on a separate drive. InnoDB has no explicit option for this, but there are operating-system-specific ways.
Scenario two: the database is gone and the log is overwritten. Well, you should have anticipated that with circular logging, log record #5 will overwrite log record #1. Remember? Therefore if you didn't take a backup after the writing of log record #1, you've lost data.
Scenario three: the database is gone and the log is okay. In that case, congratulations. You just have to restore your last backup copy, and roll the whole log forward. There would be complications if you had to back up the log itself several times since the last full backup ("archive logging"), but I've assumed that option is currently inoperative. By the way, I am not discussing the role of MySQL's binlog. It's essential to the process but isn't part of InnoDB, so out of scope.
With an understanding of InnoDB logging, you know some things need watching. In no particular order:
• use log files large enough that no log record written since the last backup gets overwritten
• keep log files and data files on separate disk drives
• make sure innodb_flush_log_at_trx_commit is set right
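For the first recommendation, a back-of-the-envelope check is enough. The function below is a hypothetical helper, and every number in the example (two 512 MiB log files, 30 MiB of log per hour, a daily backup) is invented for illustration; measure your own workload before relying on it.

```python
MIB = 1024 * 1024


def log_covers_backup_interval(log_capacity_bytes, log_bytes_per_hour, hours_between_backups):
    """Return (fits, needed_bytes): does the circular log hold everything since the last backup?"""
    needed = log_bytes_per_hour * hours_between_backups
    return needed <= log_capacity_bytes, needed


capacity = 2 * 512 * MIB  # assumed: two log files of 512 MiB each
fits, needed = log_covers_backup_interval(capacity, 30 * MIB, 24)
print(fits, needed // MIB)

# The same log with a weekly backup would wrap around long before the next backup.
fits_weekly, needed_weekly = log_covers_backup_interval(capacity, 30 * MIB, 24 * 7)
print(fits_weekly, needed_weekly // MIB)
```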
Hopefully this article has helped shed some light on the seriously under-documented features of MySQL InnoDB table logging.