`
hideto
  • 浏览: 2682811 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

HPM Note2, MySQL Architecture

阅读更多
MySQL's Logical Architecture
Three layer:
1, Connection/thread handling
2, Query parsing, analysis, optimization, caching, build-in functions
3, Storage engine

MySQL server caches threads, so they don't need to be created and destroyed for each new conneciton
Client authentication is based on username, originating host, and password

MySQL parses queries to create an internal structure(parse tree), and then applies a variety of optimizations
These may include rewriting the query, determineing the order in which it will read tables, choosing which indexes to use, and so on
You can pass hints to optimizer through special keywords in the query, affecting its decision-making process

The optimizer does not really care what storage engine a particular table uses, but the storage engine does affect how the server optimizes query
Before even parsing the query, the sever consults the query cache, which can store only SELECT statements, along with their result sets

Concurrency Control
Systems that deal with concurrent read/write access typically implement a locking system that consists of two lock types
These locks are usually known as shared locks and exclusive locks, or read locks and write locks

Read locks on a resources are shared, or mutually nonblocking: many clients may read from a resource at the same time and not interfere with each other
Write locks, on the other hand, are exclusive--i.e., they block both read locks and other write locks--because the only safe policy is to have a single client writing to the resource at given time and to prevent all reads when a client is writing

Minimizing the amount of data that you lock at any one time lets changes to a given resource occur simultaneously, as long as they don't conflict with each other
A locking strategy is a compromise between lock overhead and data safety, and that compromise affects performance

MySQL's storage engines can implement their own locking policies and lock granularities

Table locks is the most basic locking strategy available in MySQL with the lowest overhead
Write locks have a higher priority than read locks, so a request for a write lock will advance to the front of the lock queue even if readers are already in the queue
Write locks can advance past read locks in the queue, but read locks cannot advance past write locks

Although storage engines can manage their own locks, MySQL itself also uses a variety of locks that are effectively table-level for various purposes
For instance, the server uses a table-level lock for statements such as ALTER TABLE, regardless of the storage engine

The locking style that offers the greatest concurrency and carries the greatest overhead is the use of row locks
Row-level locking is available in the InnoDB and Falcon storage engines
Row locks are implemented in the storage engine, not the server
The server is completely unaware of locks implemented in the storage engines, and the storage engines all implement locking in their own ways

Transactions
A transaction is a group of SQL queries that are treated atomically, as a single unit of work

Atomicity: A transaction must function as a single indivisible unit of work so that the entire transaction is either applied or rolled back
Consistency: The database should always move from one consistent state to the next
Isolation: The results of a transaction are usually invisible to other transactions until the transaction is complete
Durability: Once committed, a transaction's change are permanent

The SQL standard defines four isolation levels:
READ UNCOMMITTED: transactions can view the results of uncommitted transactions
READ COMMITTED: The default isolation level for most database system(but not MySQL), transactiosn will see only those changes made by transactions that were already committed, and its changes won't be visible to others until it has committed
REPEATABLE READ: MySQL's default transaction isolation level, it guarantees that any rows a transaction reads will "look the same" in subsequent reads within the same transaction
SERIALIZABLE: The highest level of isolation, solves the phantom read problem by forcing transactions to be ordered so that they can't possibly conflict

ANSI SQL isolation levels:
Isolation level    Dirty reads possible    Nonrepeatable reads possible    Phantom reads possible    Locking reads
READ UNCOMMITTED           Y                         Y                             Y                     N
READ COMMITTED             N                         Y                             Y                     N
REPEATABLE READ            N                         N                             Y                     N
SERIALIZABLE               N                         N                             N                     Y

A deaklock is when two or more transactions are mutually holding and requesting locks on the same resources, creating a cycle of dependencies
The more sophisticated systems, such as the InnoDB storage engine, will notice circular dependencies and return an error instantly
Deadlocks cannot be broken without rolling back one of the transactions, either partially or wholly
They are a fact of life in transactional systems, and your applications should be designed to handle them
Many applications can simply retry their transactions from the beginning

Transaction logging helps make transactions more efficient
Instead of updating the tables on disk each time a change occurs, the storage engine can change its in-memory copy of the data
This is very fast. The storage engine can then write a record of the change to the transaction log, which is on disk and therefore durable
This is also a relatively fast operation, because appending log events involves sequential I/O in one small area of the disk instead of random I/O in many places
Then, at some later time, a process can update the table on disk
Thus, most storage engines that use this technique(known as write-ahead logging) end up writing the changes to disk twice
(The PBXT storage engine cleverly avoids some write-ahead logging)
PBXT white paper

MySQL AB provides three transactional storage engines: InnoDB, NDB Cluster, and Falcon
Several thrid-party engines are also available; the best-know engines right now are solidDB and PBXT

MySQL operates in AUTOCOMMIT mode by default
This means that unless you've explicitly begun a transaction, it automatically executes each query in a separate transaction
Changing the value of AUTOCOMMIT has no effect on nontransactional tables, such as MyISAM or Memory tables, which essentially always operate in AUTOCOMMIT mode
Certain commands cause MySQL to commit the transaction before they execute
These are typically Data Definition Language commands that make significant changes, such as ALTER TABLE, LOCK TABLES
MySQL lets you set the isolation level using the SET TRANSACTION ISOLAITON LEVEL command, which takes effect when the next transaction starts

MySQL doesn't manage transactions at the server level, instead the underlying storage engines implement transactions themselves
This means you can't reliably mix different engines in a single transaction
MySQL also supports the LOCK TABLES and UNLOCK TABLES commands, which are implemented in the server, not in the storage engines
The interaction between LOCK TABLES and transactions is complex, and there are unexpected behaviors in some server versions
Therefore, we recommend that you never use LOCK TABLES unless you are in a transaction and AUTOCOMMIT is disables, no matter what storage engine you are using

Multiversion Concurrency Control
Most of MySQL's transactional storage engines, such as InnoDB, Falcon, and PBXT, don't use a simple row-locking mechanism
Instead, they use row-level locking in conjunction with a technique for increasing concurrency known as multiversion concurrency control(MVCC)
MVCC is not unique to MySQL: Oracle, PostgreSQL, and some other database systems use it too

InnoDB implements MVCC by storing with each row two additional, hidden values that record when the row was created and when it was expired(or deleted)
Rather than storing the actual times at which these events occurred, the row stores the system version number at the time each event occurred
This is a number that increments each time a transaction begins
Each transaction keeps its own record of the current system version, as of the time it began

MVCC works only with the READ COMMITTED and REPEATABLE READ isolation levels
READ UNCOMMITTED isn't MVCC-compatible because queries don't read the row version that's appropriate for their transaction version, they read the newest version, no matter what
SERIALIZABLE isn't MVCC-compatible because reads lock every row they return

Locking models and concurrency in MySQL using the defualt isolation level
Locking strategy        Concurrency        Overhead        Engines
Table level                Lowest           Lowest        MyISAM, Merge, Memory
Row level                  High             High          NDB Cluster
Row level with MVCC        Highest          Highest       InnoDB, Falcon, PBXT, solidDB

MySQL's Storage Engines
MySQL stores each database(also called a schema) as a subdirectory of its data directory in the underlying filesystem
When you create a table, MySQL stores the table definition in a .frm file with the same name as the table
Each storage engine stores the table's data and indexes differently, but the server itself handles the table definition

MyISAM
MySQL's default storage engine, provides good compromise between performance and useful features, such as full-text indexing, compression, and spatial(GIS) functions
MyISAM doesn't support transactions or row-level locks
MyISAM typically stores each table in two files: a data file and an index file
The two files bear .MYD and .MYI extensions respectively

MyISAM features
MyISAM locks entire tables, not rows
MySQL supports automatic checking and repairing of MyISAM tables
You can use the CHECK TABLE mytable and REPAIR TABLE mytable commands to check a table for errors and repair them
You can also use the myisamchk command-line tool to check and repair tables when the server is offline
You can create indexes on the first 500 character of BLOB and TEXT columns in MyISAM tables
MyISAM supports full-text indexes, which index individual words for complex search operations
MyISAM tables marked with the DELAY_KEY_WRITE create option don't write changed index data to disk at the end of a query
Instead MyISAM buffers the changes in the in-memory key buffer, it flushes index blocks to disk when it prunes the buffer or closes the table

InnoDB
InnoDB was designed for transaction processing-specifically, processing of many short-lived transactions that usually complete rather than being rolled back
InnoDB stores its data in a series of one or more data files that are collectively known as a tablespace
A tablespace is essentially a black box that InnoDB manages all by itself
In MySQL 4.1 and newer versions InnoDB can store each table's data and indexes in separate files
InnoDB uses MVCC to achieve high concurrency, and it implements all four SQL standard isolation levels
It defaults to the REPEATABLE READ isolation level, and it has a next-key locking strategy that prevents phantom reads in this isolation level
InnoDB tables are built on a clustered index, it provides very fast primary key lookups
Sencondary indexes  contain the primary key columns, you should strive for a small primary key if you have many indexes on a table
InnoDB doesn't compress its indexes
InnoDB loads data and creates indexes more slowly than MyISAM, any operation that changes an InnoDB table's structure will rebuild the entire table, including all the indexes
Besides its high-concurrency capabilities, InnoDB's next most popular feature is foreign key constraints
InnoDB also provides extremely fast lookups for queries that use a primary key
InnoDB has a variety of internal optimizations, include predictive read-ahead for prefetching data from disk, an adaptive hash index that automatically builds hash indexes in memory for very fast lookups, and an insert buffer to speed inserts

Main elements you should take into account to choose storage engines:
Transactions, Concurrency, Backups, Crash recovery, Special features
分享到:
评论
1 楼 hideto 2009-06-30  
看来对实时性要求比较高的应用都是采用logging这种方案啊
FW的AdServer是logging,之前师兄设计的一个网页游戏存储方案也是logging
因为logging是sequential的,没有random I/O的overhead,所以很快
结合memory cache就可以达到很好的实时性效果

可以参考下 protocolbuffers

相关推荐

    先楫半导体 HPM6700/6400 系列高性能微控制器用户手册

    HPM6700/6400 系列微控制器是基于 RISC-V 内核的 32 位高性能微控制器,适用于上海先楫半导体科技有限公司以下型号产品:HPM6750IVM1、HPM6750IAN1、HPM6730IVM1、HPM6730IAN1、HPM6450IVM1、HPM6450IAN1、HPM6430...

    上海先楫HPM6750采集OV5640摄像头显示在屏幕【支持HPM6750和HPM6450】

    上海先楫HPM6750是一款高性能的嵌入式微处理器,专为处理复杂的多媒体应用而设计。这款芯片在物联网(IoT)、智能硬件、工业自动化等领域有着广泛的应用。HPM6450是其系列中的另一款处理器,虽然可能在性能上有所区别...

    HPM1005驱动程序

    HPM1005驱动程序是专为HP(惠普)HPM1005型号打印机设计的重要软件组件。驱动程序在计算机系统和硬件设备之间起着桥梁的作用,它允许操作系统与打印机进行有效的通信,确保打印任务的顺利执行。...

    PICMG_HPM_1R1_0.pdf

    《HPM.1硬件平台管理:IPM控制器固件升级规范》 HPM.1(Hardware Platform Management)是PICMG(PCI工业计算机制造商集团)发布的一项标准,旨在规范服务器和其他计算平台的硬件平台管理,特别是关于IPM...

    HPM1210.INF

    HPM1210.INF

    上海先楫HPM6750驱动EMMC和SD卡【支持HPM6750和HPM6450】

    上海先楫HPM6750驱动EMMC和SD卡的开发是嵌入式系统设计中的关键环节,尤其对于基于HPM6750和HPM6450处理器的设备来说,高效的存储器管理是确保系统性能和稳定性的基础。这个代码工程提供了一个完整的解决方案,使...

    hpm1210sd.rar

    【标题】"hpm1210sd.rar"是一个压缩文件,通常用于在互联网上分享或存储多个相关文件。从这个名字来看,它可能是某个特定硬件设备、软件程序或者相关项目的一部分,具体而言,"HPM1210SD"可能代表一个型号、版本号...

    HPM1210SU.ent

    HPM1210SU.ent HPM1210SU.ent HPM1210SU.ent

    TDC 3000 系统 HPM组态手册

    《TDC 3000系统HPM组态手册》是针对HONEYWELL公司旗下的TDC 3000控制系统的一份详细指南,主要涵盖了该系统的模块配置和设定,尤其适用于石化行业的分布式控制系统(DCS)应用。在这个高度集成且复杂的自动化系统中...

    HPM1210SM.exe

    HPM1210SM.exe

    HPM1210SU.VER

    HPM1210SU.VER

    HPM.rar_ZIY_hpm

    标题中的"HPM.rar_ZIY_hpm"表明这是一个关于HPM(可能是Hercules Performance Monitor)的压缩文件,由用户ZIY创建或整理,主要用于HercWS(可能是指Hercules Web Services)。"ziy hpm"的标签进一步确认了这个文件...

    HPM1210PP.dll

    HPM1210PP.dll

    HPM1210GC.DLL

    HPM1210GC.DLL

    hpm1005win7驱动

    描述中的“win7可以使用”确认了HP LaserJet M1005打印机在Windows 7系统上是支持的,并且提到了“hpm10005win7的驱动”,这可能是描述中的一个小错误,应该是指“hpm1005win7”的驱动,即对应的是HP LaserJet M1005...

    HPM1210FPSD.DLL

    HPM1210FPSD.DLL

    HPM1005驱动

    适用于XP32位操作系统的HPm1005打印机驱动

    上海先楫HPM6750驱动LCD屏上显示内容【支持HPM6750和HPM6450】

    上海先楫HPM6750驱动LCD屏上显示内容的工程是专为HPM6750和HPM6450处理器设计的。这个项目的核心在于如何有效地利用这两款处理器的特性来控制和显示LCD屏幕上的信息。在嵌入式系统中,这种驱动程序的开发对于实现人...

    OCP_2014_firware_update_HPM.1.pdf

    HPM.1 Firmware Update Open Compute Summit - Compute Summit HPM.1 Firmware Update Engineering Workshop HPM.1 File Format File is not sent as a single image Each action is sent by itself to IPM ...

    上海先楫HPM6750实现CAN和CAN-FD通信【支持HPM6750和HPM6450】

    在本文中,我们将深入探讨如何使用上海先楫的HPM6750和HPM6450处理器实现CAN(Controller Area Network)和CAN-FD(Controller Area Network with Flexible Data-rate)通信。这两种通信协议在嵌入式系统,尤其是...

Global site tag (gtag.js) - Google Analytics