`

fsync() Across Platforms

 
阅读更多

使用mongoDB时,默认的数据cache flush就是使用的fsync,这里找了一篇好文章故分享下,

 

转自:http://www.humboldt.co.uk/2009/03/fsync-across-platforms.html

 

fsync() Across Platforms

When an application writes a file, the data does not become permanent immediately. The write operation first moves the data into the operating system cache in RAM, where it is vulnerable to system crashes and loss of power. The second step is the transfer to the hard disk, which normally has write caching enabled. The disk acknowledges the data straight away, but keeps it in the disk write cache which is still volatile memory. The data is now safe from system crash1, but is not safe from loss of power. On a modern disk, this may be 16MB or more of data in unknown state.

As performance enhancements in ext4 have made committing data to disk a contentious issue, I’ve written a note on how different platforms handle data consistency.

 

Delayed Allocation

The root of the latest problem is an optimisation called delayed allocation. In delayed allocation, the file system does not decide where on the disk to save the file until it is necessary to transfer the data to the disk. Linux users have become accustomed to the ordered data mode of ext3, where file data is written to disk before changes to metadata2. Ordered data mode only writes out data when it knows the destination on disk, so with delayed allocation the metadata will go to disk first. If an application creates a file before a system crash, the file may exist after the crash, but with zero length. This caused user complaints when implemented in XFS, and again when implemented in ext4.

ext4 has implemented a workaround for the common case of creating a new file with a temporary name, then renaming it to its final name. This produces an approximation to the ext3 behaviour by allocating blocks when a file is renamed.

The Platforms

Apple

Apple implements delayed allocation in HFS+. When the application calls fsync() on a file HFS+ allocates disk blocks for the file data and transfers that data to the disk, but fsync() does not wait for the disk to complete writing the blocks from its cache. A complete flush to disk requires the F_FULLSYNC operation of fcntl().

Reports of zero length files after crashes are rare on Mac OS X, suggesting that system applications are well behaved and the window of opportunity for corruption is short. It is advisable to call fsync() for safety here.

Microsoft

The allocation strategy of NTFS is not visible externally, but experiments suggest that it does not implement delayed allocation. Applications can open files with the FILE_FLAG_WRITE_THROUGH flag, which causes all writes to be sent directly to the disk. The FlushFileBuffers()call will ensure that data from a file is written to the disk, then flushed from the disk’s write cache and committed.

Windows Vista and Server 2008 introduce a new mechanism: Transactional NTFS. This allows applications to perform database style transactions in the filesystem, ensuring that a set of file operations either completes or fails entirely.

Linux

The ext3 file system allocates disk blocks immediately on write. When combined with the ordered data mode this ensures that application data is written consistently. Unfortunately fsync() on ext3 has developed a reputation as an expensive operation, so developers avoid it. fsync() on ext3 writes all file data to the disk, and waits for the disk to commit the data from its write cache, but only if the file metadata has changedand the file system is not mapped via LVM3.

fsync() on ext4 allocates disk blocks for the file data, then writes the data to disk and waits for the disk to commit the data from its write cache, with the same limitations as ext3.

Unfortunately Linux does not offer an equivalent of F_FULLSYNC or FlushFileBuffers() unless the hard disk write cache is disabled.

Summary

The table below shows how to achieve different levels of consistency on recent versions of each platform covered above.

  Mac OS X Windows Linux
Write to disk without cache flush. fsync() FILE_FLAG_WRITE_THROUGH fsync()
Write to disk and wait for cache flush. F_FULLSYNC FlushFileBuffers() None4
Transactions None Transactional NTFS None
  1. Ignoring some worst case scenarios. 
  2. Metadata is data about the file, such as timestamps and permissions. On most Linux filesystems the file name is not part of the metadata, as the file may have multiple hard links. 
  3. This should be fixed in kernel 2.6.29 for simple cases. 
  4. If the hard disk write cache is disabled, fsync() on ext3 and ext4 will provide a complete sync to disk. 
分享到:
评论

相关推荐

    文件夹同步软件fsync中文注册版

    文件夹同步软件FSync中文注册版是一款高效且实用的工具,专为用户提供了方便快捷的文件和文件夹比较以及同步功能。在IT行业中,文件同步是至关重要的,尤其是在多设备协同工作、数据备份和恢复等领域。FSync软件以其...

    Linux内核驱动fsync机制实现图解.docx

    Linux内核驱动fsync机制实现图解 Linux内核驱动fsync机制实现图解可以分为四个方面:同步阻塞I/O、同步非阻塞I/O、异步阻塞I/O和异步非阻塞I/O。 同步阻塞I/O:应用程序显式地通过函数访问数据,在此函数返回时就...

    Linux系统调用fsync函数详解.docx

    Linux 系统调用 fsync 函数详解 fsync 函数是 Linux 系统调用中的一种同步函数,用于将内存中所有已修改的文件数据同步到储存设备中。其主要功能是将内核缓存中的数据写入到磁盘上,以确保文件系统的一致性。 ...

    函数sync、fsync与fdatasync的总结整理(必看篇)

    在IT领域,尤其是在操作系统和文件系统管理中,`sync`、`fsync`和`fdatasync`是三个至关重要的函数,它们用于确保数据的安全写入和系统一致性。以下是这三个函数的详细说明: 1. **sync函数**: `sync`函数的作用...

    node.js中的fs.fsync方法使用说明

    在Node.js中,`fs.fsync`方法是一个用于同步磁盘缓存的重要工具。它确保文件系统缓冲区的所有更改都写入磁盘,提供了一种可靠的方式来保存数据,避免因程序退出或系统崩溃而丢失信息。`fs.fsync`方法在处理对文件...

    fsync-encode-decode:用于特定数据突发格式的软件调制解调器

    fsync-编码-解码用于特定数据突发格式的软件调制解调器该库支持以 1200 和 2400 bps 对 Kenwood 的 FleetSync 和 FleetSync II 格式进行解码(编码将在以后的版本中添加)。 (其各自所有者的所有商标财产) GPL ...

    FSync:用于工作区之间文件同步的 Sublime Text 3 插件

    使用Package Control ,在下面添加存储库(Preferences > Package Control > Add Repository)并正常安装FSync包。 https://github.com/weverss/FSync 如果您没有安装 Package Controll,请参阅进行安装。

    libeatmydata:libeatmydata-因为fsync()应该是no-op

    fsync()变为NO-OP,O_SYNC被删除等。 这个想法是在测试中使用,以在不需要真正耐用性的情况下获得更快的测试运行速度。 不要在关心软件存储内容的软件上使用libeatmydata。 由于某种原因,它被称为lib EAT-MY-...

    fsync:FTP文件同步管理器,基于带有FTPimp的NodeJS构建

    FTP同步 基于浏览器的FTP文件同步管理器,使用NodeJS和FTPimp构建。 设置 将config/config.sample.js复制到config/config.js并根据需要更改config/config.js值 将config/ftp.sample.js复制到config/ftp.js并根据需要...

    9286硬件设计原理图已验证

    9286硬件设计原理图的验证涉及到一系列复杂的电子元器件和电路布局,这些内容主要集中在电源管理、信号调理、接口连接以及芯片配置等方面。在分析这个设计时,我们可以从中提取出以下几个关键知识点: ...

    ib_user_sa.rar_V2

    描述中提到"ext3fs fsync primitive for Linux v2.13.6",这揭示了重点知识在于Linux内核的ext3文件系统的同步(fsync)操作。fsync是一个关键的系统调用,用于确保文件系统的数据已经从缓存写入到持久存储设备,以...

    禁止使用ping命令

    标题中的“禁止使用ping命令”指的是在计算机网络环境中,通过组策略或安全设置来阻止用户使用ping这个网络诊断工具。ping命令是TCP/IP协议的一部分,主要用于检查网络连接的可达性和延迟,但有时出于安全考虑或者...

    MongoDBFsync锁与数据恢复.pdf

    MongoDB 提供了多种备份策略,其中之一就是 fsync+lock 方式,这种方法能够确保在备份过程中数据的实时性和一致性。 fsync 是 MongoDB 中的一个命令,它的作用是强制数据库将所有在内存缓冲区中的数据立即写入到...

    Linux内核源代码导读-陈香兰-ext2文件系统

    Linux内核源代码导读-陈香兰-中国科学技术大学-ext2文件系统

    23.MySQL是怎么保证数据不丢的?1

    write操作将日志写入操作系统页缓存,而fsync则将数据从页缓存同步到磁盘,确保数据持久化。参数`sync_binlog`用于控制何时进行fsync操作: 1. 当`sync_binlog=0`时,仅执行write,不执行fsync,性能较高但风险较大...

    linux笔记相关vim命令

    在Linux操作系统中,Vim(Vi Improved)是一个强大的文本编辑器,被广泛用于编写代码、配置文件和其他文本操作。这份“Linux笔记相关vim命令”涵盖了使用Vim进行文本编辑的基本操作和高级技巧,旨在帮助用户更高效地...

Global site tag (gtag.js) - Google Analytics