zero-copy


Many Web applications serve a significant amount of static content, which amounts to reading data off of a disk and writing the exact same data back to the response socket. This activity might appear to require relatively little CPU activity, but it's somewhat inefficient: the kernel reads the data off of disk and pushes it across the kernel-user boundary to the application, and then the application pushes it back across the kernel-user boundary to be written out to the socket. In effect, the application serves as an inefficient intermediary that gets the data from the disk file to the socket.

Each time data traverses the user-kernel boundary, it must be copied, which consumes CPU cycles and memory bandwidth. Fortunately, you can eliminate these copies through a technique called — appropriately enough — zero copy. Applications that use zero copy request that the kernel copy the data directly from the disk file to the socket, without going through the application. Zero copy greatly improves application performance and reduces the number of context switches between kernel and user mode.

The Java class libraries support zero copy on Linux and UNIX systems through the transferTo() method in java.nio.channels.FileChannel. You can use the transferTo() method to transfer bytes directly from the channel on which it is invoked to another writable byte channel, without requiring data to flow through the application. This article first demonstrates the overhead incurred by simple file transfer done through traditional copy semantics, then shows how the zero-copy technique using transferTo() achieves better performance.

Data transfer: The traditional approach

Consider the scenario of reading from a file and transferring the data to another program over the network. (This scenario describes the behavior of many server applications, including Web applications serving static content, FTP servers, mail servers, and so on.) The core of the operation is in the two calls in Listing 1 (see Download for a link to the complete sample code):

Listing 1. Copying bytes from a file to a socket
File.read(fileDesc, buf, len);
Socket.send(socket, buf, len);
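Listing 1 is C-style pseudocode. A minimal Java sketch of the same traditional path might look like the following; the class and method names are mine, and in a real client `out` would be the socket's output stream obtained via socket.getOutputStream():

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class TraditionalCopy {
    // Reads the file in 4 KB chunks into a user-space buffer and writes
    // each chunk back out. With a socket stream as 'out', every chunk
    // crosses the user-kernel boundary twice: kernel -> user on read(),
    // user -> kernel on write().
    static long sendFile(String path, OutputStream out) throws Exception {
        byte[] buf = new byte[4096];   // the user-space buffer
        long total = 0;
        try (InputStream in = new FileInputStream(path)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

Every iteration of this loop is a read()/write() pair, which is exactly the four-copy, four-context-switch pattern analyzed below.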

Although Listing 1 is conceptually simple, internally, the copy operation requires four context switches between user mode and kernel mode, and the data is copied four times before the operation is complete. Figure 1 shows how data is moved internally from the file to the socket:

Figure 1. Traditional data copying approach


Figure 2 shows the context switching:

Figure 2. Traditional context switches


The steps involved are:

  1. The read() call causes a context switch (see Figure 2) from user mode to kernel mode. Internally a sys_read() (or equivalent) is issued to read the data from the file. The first copy (see Figure 1) is performed by the direct memory access (DMA) engine, which reads file contents from the disk and stores them into a kernel address space buffer.
  2. The requested amount of data is copied from the read buffer into the user buffer, and the read() call returns. The return from the call causes another context switch from kernel back to user mode. Now the data is stored in the user address space buffer.
  3. The send() socket call causes a context switch from user mode to kernel mode. A third copy is performed to put the data into a kernel address space buffer again. This time, though, the data is put into a different buffer, one that is associated with the destination socket.
  4. The send() system call returns, creating the fourth context switch. Independently and asynchronously, a fourth copy happens as the DMA engine passes the data from the kernel buffer to the protocol engine.

Use of the intermediate kernel buffer (rather than a direct transfer of the data into the user buffer) might seem inefficient. But intermediate kernel buffers were introduced into the process to improve performance. Using the intermediate buffer on the read side allows the kernel buffer to act as a "readahead cache" when the application hasn't asked for as much data as the kernel buffer holds. This significantly improves performance when the requested data amount is less than the kernel buffer size. The intermediate buffer on the write side allows the write to complete asynchronously.

Unfortunately, this approach itself can become a performance bottleneck if the size of the data requested is considerably larger than the kernel buffer size. The data gets copied multiple times among the disk, kernel buffer, and user buffer before it is finally delivered to the application.

Zero copy improves performance by eliminating these redundant data copies.

 

Data transfer: The zero-copy approach

If you re-examine the traditional scenario, you'll notice that the second and third data copies are not actually required. The application does nothing other than cache the data and transfer it back to the socket buffer. Instead, the data could be transferred directly from the read buffer to the socket buffer. The transferTo() method lets you do exactly this. Listing 2 shows the method signature of transferTo():

Listing 2. The transferTo() method
public abstract long transferTo(long position, long count, WritableByteChannel target) throws IOException;

The transferTo() method transfers data from the file channel to the given writable byte channel. Internally, it depends on the underlying operating system's support for zero copy; in UNIX and various flavors of Linux, this call is routed to the sendfile() system call, shown in Listing 3, which transfers data from one file descriptor to another:

Listing 3. The sendfile() system call
#include <sys/socket.h>
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

The action of the file.read() and socket.send() calls in Listing 1 can be replaced by a single transferTo() call, as shown in Listing 4:

Listing 4. Using transferTo() to copy data from a disk file to a socket
transferTo(position, count, writableChannel);
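A complete Java sketch of Listing 4 follows; the class and method names are mine. Note that transferTo() returns the number of bytes actually transferred, which may be fewer than requested, so robust code loops until the whole file has been sent:

```java
import java.io.FileInputStream;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class ZeroCopySend {
    // Transfers the entire file to the target channel without the data
    // entering user space. Loops because a single transferTo() call may
    // transfer fewer bytes than requested.
    static long sendFile(String path, WritableByteChannel target) throws Exception {
        try (FileChannel fc = new FileInputStream(path).getChannel()) {
            long position = 0;
            long size = fc.size();
            while (position < size) {
                long n = fc.transferTo(position, size - position, target);
                if (n <= 0) break;  // defensive: avoid spinning if no progress
                position += n;
            }
            return position;
        }
    }
}
```

When the target is a socket channel on a supporting operating system, this loop is where the sendfile() fast path is taken.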

Figure 3 shows the data path when the transferTo() method is used:

Figure 3. Data copy with transferTo()


Figure 4 shows the context switches when the transferTo() method is used:

Figure 4. Context switching with transferTo()


The steps taken when you use transferTo() as in Listing 4 are:

  1. The transferTo() method causes the file contents to be copied into a read buffer by the DMA engine. Then the data is copied by the kernel into the kernel buffer associated with the output socket.
  2. The third copy happens as the DMA engine passes the data from the kernel socket buffers to the protocol engine.

This is an improvement: we've reduced the number of context switches from four to two and reduced the number of data copies from four to three (only one of which involves the CPU). But this does not yet get us to our goal of zero copy. We can further reduce the data duplication done by the kernel if the underlying network interface card supports gather operations. In Linux kernels 2.4 and later, the socket buffer descriptor was modified to accommodate this requirement. This approach not only reduces multiple context switches but also eliminates the duplicated data copies that require CPU involvement. The user-side usage still remains the same, but the internals have changed:

  1. The transferTo() method causes the file contents to be copied into a kernel buffer by the DMA engine.
  2. No data is copied into the socket buffer. Instead, only descriptors with information about the location and length of the data are appended to the socket buffer. The DMA engine passes data directly from the kernel buffer to the protocol engine, thus eliminating the remaining final CPU copy.

Figure 5 shows the data copies using transferTo() with the gather operation:

Figure 5. Data copies when transferTo() and gather operations are used


 

Building a file server

Now let's put zero copy into practice, using the same example of transferring a file between a client and a server (see Download for the sample code). TraditionalClient.java and TraditionalServer.java are based on the traditional copy semantics, using File.read() and Socket.send(). TraditionalServer.java is a server program that listens on a particular port for the client to connect, and then reads 4K bytes of data at a time from the socket. TraditionalClient.java connects to the server, reads (using File.read()) 4K bytes of data at a time from a file, and sends (using socket.send()) the contents to the server via the socket.

Similarly, TransferToServer.java and TransferToClient.java perform the same function, but instead use the transferTo() method (and in turn the sendfile() system call) to transfer the file from server to client.
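The zero-copy client condenses to a few lines. The sketch below is my own paraphrase of what such a client might do (the class and method names, host, and port are illustrative, not taken from the sample code); it loops because transferTo() may move fewer bytes than requested:

```java
import java.io.FileInputStream;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class TransferToClientSketch {
    // Connects to the server and streams the file with transferTo();
    // on Linux this maps to sendfile(), so the file's bytes never
    // enter user space.
    static long send(String path, String host, int port) throws Exception {
        try (SocketChannel sock = SocketChannel.open(new InetSocketAddress(host, port));
             FileChannel fc = new FileInputStream(path).getChannel()) {
            long pos = 0;
            long size = fc.size();
            while (pos < size) {
                long n = fc.transferTo(pos, size - pos, sock);
                if (n <= 0) break;  // defensive: stop if no progress
                pos += n;
            }
            return pos;
        }
    }
}
```

Compared with the traditional client, there is no 4K user-space buffer at all: the kernel moves the data from the file to the socket directly.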

Performance comparison

We executed the sample programs on a Linux system running the 2.6 kernel and measured the run time in milliseconds for both the traditional approach and the transferTo() approach for various sizes. Table 1 shows the results:

Table 1. Performance comparison: Traditional approach vs. zero copy
File size    Normal file transfer (ms)    transferTo (ms)
7MB          156                          45
21MB         337                          128
63MB         843                          387
98MB         1320                         617
200MB        2124                         1150
350MB        3631                         1762
700MB        13498                        4422
1GB          18399                        8537

As you can see, the transferTo() API reduces the transfer time by approximately 65 percent compared to the traditional approach. This has the potential to increase performance significantly for applications that do a great deal of copying of data from one I/O channel to another, such as Web servers.

 

Summary

We have demonstrated the performance advantages of using transferTo() compared to reading from one channel and writing the same data to another. Intermediate buffer copies — even those hidden in the kernel — can have a measurable cost. In applications that do a great deal of copying of data between channels, the zero-copy technique can offer a significant performance improvement.

 

Download

Description                         Name             Size
Sample programs for this article    j-zerocopy.zip   3KB