Originally reposted from: https://github.com/
Thrift Java Servers Compared
This article talks only about Java servers. See this page if you are interested in C++ servers.
Thrift is a cross-language serialization/RPC framework with three major components: protocol, transport, and server. The protocol defines how messages are serialized. The transport defines how messages are communicated between client and server. The server receives serialized messages from the transport, deserializes them according to the protocol, invokes user-defined message handlers, serializes the responses from the handlers, and writes them back to the transport. The modular architecture of Thrift allows it to offer various choices of servers. The servers available for Java that this article covers are TSimpleServer, TNonblockingServer, THsHaServer, TThreadedSelectorServer, and TThreadPoolServer.
Having choices is great, but which server is right for you? In this article, I'll describe the differences among all those servers and show benchmark results to illustrate performance characteristics (the details of the benchmark are explained in Appendix B). Let's start with the simplest one: TSimpleServer.
TSimpleServer
TSimpleServer accepts a connection, processes requests from the connection until the client closes the connection, and goes back to accept a new connection. Since it is all done in a single thread with blocking I/O, it can only serve one client connection at a time, and all the other clients have to wait until they get accepted. TSimpleServer is mainly used for testing purposes. Don't use it in production!
TNonblockingServer vs. THsHaServer
TNonblockingServer solves the problem with TSimpleServer of one client blocking all the other clients by using non-blocking I/O. It uses java.nio.channels.Selector, which allows you to block on multiple connections instead of a single connection by calling select(). The select() call returns when one or more connections are ready to be accepted, read, or written. TNonblockingServer handles each of those connections by accepting it, reading data from it, or writing data to it, and then calls select() again to wait for the next available connections. This way, multiple clients can be served without one client starving others.
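To make the select() loop concrete, here is a minimal sketch that uses only the JDK, not Thrift itself. It is an echo server rather than an RPC server, and the class and method names (SelectorEchoSketch, serve, roundTrip) are made up for illustration. It also writes responses in a simple loop instead of registering for write readiness, which a real server would do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration class; not part of Thrift.
final class SelectorEchoSketch {

    // Single-threaded select() loop: echoes bytes back until
    // `clientsToServe` clients have disconnected.
    static void serve(ServerSocketChannel server, int clientsToServe) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            int disconnected = 0;
            while (disconnected < clientsToServe) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) < 0) { // client closed the connection
                            client.close();
                            disconnected++;
                            continue;
                        }
                        buffer.flip();
                        // Simplification: loop on write() instead of using OP_WRITE.
                        while (buffer.hasRemaining()) {
                            client.write(buffer);
                        }
                    }
                }
            }
        }
    }

    // Starts the loop on an ephemeral loopback port, sends one message, returns the echo.
    static String roundTrip(String message) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            Thread loop = new Thread(() -> {
                try {
                    serve(server, 1);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            loop.start();
            String echoed;
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                byte[] out = message.getBytes(StandardCharsets.UTF_8);
                client.write(ByteBuffer.wrap(out));
                ByteBuffer in = ByteBuffer.allocate(out.length);
                while (in.hasRemaining() && client.read(in) >= 0) {
                    // keep reading until the whole echo has arrived
                }
                echoed = new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
            }
            loop.join(); // the loop exits once the client has disconnected
            return echoed;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("ping"));
    }
}
```

Note that everything, including the echo "handler", runs on the thread that calls select().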
There is a catch, however. Messages are processed by the same thread that calls select(). Let's say there are 10 clients, and each message takes 100 ms to process. What would be the latency and throughput? While a message is being processed, the other 9 clients are waiting to be selected, so it takes 1 second (10 messages x 100 ms) for a client to get its response back from the server, and throughput will be 10 requests / second. Wouldn't it be great if multiple messages could be processed simultaneously?
This is where THsHaServer (Half-Sync/Half-Async server) comes into the picture. It uses a single thread for network I/O, and a separate pool of worker threads to handle message processing. This way messages get processed immediately if there is an idle worker thread, and multiple messages can be processed concurrently. Using the example above with 10 worker threads, the latency is now 100 ms and throughput will be 100 requests / sec.
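The arithmetic behind these numbers can be written down as a small back-of-the-envelope model. The sketch below assumes every client keeps exactly one request outstanding at all times and ignores network and I/O overhead; the class and method names are made up for illustration.

```java
// Hypothetical illustration class; not part of Thrift.
final class ServerModelEstimate {

    // Latency seen by each client: requests are served in "rounds" of
    // `workers` at a time, and each round takes `serviceMs`.
    static long latencyMs(int clients, int workers, long serviceMs) {
        int rounds = (clients + workers - 1) / workers; // ceiling division
        return rounds * serviceMs;
    }

    // Requests completed per second: at most min(workers, clients)
    // requests are in progress at any moment.
    static double throughputPerSec(int clients, int workers, long serviceMs) {
        int busyWorkers = Math.min(workers, clients);
        return busyWorkers * 1000.0 / serviceMs;
    }

    public static void main(String[] args) {
        // One processing thread (the TNonblockingServer case in the text).
        System.out.println(latencyMs(10, 1, 100) + " ms, "
                + throughputPerSec(10, 1, 100) + " req/s");
        // Ten worker threads (the THsHaServer case in the text).
        System.out.println(latencyMs(10, 10, 100) + " ms, "
                + throughputPerSec(10, 10, 100) + " req/s");
    }
}
```

With 10 clients and a 100 ms handler, the model gives 1000 ms and 10 requests / sec for a single processing thread, and 100 ms and 100 requests / sec for 10 workers, matching the figures in the text.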
To demonstrate this, I ran a benchmark with 10 clients and a modified message handler that simply sleeps for 100 ms before returning. I used THsHaServer with 10 worker threads. The handler looks something like this:
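public ResponseCode sleep() throws TException
{
    try {
        // Simulate 100 ms of request processing.
        Thread.sleep(100);
    } catch (InterruptedException ex) {
        // Restore the interrupt flag instead of silently swallowing it.
        Thread.currentThread().interrupt();
    }
    return ResponseCode.Success;
}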
The results are as expected. THsHaServer is able to process all the requests concurrently, while TNonblockingServer processes requests one at a time.
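If you want to see the same effect without setting up Thrift, the following stand-alone simulation approximates it with a plain java.util.concurrent thread pool. It is not the benchmark used in this article: there is no network involved, and the class and method names are made up for illustration. A pool of 1 thread stands in for TNonblockingServer, and a pool of 10 threads stands in for THsHaServer with 10 workers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Thrift or MapKeeper.
final class SleepHandlerSimulation {

    // Runs `requests` handlers that each sleep for `sleepMs` on a pool of
    // `workerThreads` threads and returns the total wall-clock time in ms.
    static long elapsedMs(int workerThreads, int requests, long sleepMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workerThreads);
        try {
            List<Callable<Void>> handlers = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                handlers.add(() -> {
                    Thread.sleep(sleepMs); // stands in for the sleep() handler
                    return null;
                });
            }
            long start = System.nanoTime();
            pool.invokeAll(handlers); // returns once every handler has finished
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread:   " + elapsedMs(1, 10, 100) + " ms");
        System.out.println("10 threads: " + elapsedMs(10, 10, 100) + " ms");
    }
}
```

On an otherwise idle machine the single-thread run should take roughly 1 second and the 10-thread run roughly 100 ms, though the exact numbers will vary.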
THsHaServer vs. TThreadedSelectorServer
Thrift 0.8 introduced yet another server, TThreadedSelectorServer. The main difference between TThreadedSelectorServer and THsHaServer is that TThreadedSelectorServer allows you to have multiple threads for network I/O. It maintains 2 thread pools, one for handling network I/O, and one for handling request processing. TThreadedSelectorServer performs better than THsHaServer when network I/O is the bottleneck. To show the difference, I ran a benchmark with a handler that returns immediately without doing anything, and measured the average latency and throughput with a varying number of clients. I used 32 worker threads for THsHaServer, and 16 worker threads/16 selector threads for TThreadedSelectorServer.
The results show that TThreadedSelectorServer has much higher throughput than THsHaServer while maintaining lower latency.
TThreadedSelectorServer vs. TThreadPoolServer
Finally, there is TThreadPoolServer. TThreadPoolServer is different from the other 3 servers in that:
- There is a dedicated thread for accepting connections.
- Once a connection is accepted, it gets scheduled to be processed by a worker thread in ThreadPoolExecutor.
- The worker thread is tied to the specific client connection until it's closed. Once the connection is closed, the worker thread goes back to the thread pool.
- You can configure both minimum and maximum number of threads in the thread pool. Default values are 5 and Integer.MAX_VALUE, respectively.
This means that if there are 10000 concurrent client connections, you need to run 10000 threads. As such, it is not as resource-friendly as the other servers. Also, if the number of clients exceeds the maximum number of threads in the thread pool, requests will be blocked until a worker thread becomes available.
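The thread-per-connection behavior can be observed with a plain ThreadPoolExecutor. The sketch below uses the minimum and maximum sizes quoted above (5 and Integer.MAX_VALUE); the SynchronousQueue hand-off and the 60-second keep-alive are my assumptions for illustration, not something this article states about Thrift's internals. Each "connection" is simulated by a task that holds its worker thread until it is told to disconnect, and the class and method names are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Thrift.
final class ThreadPerConnectionSketch {

    // Returns how many pool threads exist while `connections` simulated
    // client connections are open at the same time.
    static int threadsInUse(int connections) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5,                       // minimum threads (default quoted in the text)
                Integer.MAX_VALUE,       // maximum threads (default quoted in the text)
                60, TimeUnit.SECONDS,    // keep-alive for idle threads: assumed value
                new SynchronousQueue<>() // direct hand-off, no queuing: assumed
        );
        CountDownLatch disconnect = new CountDownLatch(1);
        try {
            for (int i = 0; i < connections; i++) {
                pool.execute(() -> {
                    try {
                        // The worker stays tied to this "connection" until it closes.
                        disconnect.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            disconnect.countDown(); // close all simulated connections
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(threadsInUse(100) + " threads for 100 open connections");
    }
}
```

Because no worker is ever idle while its connection is open, every additional connection forces the pool to start another thread, which is why the thread count tracks the connection count.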
Having said that, TThreadPoolServer performs very well; on the box I'm using it's able to support 10000 concurrent clients without any problem. If you know the number of clients that will be connecting to your server in advance and you don't mind running a lot of threads, TThreadPoolServer might be a good choice for you.
Conclusion
I hope this article helps you decide which Thrift server is right for you. I think TThreadedSelectorServer would be a safe choice for most use cases. You might also want to consider TThreadPoolServer if you can afford to run lots of concurrent threads. Feel free to send me an email at mapkeeper-users@googlegroups.com or post your comments here if you have any questions or comments.
Appendix A: Hardware Configuration
Processors: 2 x Xeon E5620 2.40GHz (HT enabled, 8 cores, 16 threads)
Memory: 8GB
Network: 1Gb/s <full-duplex>
OS: RHEL Server 5.4 Linux 2.6.18-164.2.1.el5 x86_64
Appendix B: Benchmark Details
It's pretty straightforward to run the benchmark yourself. First clone the MapKeeper repository and compile the stub java server:
git clone git://github.com/m1ch1/mapkeeper.git
cd mapkeeper/stubjava
make
Then, start the server you'd like to benchmark:
make run mode=threadpool # run TThreadPoolServer
make run mode=nonblocking # run TNonblockingServer
make run mode=hsha # run THsHaServer
make run mode=selector # run TThreadedSelectorServer
Then, clone the YCSB repository and compile it:
git clone git://github.com/brianfrankcooper/YCSB.git
cd YCSB
mvn clean package
Once the compilation finishes, you can run YCSB against the stub server:
./bin/ycsb load mapkeeper -P ./workloads/workloada
For more detailed information about how to use YCSB, check out their wiki page.