Thrift Java Servers Compared
This article talks only about Java servers. See this page if you are interested in C++ servers.
Thrift is a cross-language serialization/RPC framework with three major components: protocol, transport, and server. The protocol defines how messages are serialized. The transport defines how messages are communicated between client and server. The server receives serialized messages from the transport, deserializes them according to the protocol, invokes user-defined message handlers, serializes the handlers' responses, and writes them back to the transport. Thrift's modular architecture allows it to offer a choice of servers. Here is the list of servers available for Java:
· TSimpleServer
· TNonblockingServer
· THsHaServer
· TThreadedSelectorServer
· TThreadPoolServer
Having choices is great, but which server is right for you? In this article, I'll describe the differences among these servers and show benchmark results to illustrate their performance characteristics (the details of the benchmark are explained in Appendix B). Let's start with the simplest one: TSimpleServer.
Source: http://www.codelast.com/
TSimpleServer accepts a connection, processes requests from that connection until the client closes it, and then goes back to accept a new connection. Since all of this happens in a single thread with blocking I/O, it can serve only one client connection at a time, and all other clients have to wait until they are accepted. TSimpleServer is mainly used for testing purposes. Don't use it in production!
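To make the blocking model concrete, here is a minimal sketch that uses only the JDK. It is not Thrift's implementation: the class BlockingEchoServer, its line-based echo protocol, and the roundTrip helper are all made up for illustration. It only shows the shape of the loop: accept one connection, serve it until the client closes, then accept the next.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Plain-JDK illustration of the single-threaded blocking model; not Thrift code.
class BlockingEchoServer {

    // Serves `connections` clients strictly one after another on the calling thread.
    static void serve(ServerSocket server, int connections) throws IOException {
        for (int i = 0; i < connections; i++) {
            // accept() blocks until a client connects.
            try (Socket client = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String line;
                // readLine() returns null once the client closes the connection.
                // Until then, no other client can be accepted.
                while ((line = in.readLine()) != null) {
                    out.println("echo: " + line);
                }
            }
        }
    }

    // Hypothetical helper: serves exactly one connection and sends it one message.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try {
                    serve(server, 1);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            String reply;
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                reply = in.readLine();
            }
            serverThread.join();
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

A second client that connects while the first one is still open would sit in the listen backlog until the inner while loop ends, which is exactly the limitation described above.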
TNonblockingServer solves TSimpleServer's problem of one client blocking all the other clients by using non-blocking I/O. It uses java.nio.channels.Selector, which lets you block on multiple connections instead of a single one by calling select(). The select() call returns when one or more connections are ready to be accepted, read, or written. TNonblockingServer handles each ready connection by accepting it, reading data from it, or writing data to it, and then calls select() again to wait for the next ready connections. This way, multiple clients can be served without one client starving the others.
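The select() loop can be sketched with plain java.nio as follows. Again, this is an illustration under my own assumptions rather than Thrift's code: SelectorEchoLoop, exchange, and demo are hypothetical names, the "protocol" is a raw byte echo, and writes are simplified (see the comment in the code).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Plain-JDK illustration of a single-threaded select() loop; not Thrift code.
class SelectorEchoLoop {

    // Echoes bytes back to clients until `expectedClients` connections have closed.
    static void serve(ServerSocketChannel server, int expectedClients) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            int closed = 0;
            while (closed < expectedClients) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) < 0) {
                            client.close(); // end of stream: the client hung up
                            closed++;
                        } else {
                            buffer.flip();
                            // Simplification: spin until the echo is written. A real
                            // server would register OP_WRITE for partial writes.
                            while (buffer.hasRemaining()) {
                                client.write(buffer);
                            }
                        }
                    }
                }
            }
        }
    }

    // Sends `message` over a blocking client channel and reads the echo.
    static String exchange(SocketChannel channel, String message) throws IOException {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer out = ByteBuffer.wrap(bytes);
        while (out.hasRemaining()) {
            channel.write(out);
        }
        ByteBuffer reply = ByteBuffer.allocate(bytes.length);
        while (reply.hasRemaining() && channel.read(reply) >= 0) {
            // keep reading until the whole echo has arrived
        }
        return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
    }

    // Hypothetical demo: two clients stay connected at once; the second is served first.
    static String[] demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            Thread loop = new Thread(() -> {
                try {
                    serve(server, 2);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            loop.start();
            String[] replies = new String[2];
            try (SocketChannel first = SocketChannel.open(address);
                 SocketChannel second = SocketChannel.open(address)) {
                replies[1] = exchange(second, "second");
                replies[0] = exchange(first, "first");
            }
            loop.join();
            return replies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In demo(), the second client gets its reply while the first connection is still open and idle. A strictly blocking, one-connection-at-a-time server could not do that.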
There is a catch, however. Messages are processed by the same thread that calls select(). Let's say there are 10 clients, and each message takes 100 ms to process. What would the latency and throughput be? While one message is being processed, the other 9 clients are waiting to be selected, so it takes up to 1 second for a client to get its response back from the server, and throughput is 10 requests/second. Wouldn't it be great if multiple messages could be processed simultaneously?
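The arithmetic behind these numbers can be written down as a tiny model. This is a back-of-the-envelope sketch, not a measurement: LatencyModel and its methods are hypothetical, and it assumes every client always has exactly one request outstanding and that processing time dominates everything else.

```java
// Back-of-the-envelope model of a server with a fixed number of processing threads.
class LatencyModel {

    // Worst-case latency in ms when all clients send a request at the same moment.
    static long worstCaseLatencyMs(int clients, int processingThreads, long serviceMs) {
        // Requests are handled in "rounds" of at most processingThreads at a time.
        long rounds = (clients + processingThreads - 1) / processingThreads;
        return rounds * serviceMs;
    }

    // Requests per second once the server is saturated by the given clients.
    static double throughputPerSecond(int clients, int processingThreads, long serviceMs) {
        int busyThreads = Math.min(clients, processingThreads);
        return busyThreads * 1000.0 / serviceMs;
    }

    public static void main(String[] args) {
        // One processing thread, 10 clients, 100 ms per message.
        System.out.println(worstCaseLatencyMs(10, 1, 100));   // 1000
        System.out.println(throughputPerSecond(10, 1, 100));  // 10.0
        // Ten processing threads for the same load.
        System.out.println(worstCaseLatencyMs(10, 10, 100));  // 100
        System.out.println(throughputPerSecond(10, 10, 100)); // 100.0
    }
}
```

With a single processing thread the model gives the 1 second and 10 requests/second quoted above. With ten threads the same load drops to 100 ms and rises to 100 requests/second.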
This is where THsHaServer (Half-Sync/Half-Async server) comes into the picture. It uses a single thread for network I/O and a separate pool of worker threads for message processing. This way, a message gets processed immediately if there is an idle worker thread, and multiple messages can be processed concurrently. Using the example above, the latency is now 100 ms and throughput is 100 requests/second.
To demonstrate this, I ran a benchmark with 10 clients and a modified message handler that simply sleeps for 100 ms before returning. I used THsHaServer with 10 worker threads. The handler looks something like this:
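    public ResponseCode sleep() throws TException {
        try {
            // Simulate 100 ms of work per request.
            Thread.sleep(100);
        } catch (InterruptedException ex) {
            // Restore the interrupt flag instead of silently swallowing it.
            Thread.currentThread().interrupt();
        }
        return ResponseCode.Success;
    }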
The results are as expected. THsHaServer is able to process all the requests concurrently, while TNonblockingServer processes requests one at a time.
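If you want to see the same effect without setting up Thrift, the sketch below reproduces the idea with a plain JDK thread pool. It is not the benchmark I ran: WorkerPoolTiming is a hypothetical class, there is no network involved, and a pool with one thread merely stands in for a server that processes one message at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-JDK illustration of sequential vs. pooled message processing; not Thrift code.
class WorkerPoolTiming {

    // Runs `requests` handlers that each sleep for serviceMs on a pool of
    // `workers` threads and returns the total elapsed time in milliseconds.
    static long elapsedMs(int requests, int workers, long serviceMs) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Void>> handlers = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                handlers.add(() -> {
                    Thread.sleep(serviceMs); // stands in for the sleeping handler
                    return null;
                });
            }
            long start = System.nanoTime();
            pool.invokeAll(handlers); // returns once every handler has finished
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 worker:   " + elapsedMs(10, 1, 100) + " ms");
        System.out.println("10 workers: " + elapsedMs(10, 10, 100) + " ms");
    }
}
```

On an otherwise idle machine, the single-worker run should take roughly ten times as long as the ten-worker run, because the ten sleeps happen back to back instead of in parallel. Exact numbers depend on the machine and scheduler.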
Thrift 0.8 introduced yet another server, TThreadedSelectorServer. The main difference between TThreadedSelectorServer and THsHaServer is that TThreadedSelectorServer allows you to have multiple threads for network I/O. It maintains 2 thread pools, one for handling network I/O and one for handling request processing. TThreadedSelectorServer performs better than THsHaServer when network I/O is the bottleneck. To show the difference, I ran a benchmark with a handler that returns immediately without doing anything, and measured the average latency and throughput with a varying number of clients. I used 32 worker threads for THsHaServer, and 16 worker threads/16 selector threads for TThreadedSelectorServer.
The result shows that TThreadedSelectorServer has much higher throughput than THsHaServer while maintaining lower latency.
Finally, there is TThreadPoolServer. TThreadPoolServer is different from the other 3 servers in that:
· There is a dedicated thread for accepting connections.
· Once a connection is accepted, it gets scheduled to be processed by a worker thread in ThreadPoolExecutor.
· The worker thread is tied to the specific client connection until it's closed. Once the connection is closed, the worker thread goes back to the thread pool.
· You can configure both the minimum and maximum number of threads in the thread pool. Default values are 5 and Integer.MAX_VALUE, respectively.
This means that if there are 10000 concurrent client connections, you need to run 10000 threads. As such, it is not as resource-friendly as the other servers. Also, if the number of clients exceeds the maximum number of threads in the thread pool, requests will be blocked until a worker thread becomes available.
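The "one worker per open connection" behavior can be illustrated with a small JDK-only sketch. The assumptions here are mine: ThreadPerConnectionSketch is a hypothetical class, a connection is modeled as a task that holds its worker until a latch is released, and the waiting is modeled with a fixed-size pool and its queue, which is not necessarily how TThreadPoolServer implements it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-JDK illustration of one worker thread per open connection; not Thrift code.
class ThreadPerConnectionSketch {

    // Opens maxThreads long-lived "connections" plus one extra. Returns true if the
    // extra connection was served only after the others were closed.
    static boolean extraConnectionWaits(int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        CountDownLatch allWorkersBusy = new CountDownLatch(maxThreads);
        CountDownLatch closeConnections = new CountDownLatch(1);
        CountDownLatch extraDone = new CountDownLatch(1);
        AtomicBoolean extraServed = new AtomicBoolean(false);
        try {
            for (int i = 0; i < maxThreads; i++) {
                pool.execute(() -> {
                    allWorkersBusy.countDown();
                    try {
                        // The worker stays pinned while its connection is open.
                        closeConnections.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // One connection too many: no worker is free, so it has to wait.
            pool.execute(() -> {
                extraServed.set(true);
                extraDone.countDown();
            });
            allWorkersBusy.await();
            boolean servedWhileBusy = extraServed.get();
            closeConnections.countDown(); // the first clients disconnect
            extraDone.await();
            return !servedWhileBusy && extraServed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(extraConnectionWaits(2));
    }
}
```

The extra connection makes no progress for as long as the earlier clients stay connected, no matter how idle those clients are. That is the trade-off to keep in mind when you cap the maximum number of threads.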
Having said that, TThreadPoolServer performs very well; on the box I'm using it's able to support 10000 concurrent clients without any problem. If you know the number of clients that will be connecting to your server in advance and you don't mind running a lot of threads, TThreadPoolServer might be a good choice for you.
Conclusion
I hope this article helps you decide which Thrift server is right for you. I think TThreadedSelectorServer is a safe choice for most use cases. You might also want to consider TThreadPoolServer if you can afford to run lots of concurrent threads. Feel free to email me at mapkeeper-users@googlegroups.com or post your comments here if you have any questions or comments.
    Processors: 2 x Xeon E5620 2.40GHz (HT enabled, 8 cores, 16 threads)
    Memory: 8GB
    Network: 1Gb/s <full-duplex>
    OS: RHEL Server 5.4 Linux 2.6.18-164.2.1.el5 x86_64
It's pretty straightforward to run the benchmark yourself. First, clone the MapKeeper repository and compile the stub Java server:
    git clone git://github.com/m1ch1/mapkeeper.git
    cd mapkeeper/stubjava
    make
Then, start the server you'd like to benchmark:
    make run mode=threadpool    # run TThreadPoolServer
    make run mode=nonblocking   # run TNonblockingServer
    make run mode=hsha          # run THsHaServer
    make run mode=selector      # run TThreadedSelectorServer
Then, clone the YCSB repository and compile it:
    git clone git://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn clean package
Once the compilation finishes, you can run YCSB against the stub server:
    ./bin/ycsb load mapkeeper -P ./workloads/workloada
For more detailed information about how to use YCSB, check out their wiki page.