At Nominet we have successfully migrated the remoting protocol used for communication between our client applications and the software running in the middleware layer. We moved from SOAP to Hessian, refactoring the code base along the way. This post explains why we migrated, the advantages of the migration and how we tested different protocols before deciding on the move.
Starting point
For reasons that are not relevant here, Nominet middleware services were exposed internally to client applications using Glue, a proprietary SOAP implementation. The middleware code was tightly coupled to this library, which was inconvenient because it made it difficult to introduce a different protocol.
SOAP is a good choice when you need to expose services in an environment where you do not control the client side and want to provide high compatibility. It is also good for small stateless services with fast processing and little data transferred in and out. However, because XML processing is heavy, SOAP is not appropriate for services that may move a significant amount of information in and out. Additionally, in an environment where you control both server and client code, there are better communication protocol options.
In our environment, client–middleware communication required huge amounts of data going back and forth, and using SOAP was doing more harm than good. We were not comfortable with this situation and started to study the possibility of moving to a lightweight open source protocol, taking the opportunity to refactor the code in order to reduce its dependency on the chosen communication protocol.
Protocol testing and comparison
To decide which protocol to implement, we first researched the available options and, after picking a small group of candidates, tested them in our environment. The tests were done in two phases. In the first phase we tested all the initially considered protocols with a light load and compared them. With the first phase results in hand, we discussed whether to use a Java-only protocol or one that allows a mixture of client platforms, and decided to test the best protocol in each category further. In the second phase we therefore tested just two protocols, a Java-only one and one allowing multiple types of client, to see how they behaved under heavy loads.
We initially looked at the following protocols:
- Hessian
Lightweight binary protocol from Caucho. HTTP-based, with a custom binary serialization mechanism. Supports several platforms: PHP, Python, C++, C#, Objective-C, Ruby and Java.
- Burlap
XML-based lightweight protocol from Caucho. HTTP-based, with a custom XML-based serialization mechanism. We are not aware of support for platforms other than Java.
- Spring HttpInvoker
Spring Java-to-Java remoting. HTTP-based, uses Java serialization just like RMI, and is easy to set up.
- RMI JRMP Protocol
Java remoting standard. Each remote method needs to declare a checked RemoteException, and stubs and skeletons need to be generated.
- Glue SOAP
Web-Methods HTTP-based web services (the implementation in use at the time). Supports many different platforms.
Burlap was discarded before testing in favour of Hessian, and the remaining set was tested and compared.
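The RMI requirement mentioned in the list above can be illustrated with a small sketch. It contrasts an RMI-style interface, where every method declares RemoteException, with a plain Java interface of the kind that HTTP-based options such as Spring HttpInvoker and Hessian can expose. The interface names and the helper method are hypothetical and are not taken from the Nominet code.

```java
import java.lang.reflect.Method;
import java.rmi.Remote;
import java.rmi.RemoteException;

class RemoteInterfaceSketch {

    // RMI style: the interface extends Remote and every method declares RemoteException.
    interface RmiEchoService extends Remote {
        String echo(String text) throws RemoteException;
    }

    // Plain style: an ordinary Java interface with no remoting types in its signature.
    interface EchoService {
        String echo(String text);
    }

    // Hypothetical helper: true if every method of the interface declares
    // RemoteException or one of its superclasses.
    static boolean declaresRemoteExceptionEverywhere(Class<?> api) {
        for (Method m : api.getMethods()) {
            boolean declared = false;
            for (Class<?> ex : m.getExceptionTypes()) {
                if (ex.isAssignableFrom(RemoteException.class)) {
                    declared = true;
                }
            }
            if (!declared) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(declaresRemoteExceptionEverywhere(RmiEchoService.class)); // true
        System.out.println(declaresRemoteExceptionEverywhere(EchoService.class));    // false
    }
}
```

The practical difference is that the checked exception leaks the transport into every caller of the RMI-style interface, while the plain interface keeps client code unaware of how the call travels.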
First phase of protocol testing
The aim was to compare performance within the existing infrastructure, changing only the wire protocol, so that each protocol could be compared with the existing implementation based on Glue.
The Glue measurements were taken using the existing code at the same repository revision from which the code was branched to make the changes needed for protocol independence.
For the other protocols, the same test was run with the additional infrastructure that supports protocol independence. The other protocols were therefore tested under the same conditions as Glue, or even with a slight overhead due to the additional layer of code.
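The post does not describe how the protocol independence layer was built, so the following is only a minimal sketch of one common way to get that effect: client code depends on a plain service interface, and a dynamic proxy forwards each call to a pluggable transport. GreetingService, Transport, clientFor and loopback are all hypothetical names, and the loopback transport simply calls a local object where a real one would serialize the call onto the wire.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProtocolIndependenceSketch {

    // Hypothetical service API: a plain interface with no protocol-specific types.
    interface GreetingService {
        String greet(String name);
    }

    // Hypothetical transport abstraction; one implementation per wire protocol.
    interface Transport {
        Object call(Method method, Object[] args) throws Exception;
    }

    // Builds a client-side proxy that forwards every interface call to the transport.
    static <T> T clientFor(Class<T> api, Transport transport) {
        InvocationHandler handler = (p, method, args) -> transport.call(method, args);
        Object instance = Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, handler);
        return api.cast(instance);
    }

    // Stand-in transport for this sketch: invokes a local implementation directly.
    static Transport loopback(Object implementation) {
        return (method, args) -> method.invoke(implementation, args);
    }

    public static void main(String[] args) {
        GreetingService local = name -> "hello, " + name;
        GreetingService client = clientFor(GreetingService.class, loopback(local));
        System.out.println(client.greet("middleware")); // hello, middleware
    }
}
```

With this shape, switching protocol means supplying a different Transport while the service interface and its callers stay unchanged, at the cost of the extra indirection that the slight overhead above refers to.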
The server and client for the test ran in two JVMs on the same host to avoid network perturbations. JIT compilation and garbage collection were enabled in order to obtain the best performance and to evaluate the common case.
The test consisted of repeated calls to the middleware services, passing a structure of increasing complexity. The called method just replicates the structure and returns the copy.
The structure passed in each call is a 4-element ArrayList in which the first element is a fixed 50-character String, the second is an n-element ArrayList of Integers, the third is an n-element ArrayList of UserDetails objects and the last is an n-element ArrayList of TestStructureBean objects. The array length shown in the data is the number of elements in each of the inner ArrayLists, so for an array length of 200 the structure will have:
- 1 String
- 200 Integer
- 200 UserDetails objects
- 200 TestStructureBean objects
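As an illustration, the structure described above could be built along these lines. The fields of UserDetails and TestStructureBean are not given in this post, so the two classes below are hypothetical stand-ins with made-up fields, and the content of the fixed String is arbitrary.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class TestStructureSketch {

    // Hypothetical stand-in: the real UserDetails fields are not given in the post.
    static class UserDetails implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String name;

        UserDetails(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Hypothetical stand-in for TestStructureBean.
    static class TestStructureBean implements Serializable {
        private static final long serialVersionUID = 1L;
        final int index;
        final String label;

        TestStructureBean(int index, String label) {
            this.index = index;
            this.label = label;
        }
    }

    // Builds the 4-element list: fixed String, n Integers, n UserDetails, n beans.
    static List<Object> build(int n) {
        StringBuilder fixed = new StringBuilder(50);
        for (int i = 0; i < 50; i++) {
            fixed.append('x');
        }
        List<Integer> integers = new ArrayList<>(n);
        List<UserDetails> users = new ArrayList<>(n);
        List<TestStructureBean> beans = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            integers.add(i);
            users.add(new UserDetails(i, "user" + i));
            beans.add(new TestStructureBean(i, "bean" + i));
        }
        List<Object> structure = new ArrayList<>(4);
        structure.add(fixed.toString());
        structure.add(integers);
        structure.add(users);
        structure.add(beans);
        return structure;
    }

    public static void main(String[] args) {
        List<Object> structure = build(200);
        System.out.println("top-level elements: " + structure.size()); // 4
    }
}
```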
Once the structure is built for a given array length, the service is called repeatedly using the same structure in each call. The service is first called a number of times without taking any measurement, to allow for caching, and afterwards the elapsed time for 100 calls is measured.
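The measurement procedure can be sketched roughly as follows. This is not the original test code: the EchoService interface and the local implementation are hypothetical stand-ins for a remoting proxy, and the warm-up count of 50 is an assumption, since the post does not state how many unmeasured calls were made.

```java
import java.util.ArrayList;
import java.util.List;

class TimingHarnessSketch {

    // Hypothetical service interface; the post only says the method copies its argument.
    interface EchoService {
        List<Object> replicate(List<Object> structure);
    }

    // Local stand-in; in a real test a remoting proxy would sit behind the interface.
    static class LocalEchoService implements EchoService {
        @Override
        public List<Object> replicate(List<Object> structure) {
            return new ArrayList<>(structure); // shallow copy is enough for the sketch
        }
    }

    // Runs unmeasured warm-up calls, then returns the elapsed milliseconds
    // for the measured calls, all using the same structure.
    static long timeCallsMillis(EchoService service, List<Object> structure,
                                int warmupCalls, int measuredCalls) {
        for (int i = 0; i < warmupCalls; i++) {
            service.replicate(structure);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredCalls; i++) {
            service.replicate(structure);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        List<Object> structure = new ArrayList<>();
        structure.add("placeholder payload");
        // 50 warm-up calls is an assumed value; 100 measured calls matches the post.
        long elapsed = timeCallsMillis(new LocalEchoService(), structure, 50, 100);
        System.out.println("elapsed ms for 100 calls: " + elapsed);
    }
}
```

Warming up before measuring matters here because JIT compilation was enabled, so the first calls would otherwise include compilation time rather than steady-state protocol cost.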
Test results
The resulting time for 100 iterations at increasing structure lengths is shown in the table below and also in two charts, the first one plotting the data for all protocols and the second one plotting the same data with Glue removed.
Conclusion
As can be seen from the data and the first chart, Glue has a large processing overhead. As the size of the transferred data increases, its response time grows much faster than that of the other protocols, whose response times grow only moderately with data size.
Among the other three protocols, HTTP Invoker is the fastest in all cases except for tiny data sizes. Hessian performs well for small and medium data sizes, even better than RMI, but its response time degrades as the data size increases.
HTTP Invoker is therefore a good choice for Java-to-Java remoting, and Hessian is adequate if other languages are involved on the client side.
Second phase of protocol testing
With the results of the first testing round in hand, we decided to focus on the two most suitable options for our purposes. We therefore chose HTTP Invoker and Hessian and tested them further with heavier loads.
Two different tests were performed in this phase. The first is the same test used in the first phase, but with longer arrays (thousands of elements) and fewer call iterations for each array length. The second consists of calling a method with a big String as a parameter and receiving the same String back as the result. For this test we used Strings ranging from 100k to 550k characters. In both cases the tests performed 10 call iterations.
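A rough sketch of the big String test is shown below. The StringEchoService interface is a hypothetical stand-in for the remote method, the document content is arbitrary, and the 50k step between sizes is an assumption: the post gives only the range of 100k to 550k characters and the 10 iterations.

```java
import java.util.Arrays;

class BigStringTestSketch {

    // Hypothetical interface for the method that returns the String it receives.
    interface StringEchoService {
        String echo(String document);
    }

    // Builds a document of the requested length; the content itself is arbitrary.
    static String document(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    // Times the given number of calls and checks that each reply has the full length.
    static long timeEchoMillis(StringEchoService service, String doc, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String reply = service.echo(doc);
            if (reply.length() != doc.length()) {
                throw new IllegalStateException("reply length differs from request");
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        StringEchoService local = doc -> doc; // local stand-in for a remoting proxy
        // The 50k step is assumed; only the 100k to 550k range comes from the post.
        for (int size = 100_000; size <= 550_000; size += 50_000) {
            long elapsed = timeEchoMillis(local, document(size), 10);
            System.out.println(size + " chars, 10 calls: " + elapsed + " ms");
        }
    }
}
```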
Test results
The resulting measured times for 10 iterations at different structure lengths and document sizes are shown in the tables below. The data is also shown graphically in two bar charts and, for ease of comparison, two more charts display the response time ratio between Hessian and HTTP Invoker for the heavy structure and big document tests.
Conclusion
HTTP Invoker was, as expected, faster than Hessian. However, the idea of these tests was to check how the two protocols compare for big chunks of data from two points of view: a big structure formed by many small objects, and a single big chunk of data such as a String of hundreds of thousands of characters.
As can be seen in the ratio charts, Hessian performs relatively better when transferring data in one single piece. Although the two ratio charts are not directly comparable, it is worth noting that for structures of many small objects the ratio is almost always over 1.6, while for single big objects the ratio is mainly below 1.6.
We finally decided to use Hessian because it supports a range of client platforms, and a small percentage of the applications that will access the middleware services are non-Java applications on platforms supported by Hessian. We could actually expose our middleware services with both protocols at the same time, but the performance gain would not compensate for the added complexity. Although Hessian is on average about 50% slower than HTTP Invoker, it will certainly be a huge improvement over Glue.
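To make the ratio figures concrete, here is a small sketch of how a response time ratio relates to a "percent slower" figure. It assumes the ratio is Hessian time divided by HTTP Invoker time, which fits the values above 1 quoted in the text but is not stated explicitly. The input values in main are made-up illustrative numbers, not measurements from these tests.

```java
class RatioSketch {

    // Assumed definition: Hessian response time divided by HTTP Invoker response time.
    static double ratio(double hessianMillis, double httpInvokerMillis) {
        if (httpInvokerMillis <= 0) {
            throw new IllegalArgumentException("HTTP Invoker time must be positive");
        }
        return hessianMillis / httpInvokerMillis;
    }

    // Converts a ratio into "percent slower": a ratio of 1.5 means 50% slower.
    static double percentSlower(double ratio) {
        return (ratio - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Illustrative inputs only; these are not measured values.
        double r = ratio(150.0, 100.0);
        System.out.println("ratio: " + r);                          // ratio: 1.5
        System.out.println("percent slower: " + percentSlower(r));  // percent slower: 50.0
    }
}
```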
April 10th, 2007 at 3:44 pm
Very nice report.
I didn’t expect HTTP Invoker to be faster than RMI!
Johan.
July 18th, 2007 at 9:12 am
This is indeed a nice and helpful report. A supplemental measure which still has to be added is network latency. I presume the test was done on a LAN. Recent experience shows that the impact of long-distance WAN connections is also important.
July 18th, 2007 at 10:07 am
Network latency did not affect the results, as the server and client for the test ran in two JVMs on the same host to avoid network perturbations. Obviously there is always some latency, but it is certainly not significant for the size of the samples.
October 30th, 2007 at 10:15 am
Really useful report, it confirms the good work of HttpInvoker in using Java serialization algorithm, as in RMI.
January 7th, 2008 at 8:39 am
Very good for reference, but it would be best if you could post the source code used for this test.
January 8th, 2008 at 4:24 am
[…] Nominet’s Protocol Benchmarks: Interesting benchmark of many of the protocols here considered. Intriguingly, HttpInvoker is found to have consistently better performance than RMI/JRMP, a finding which contradicts our results. […]
October 3rd, 2008 at 11:29 am
[…] is a binary web service protocol. It’s quite efficient (see Daniel Gredler’s and Miquel’s benchmarks) and it’s dynamically […]