Java Best Practices – Char to Byte and Byte to Char conversions
From Java Code Geeks, by Justin Cater
Continuing our series of articles on proposed practices for working with the Java programming language, we are going to talk about String performance tuning. In particular, we will focus on how to handle character to byte and byte to character conversions efficiently when the default encoding is used (UTF-8 was the platform default on our test system; Java itself stores characters as 16-bit UTF-16 code units). This article concludes with a performance comparison between two proposed custom approaches and two classic ones (the “String.getBytes()” and the NIO ByteBuffer) for converting characters to bytes and vice versa.
All discussed topics are based on use cases derived from the development of mission critical, ultra high performance production systems for the telecommunication industry.
Before reading each section of this article, it is highly recommended that you consult the relevant Java API documentation for detailed information and code samples.
All tests were performed on a Sony Vaio with the following characteristics:
* System : openSUSE 11.1 (x86_64)
* Processor (CPU) : Intel(R) Core(TM)2 Duo CPU T6670 @ 2.20GHz
* Processor Speed : 1,200.00 MHz
* Total memory (RAM) : 2.8 GB
* Java : OpenJDK 1.6.0_0 64-Bit
The following test configuration was applied:
* Concurrent worker Threads : 1
* Test repeats per worker Thread : 1000000
* Overall test runs : 100
Char to Byte and Byte to Char conversions
Character to byte and byte to character conversions are common tasks for Java developers who program against a networking environment, manipulate streams of byte data, serialize String objects, implement communication protocols, etc. For that reason Java provides a handful of utilities that enable a developer to convert a String (or a character array) to its byte array equivalent and vice versa.
The “getBytes(charsetName)” operation of the String class is probably the most commonly used method for converting a String into its byte array equivalent. Since every character can be represented differently depending on the encoding scheme used, it is no surprise that this operation requires a “charsetName” in order to convert the String characters correctly. If no “charsetName” is provided, the operation encodes the String into a sequence of bytes using the platform's default character set (UTF-8 on our test system).
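As a minimal sketch of the difference between the two calls, the snippet below encodes the same String with an explicit charset and with the platform default. The class and method names are hypothetical, and it assumes Java 7 or later for `StandardCharsets` (the tests in this article ran on an older JDK, where the `getBytes(String charsetName)` overload would be used instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
final class GetBytesDemo {
    // Explicit charset: the result does not depend on the platform.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // No charset argument: uses whatever Charset.defaultCharset() reports on this JVM.
    static byte[] encodeDefault(String s) {
        return s.getBytes();
    }

    public static void main(String[] args) {
        String s = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(encodeUtf8(s).length);                            // 2
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length);  // 1
    }
}
```

The same character takes two bytes in UTF-8 and one in ISO-8859-1, which is why relying on the default charset can give different results on different machines.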
Another “classic” approach for converting a character array to its byte array equivalent is by using the ByteBuffer class of the NIO package. An example code snippet for the specific approach will be provided later on.
Both of the aforementioned approaches, although very popular, easy to use and straightforward, lack in performance compared to more fine-grained methods. Keep in mind that we are not converting between character encodings. For converting between character encodings you should stick with the “classic” approaches, using either “String.getBytes(charsetName)” or the NIO framework methods and utilities.
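For completeness, converting between encodings with the “classic” API might look like the sketch below. The helper name `latin1ToUtf8` is hypothetical, and Java 7 or later is assumed for `StandardCharsets`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
final class TranscodeDemo {
    // Re-encode ISO-8859-1 bytes as UTF-8 by decoding to a String first.
    static byte[] latin1ToUtf8(byte[] latin1) {
        String decoded = new String(latin1, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = latin1ToUtf8(new byte[] { (byte) 0xE9 });
        System.out.println(utf8.length); // 2
    }
}
```

The custom methods discussed in this article cannot do this kind of work, because they copy character values without interpreting them under any charset.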
When all characters to be converted are ASCII characters, a proposed conversion method is the one shown below:
public static byte[] stringToBytesASCII(String str) {
    char[] buffer = str.toCharArray();
    byte[] b = new byte[buffer.length];
    for (int i = 0; i < b.length; i++) {
        // Keep only the low 8 bits; valid only for characters in the ASCII range.
        b[i] = (byte) buffer[i];
    }
    return b;
}
The resulting byte array is constructed by casting every character value to its byte equivalent; since we know that all characters are in the ASCII range (0 – 127), each one fits in a single byte. Any character outside that range would be silently truncated by the cast.
Using the resulting byte array we can convert back to the original String by utilizing the “classic” String constructor “new String(byte[])”.
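A small round trip illustrates both the conversion and its limit. The sketch repeats the article's cast-based method so that it is self-contained; `isAscii` and `bytesToStringASCII` are hypothetical helpers added for illustration, and the explicit `US_ASCII` charset (Java 7 or later) is an assumption made so that decoding does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
final class AsciiRoundTrip {
    // Same cast-based conversion as in the article.
    static byte[] stringToBytesASCII(String str) {
        char[] buffer = str.toCharArray();
        byte[] b = new byte[buffer.length];
        for (int i = 0; i < b.length; i++) {
            b[i] = (byte) buffer[i];
        }
        return b;
    }

    // Hypothetical guard: true only if every char is in the range 0-127.
    static boolean isAscii(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical decode helper with an explicit charset.
    static String bytesToStringASCII(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "a test string";
        System.out.println(s.equals(bytesToStringASCII(stringToBytesASCII(s)))); // true
        System.out.println(isAscii("\u20ac"));                                   // false
    }
}
```

Calling a guard such as `isAscii` on every input would cost part of the speed gained, so it is probably better suited to assertions or tests than to the hot path.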
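For characters outside the ASCII range we can use the methods shown below to convert a String to a byte array and vice versa. They copy each Java char, which is a 16-bit UTF-16 code unit, into two bytes, so the output is the String's in-memory representation in big-endian order rather than UTF-8 encoded bytes: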
public static byte[] stringToBytesUTFCustom(String str) {
    char[] buffer = str.toCharArray();
    byte[] b = new byte[buffer.length << 1]; // two bytes per char
    for (int i = 0; i < buffer.length; i++) {
        int bpos = i << 1;
        b[bpos] = (byte) ((buffer[i] & 0xFF00) >> 8); // high byte first
        b[bpos + 1] = (byte) (buffer[i] & 0x00FF);    // then low byte
    }
    return b;
}
Every char in Java occupies 2 bytes. To convert a String to its byte array equivalent we write every character of the String as its 2-byte representation, high byte first.
Using the resulting byte array we can convert back to the original String by utilizing the method provided below:
public static String bytesToStringUTFCustom(byte[] bytes) {
    char[] buffer = new char[bytes.length >> 1]; // one char per two bytes
    for (int i = 0; i < buffer.length; i++) {
        int bpos = i << 1;
        // Mask each byte to avoid sign extension, then combine high and low bytes.
        char c = (char) (((bytes[bpos] & 0x00FF) << 8) + (bytes[bpos + 1] & 0x00FF));
        buffer[i] = c;
    }
    return new String(buffer);
}
We construct every String character from its 2-byte representation. The resulting character array is then turned into a String with the “classic” String constructor “new String(char[])”.
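The two custom methods above write each char as two bytes, high byte first. As a hedged check of what that means in practice, the sketch below repeats them inside a hypothetical `Utf16Check` class and compares their output with the standard `UTF_16BE` and `UTF_8` charsets (Java 7 or later assumed for `StandardCharsets`).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original article.
final class Utf16Check {
    // The article's encoder: two bytes per char, high byte first.
    static byte[] stringToBytesUTFCustom(String str) {
        char[] buffer = str.toCharArray();
        byte[] b = new byte[buffer.length << 1];
        for (int i = 0; i < buffer.length; i++) {
            int bpos = i << 1;
            b[bpos] = (byte) ((buffer[i] & 0xFF00) >> 8);
            b[bpos + 1] = (byte) (buffer[i] & 0x00FF);
        }
        return b;
    }

    // The article's decoder: rebuilds each char from two bytes.
    static String bytesToStringUTFCustom(byte[] bytes) {
        char[] buffer = new char[bytes.length >> 1];
        for (int i = 0; i < buffer.length; i++) {
            int bpos = i << 1;
            buffer[i] = (char) (((bytes[bpos] & 0x00FF) << 8) + (bytes[bpos + 1] & 0x00FF));
        }
        return new String(buffer);
    }

    public static void main(String[] args) {
        String s = "a test string";
        byte[] custom = stringToBytesUTFCustom(s);
        System.out.println(custom.length);                             // 26
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 13
        System.out.println(Arrays.equals(custom, s.getBytes(StandardCharsets.UTF_16BE))); // true
        System.out.println(s.equals(bytesToStringUTFCustom(custom)));  // true
    }
}
```

For well-formed text the custom output matches `UTF_16BE` byte for byte, so it is twice as large as UTF-8 for ASCII input and should only be read back by a decoder that expects the same two-byte layout.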
Last but not least, we provide two example methods that use the NIO package to convert a String to its byte array equivalent and vice versa:
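import java.nio.ByteBuffer;
import java.nio.CharBuffer;

public static byte[] stringToBytesUTFNIO(String str) {
    char[] buffer = str.toCharArray();
    byte[] b = new byte[buffer.length << 1];
    // Char view over the byte array; writes go through to b (big-endian by default).
    CharBuffer cBuffer = ByteBuffer.wrap(b).asCharBuffer();
    for (int i = 0; i < buffer.length; i++) {
        cBuffer.put(buffer[i]);
    }
    return b;
}

public static String bytesToStringUTFNIO(byte[] bytes) {
    // Read the bytes back as chars through the same kind of view.
    CharBuffer cBuffer = ByteBuffer.wrap(bytes).asCharBuffer();
    return cBuffer.toString();
}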
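The NIO methods above rely on the default big-endian byte order of a freshly wrapped `ByteBuffer`. The sketch below, with hypothetical `toBytes` and `fromBytes` helpers, makes the byte order an explicit parameter to show how it affects the output; it is an illustration under that assumption, not a replacement for the methods in the article.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.CharBuffer;

// Hypothetical demo class, not part of the original article.
final class NioByteOrderDemo {
    // Variant of the article's NIO encoder with an explicit byte order.
    static byte[] toBytes(String str, ByteOrder order) {
        byte[] b = new byte[str.length() << 1];
        CharBuffer view = ByteBuffer.wrap(b).order(order).asCharBuffer();
        view.put(str); // writes every char of str through to b
        return b;
    }

    // Matching decoder; the byte order must be the one used for encoding.
    static String fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).asCharBuffer().toString();
    }

    public static void main(String[] args) {
        byte[] big = toBytes("A", ByteOrder.BIG_ENDIAN);       // {0x00, 0x41}
        byte[] little = toBytes("A", ByteOrder.LITTLE_ENDIAN); // {0x41, 0x00}
        System.out.println(big[1] == little[0]);                            // true
        System.out.println(fromBytes(big, ByteOrder.BIG_ENDIAN));           // A
    }
}
```

Both sides of a connection need to agree on the byte order, since decoding big-endian bytes as little-endian yields different characters.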
For the final part of this article we provide the performance comparison charts for the aforementioned String to byte array and byte array to String conversion approaches. We have tested all methods using the input string “a test string”.
First, the String to byte array conversion performance comparison chart:
The horizontal axis represents the number of test runs and the vertical axis the average transactions per second (TPS) for each test run, so higher values are better. As expected, both the “String.getBytes()” and “stringToBytesUTFNIO(String)” approaches performed poorly compared to the suggested “stringToBytesASCII(String)” and “stringToBytesUTFCustom(String)” approaches. As you can see, our proposed methods achieve an almost 30% increase in TPS compared to the “classic” methods.
Lastly, the byte array to String performance comparison chart:
The horizontal axis represents the number of test runs and the vertical axis the average transactions per second (TPS) for each test run, so higher values are better. As expected, both the “new String(byte[])” and “bytesToStringUTFNIO(byte[])” approaches performed poorly compared to the suggested “bytesToStringUTFCustom(byte[])” approach. As you can see, our proposed method achieved an almost 15% increase in TPS compared to the “new String(byte[])” method, and an almost 30% increase in TPS compared to the “bytesToStringUTFNIO(byte[])” method.
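The harness used to produce these charts is not shown in the article. A minimal sketch of how TPS figures of this kind could be gathered is given below; the `TpsHarness` class, the `measureTps` helper and the `sink` field are hypothetical, the repeat and run counts are taken from the test configuration listed earlier, and Java 8 or later is assumed for the lambda. A dedicated benchmarking tool such as JMH would likely give more reliable numbers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical timing harness, not the one used for the article's charts.
final class TpsHarness {
    // Written on every call so the JIT is less likely to discard the conversion.
    static volatile int sink;

    // Runs the task `repeats` times and returns calls per second.
    static double measureTps(Runnable task, int repeats) {
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            task.run();
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return repeats * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        final String input = "a test string";
        final int repeats = 1_000_000; // test repeats per worker thread
        final int runs = 100;          // overall test runs
        for (int run = 1; run <= runs; run++) {
            double tps = measureTps(
                    () -> { sink = input.getBytes(StandardCharsets.UTF_8).length; },
                    repeats);
            System.out.printf("run %d: %.0f TPS%n", run, tps);
        }
    }
}
```

The first few runs are usually slower while the JIT compiler warms up, which is one reason to plot every run rather than a single average.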
In conclusion, when you are dealing with character to byte or byte to character conversions and you do not intend to change the encoding used, you can achieve superior performance by utilizing custom, fine-grained methods rather than the “classic” ones provided by the String class and the NIO package. Our proposed approach achieved an overall 45% increase in performance compared to the “classic” approaches when converting a String to its byte array equivalent and vice versa.
Happy coding
Justin