The default character encoding, or charset, in Java is the encoding the JVM uses to convert bytes into Strings or characters when you don't define the Java system property "file.encoding". Java reads this property once, at JVM start-up. If no file.encoding value is supplied, the JVM chooses a default on its own: UTF-8 on JDK 18 and later, and an encoding derived from the operating system locale on older versions. That default is then used wherever no charset is given explicitly, e.g. by String.getBytes() or Charset.defaultCharset().
The most important point to remember is that Java caches the character encoding, i.e. the value of the system property "file.encoding", in most of its core classes, such as InputStreamReader, which need a character encoding after the JVM has started. So if you change the system property "file.encoding" programmatically, you will not see the desired effect. That is why you should always pass your own character encoding explicitly in your application, and if the default really needs to be changed, set it when you start the JVM. In this Java tutorial we will see a couple of different ways to set the default character encoding or charset of Java and how to retrieve its value inside a Java program.
Default Character encoding or Charset in Java
This article is a continuation of my posts on Java String, such as Why String is immutable in Java and How SubString method works in java. If you haven't read those, you may find them interesting.
What is character encoding in Java
For those who are not very familiar with character encoding or charsets in Java, here is a layman's introduction: "All data in a computer is represented as bytes, and Strings are essentially collections of characters. To convert bytes into characters, the JVM needs to know which combination of bytes represents which character, and this is what the character encoding tells the JVM. Since there are many languages in the world other than English, like Hindi, Mandarin, Japanese Kanji etc., and so many characters, the same combination of bytes can represent different characters in different character encodings. That's why using the correct character encoding is a must while converting bytes into a String in Java."
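To make this concrete, here is a minimal sketch of the same two bytes decoded with two different charsets. The class and method names (SameBytesDifferentText, decode, codePoints) are made up for illustration; only the JDK classes are real. It prints code points rather than the characters themselves, so the result does not depend on the console's own encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SameBytesDifferentText {

    // Decodes the bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Renders each code point of the string as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is how UTF-8 encodes the single character U+00E9 (e with acute accent).
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        String asUtf8 = decode(bytes, StandardCharsets.UTF_8);
        String asLatin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println("as UTF-8:      " + codePoints(asUtf8));
        System.out.println("as ISO-8859-1: " + codePoints(asLatin1));
    }
}
```

Decoded as UTF-8 the two bytes form one character (U+00E9); decoded as ISO-8859-1 each byte becomes its own character (U+00C3 U+00A9), which is the classic "garbled text" symptom of a wrong charset.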
How to get Default character encoding in Java?
There are multiple ways to get the default character encoding in Java, such as reading the system property "file.encoding" or using the java.nio.charset.Charset class. You can choose whichever suits your need. Let's see them in detail.
1) "file.encoding" system property
The easiest way to get the default character encoding in Java is to call System.getProperty("file.encoding"). This returns the default character encoding, whether or not the JVM was started with the -Dfile.encoding property, as long as the program has not called System.setProperty("file.encoding", encoding). In the latter case it may just return the new value of that system property, while various core Java classes keep using the encoding that was cached at JVM start-up.
2) java.nio.charset.Charset
java.nio.charset.Charset provides a convenient static method, Charset.defaultCharset(), which returns the default character encoding in Java. See the example of getting the default character encoding using Charset in the code section.
3) InputStreamReader.getEncoding()
This is a kind of shortcut: you create an InputStreamReader with the constructor that takes no charset and then ask which character encoding it has used by calling reader.getEncoding(). See the code example of how to get the default character encoding using the InputStreamReader.getEncoding() method in the code section.
How to set Default character encoding in Java?
Just as there are different ways of getting the default character encoding or charset in Java, there are several ways to set it. Here are some of them:
1. Using System property "file.encoding"
Provide the file.encoding system property when the JVM starts, e.g. java -Dfile.encoding="UTF-8" HelloWorld.
2. Using Environment variable "JAVA_TOOL_OPTIONS"
If you have no control over how the JVM starts, for example because it is launched by a script that offers no way to pass system properties, you can set the environment variable JAVA_TOOL_OPTIONS to -Dfile.encoding="UTF-16" or any other character encoding, and it will be picked up by any JVM that starts on your machine. The JVM will also print "Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF16" on the console to indicate that it has picked up JAVA_TOOL_OPTIONS. Here is an example of setting the default character encoding using JAVA_TOOL_OPTIONS:
test@system:~/java java HelloWorld
þÿExecuting HelloWorld
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF16
Code Example to Get and Set Default Character Encoding in Java
Here is a code example of getting and setting the default character encoding in Java:
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class CharacterEncodingExample {

    public static void main(String args[]) {
        // Three ways to read the default character encoding
        String defaultCharacterEncoding = System.getProperty("file.encoding");
        System.out.println("defaultCharacterEncoding by property: " + defaultCharacterEncoding);
        System.out.println("defaultCharacterEncoding by code: " + getDefaultCharEncoding());
        System.out.println("defaultCharacterEncoding by charSet: " + Charset.defaultCharset());

        // Changing the property at runtime only changes the property value;
        // the default cached by core classes at start-up stays the same
        System.setProperty("file.encoding", "UTF-16");
        System.out.println("defaultCharacterEncoding by property after updating file.encoding : " + System.getProperty("file.encoding"));
        System.out.println("defaultCharacterEncoding by code after updating file.encoding : " + getDefaultCharEncoding());
        System.out.println("defaultCharacterEncoding by java.nio.Charset after updating file.encoding : " + Charset.defaultCharset());
    }

    // Returns the encoding chosen by an InputStreamReader created without an explicit charset
    public static String getDefaultCharEncoding() {
        byte[] bArray = {'w'};
        InputStream is = new ByteArrayInputStream(bArray);
        InputStreamReader reader = new InputStreamReader(is);
        return reader.getEncoding();
    }
}
Output:
defaultCharacterEncoding by property: UTF-8
defaultCharacterEncoding by code: UTF8
defaultCharacterEncoding by charSet: UTF-8
defaultCharacterEncoding by property after updating file.encoding : UTF-16
defaultCharacterEncoding by code after updating file.encoding : UTF8
defaultCharacterEncoding by java.nio.Charset after updating file.encoding : UTF-8
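Note that the output shows "UTF8" from InputStreamReader.getEncoding() but "UTF-8" from Charset. Both refer to the same charset: getEncoding() may return a historical name, while Charset reports the canonical one. If you need to compare the two, a small sketch like the following can normalize the name first. The class and method names (CanonicalCharsetName, canonicalName) are hypothetical helpers, not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CanonicalCharsetName {

    // Maps any supported charset name or alias (e.g. "UTF8") to its canonical name.
    static String canonicalName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        // A reader over an empty in-memory stream; nothing here needs closing.
        InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), StandardCharsets.UTF_8);

        String reported = reader.getEncoding();
        System.out.println("reported by reader: " + reported);
        System.out.println("canonical name:     " + canonicalName(reported));
    }
}
```

Charset.forName() throws UnsupportedCharsetException for names the JVM does not know, so in real code you may want to guard the call with Charset.isSupported().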
Important points to note:
1) The JVM caches the value of the default character encoding once it starts, and the same holds for the InputStreamReader constructors that take no charset and for other core Java classes. So calling System.setProperty("file.encoding", "UTF-16") may not have the desired effect.
2) Always work with your own character encoding if you can; that is the more accurate and precise way of converting bytes to Strings.
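As an illustration of the second point, here is a minimal sketch that passes the charset explicitly both when encoding and when decoding, so the result does not depend on file.encoding at all. The class and method names (ExplicitCharsetExample, encode, readFirstLine) and the choice of UTF-8 are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetExample {

    // Encodes text with an explicit charset instead of the JVM default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Reads the first line of the data, decoding it explicitly as UTF-8.
    static String readFirstLine(byte[] data) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00E9"; // "cafe" with an accented final letter
        byte[] data = encode(original);
        String decoded = readFirstLine(data);

        System.out.println("bytes: " + data.length);                       // 5: the accented letter takes two bytes
        System.out.println("round trip ok: " + original.equals(decoded));  // true
    }
}
```

Because both sides name the charset, this code behaves the same whatever -Dfile.encoding or JAVA_TOOL_OPTIONS value the JVM was started with.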
That's all on how to get the default character encoding in Java and how to set it. This becomes more important when you are writing an international application which supports multiple languages. I did come across character encoding issues while writing reports in Kanji (Japanese), which I plan to share in another post, but good knowledge of character encodings like UTF-8, UTF-16 or ISO-8859-5 and of how Java supports character encoding in String will certainly help.