The Java serialization algorithm: implementation and explanation


Serialization is the process of saving an object's state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You'll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you're done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.

Why is serialization required?

In today's world, a typical enterprise application will have multiple components and will be distributed across various systems and networks. In Java, everything is represented as objects; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.

Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.

 



 

Figure 1. A high-level view of serialization in action

How to serialize an object

In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializable interface, as shown in Listing 1.

Listing 1. Implementing Serializable

import java.io.Serializable;
class TestSerial implements Serializable {
	public byte version = 100;
	public byte count = 0;
}

 

 

In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializable interface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.
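The effect of the marker interface can be seen by trying to serialize one class that implements Serializable and one that does not. The following is a minimal sketch; the class names MarkerDemo, Plain, and Marked and the helper canSerialize() are hypothetical and do not come from the article.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerDemo {
    // Hypothetical classes for illustration; neither appears in the article.
    static class Plain { byte version = 100; }
    static class Marked implements Serializable { byte version = 100; }

    // Returns true if writeObject() accepts the object, false if it is rejected
    // because its class does not implement Serializable.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Marked())); // true
        System.out.println(canSerialize(new Plain()));  // false
    }
}
```

The two classes have identical fields; only the presence of the marker interface changes the outcome.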

Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.

Listing 2. Calling writeObject()

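import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class WriteDemo { // illustrative wrapper class
	public static void main(String args[]) throws IOException {
		FileOutputStream fos = new FileOutputStream("temp.out");
		ObjectOutputStream oos = new ObjectOutputStream(fos);
		TestSerial ts = new TestSerial();
		// writeObject() runs the serialization algorithm and writes the result to temp.out
		oos.writeObject(ts);
		oos.flush();
		oos.close();
	}
}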

 

 

Listing 2 stores the state of the TestSerial object in a file called temp.out. oos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.
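If writeObject() throws, the streams opened in a listing like the one above are never closed. One possible variant, sketched here under the assumption that Java 7 or later is available, uses try-with-resources so the file is closed on every path. The class name SafeWrite and the helper writeTo() are hypothetical, and TestSerial is re-declared as a nested class only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SafeWrite {
    // Stand-in for the article's TestSerial class.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes obj to the given file and returns the resulting file size in bytes.
    static long writeTo(Path file, Object obj) {
        try {
            // The stream is flushed and closed automatically, even if writeObject() fails.
            try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(file))) {
                oos.writeObject(obj);
            }
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long size = writeTo(Paths.get("temp.out"), new TestSerial());
        System.out.println(size + " bytes written to temp.out");
    }
}
```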

To re-create the object from the persistent file, you would employ the code in Listing 3.

Listing 3. Recreating a serialized object

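import java.io.*;

public class ReadDemo { // illustrative wrapper class
	public static void main(String args[]) throws IOException, ClassNotFoundException {
		FileInputStream fis = new FileInputStream("temp.out");
		ObjectInputStream oin = new ObjectInputStream(fis);
		// readObject() returns Object, so a cast to the expected type is needed
		TestSerial ts = (TestSerial) oin.readObject();
		oin.close();
		System.out.println("version=" + ts.version);
	}
}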

 

 

In Listing 3, the object's restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. Because readObject() can read any serializable object, a cast to the correct type is required.

Executing this code will print version=100 on the standard output.
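The same round trip can be tried without touching the file system. The sketch below serializes into a byte array and reads it back; RoundTrip, toBytes(), and fromBytes() are hypothetical names, and the count field is changed before writing only to show that the restored copy carries the written state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class RoundTrip {
    // Stand-in for the article's TestSerial class.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject(); // declared return type is Object
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestSerial original = new TestSerial();
        original.count = 7;
        TestSerial copy = (TestSerial) fromBytes(toBytes(original));
        System.out.println("version=" + copy.version + " count=" + copy.count); // version=100 count=7
        System.out.println("same instance: " + (copy == original));             // same instance: false
    }
}
```

The copy holds the same field values as the original but is a separate instance.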

The serialized format of an object

What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)

Listing 4. Hexadecimal form of TestSerial

 

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64

 

If you look again at the actual TestSerial object, you'll see that it has only two byte members, as shown in Listing 5.

Listing 5. TestSerial's byte members

	public byte version = 100;	
	public byte count = 0;

 

The size of a byte variable is one byte, and hence the total size of the object (without the header) is two bytes. But if you look at the size of the serialized object in Listing 4, you'll see 51 bytes. Surprise! Where did the extra bytes come from, and what is their significance? They are introduced by the serialization algorithm, and are required in order to re-create the object. In the next section, you'll explore this algorithm in detail.
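A hex editor is not strictly necessary to inspect the stream. As a rough sketch, the bytes can be produced in memory and printed 16 per line; HexDump, serialize(), and toHex() are hypothetical names. Because TestSerial is declared here as a nested class, its name in the stream (and therefore the total size and the serial version UID) will differ from the listing above, but the overall layout is the same.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class HexDump {
    // Stand-in for the article's TestSerial class.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as two-digit uppercase hex, 16 bytes per line.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(i % 16 == 0 ? '\n' : ' ');
            }
            sb.append(String.format("%02X", data[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = serialize(new TestSerial());
        System.out.println(toHex(data));
        System.out.println(data.length + " bytes in total, 2 of them field data");
    }
}
```

The dump starts with AC ED 00 05 and ends with 00 64, the values of count and version.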

Java's serialization algorithm

By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general the serialization algorithm does the following:

  • It writes out the metadata of the class associated with an instance.
  • It recursively writes out the description of each superclass until it reaches a class that is not serializable (java.lang.Object in the examples here).
  • Once it finishes writing the metadata information, it then starts with the actual data associated with the instance. But this time, it starts from the topmost superclass.
  • It recursively writes the data associated with the instance, starting from the topmost serializable superclass and ending with the most-derived class.
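The metadata pass in the steps above can be approximated with java.io.ObjectStreamClass, which exposes the class description that the serialization mechanism works from. This is only a sketch of the traversal order, not the actual writer; DescriptorWalk, Parent, Child, and describe() are hypothetical names.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DescriptorWalk {
    // Hypothetical two-level hierarchy for illustration.
    static class Parent implements Serializable { int parentVersion = 10; }
    static class Child extends Parent { int version = 66; }

    // Collects one line per serializable class, from the most-derived class upward,
    // which is the order in which class descriptions are written.
    static List<String> describe(Class<?> cls) {
        List<String> lines = new ArrayList<>();
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            ObjectStreamClass desc = ObjectStreamClass.lookup(c);
            if (desc == null) {
                break; // lookup() returns null for classes that are not serializable
            }
            StringBuilder sb = new StringBuilder(c.getSimpleName());
            for (ObjectStreamField f : desc.getFields()) {
                sb.append(' ').append(f.getTypeCode()).append(':').append(f.getName());
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints "Child I:version" and then "Parent I:parentVersion"
        for (String line : describe(Child.class)) {
            System.out.println(line);
        }
    }
}
```

The instance data is then written in the opposite direction, Parent first and Child last.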

I've written a different example object for this section that will cover all possible cases. The new sample object to be serialized is shown in Listing 6.

Listing 6. Sample serialized object

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class parent implements Serializable {
	int parentVersion = 10;
}

class contain implements Serializable {
	int containVersion = 11;
}

public class SerialTest extends parent implements Serializable {
	int version = 66;
	contain con = new contain();

	public int getVersion() {
		return version;
	}

	public static void main(String args[]) throws IOException {
		FileOutputStream fos = new FileOutputStream("temp.out");
		ObjectOutputStream oos = new ObjectOutputStream(fos);
		SerialTest st = new SerialTest();
		oos.writeObject(st);
		oos.flush();
		oos.close();
	}
}

 

 

This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and has a container object, contain. The serialized format of this object is shown in Listing 7.

Listing 7. Serialized form of sample object

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B

 

 

Figure 2 offers a high-level look at the serialization algorithm for this scenario.

 

 



 

Figure 2. An outline of the serialization algorithm

Let's go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:

  • AC ED: STREAM_MAGIC. Specifies that this is a serialization protocol.
  • 00 05: STREAM_VERSION. The serialization version.
  • 0x73: TC_OBJECT. Specifies that this is a new Object.

The first step of the serialization algorithm is to write the description of the class associated with an instance. The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 0A: Length of the class name.
  • 53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
  • 05 52 81 5A AC 66 02 F6: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This particular flag says that the object supports serialization.
  • 00 02: Number of fields in this class.

Next, the algorithm writes the field int version = 66;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 07: Length of the field name.
  • 76 65 72 73 69 6F 6E: version, the name of the field.

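And then the algorithm writes the next field, contain con = new contain();. This is an object, so after the field's type code and name it will write the canonical JVM signature of this field.

  • 0x4C: Field type code. 4C represents "L", which stands for an object type.
  • 00 03: Length of the field name.
  • 63 6F 6E: con, the name of the field.
  • 0x74: TC_STRING. Represents a new string.
  • 00 09: Length of the string.
  • 4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
  • 0x78: TC_ENDBLOCKDATA, the end of the optional block data for an object.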
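The two field descriptions of SerialTest in Listing 7 can be decoded in the same way. The sketch below reads a type code and a name for each field and, for object or array fields, the type signature string that follows. It assumes the signature is introduced by TC_STRING, as it is in this stream; FieldDescriptors, fromHex(), and readFields() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class FieldDescriptors {
    // Bytes copied from Listing 7: the two field descriptions and the closing 0x78.
    static final String HEX =
            "49 00 07 76 65 72 73 69 6F 6E "
          + "4C 00 03 63 6F 6E 74 00 09 4C 63 6F 6E 74 61 69 6E 3B 78";

    static byte[] fromHex(String hex) {
        String[] tokens = hex.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }

    static List<String> readFields(byte[] data, int fieldCount) {
        List<String> fields = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < fieldCount; i++) {
                char typeCode = (char) in.readUnsignedByte();
                String name = in.readUTF();
                if (typeCode == 'L' || typeCode == '[') {
                    // Object and array fields carry their JVM type signature as a string.
                    // Assumption: only TC_STRING is handled in this sketch.
                    if (in.readByte() != ObjectStreamConstants.TC_STRING) {
                        throw new IllegalStateException("this sketch handles TC_STRING only");
                    }
                    fields.add(typeCode + " " + name + " " + in.readUTF());
                } else {
                    fields.add(typeCode + " " + name);
                }
            }
            if (in.readByte() != ObjectStreamConstants.TC_ENDBLOCKDATA) {
                throw new IllegalStateException("expected TC_ENDBLOCKDATA");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(readFields(fromHex(HEX), 2)); // [I version, L con Lcontain;]
    }
}
```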

The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 06: Length of the class name.
  • 70 61 72 65 6E 74: parent, the name of the class.
  • 0E DB D2 BD 85 EE 63 7A: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag notes that the object supports serialization.
  • 00 01: Number of fields in this class.

Now the algorithm will write the field description for the parent class. parent has one field, int parentVersion = 10;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0D: Length of the field name.
  • 70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA, the end of block data for this object.
  • 0x70: TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.

So far, the serialization algorithm has written the description of the class associated with the instance and all its superclasses. Next, it will write the actual data associated with the instance. It writes the parent class members first:

  • 00 00 00 0A: 10, the value of parentVersion.

Then it moves on to SerialTest.

  • 00 00 00 42: 66, the value of version.
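Each int is stored as four bytes in big-endian order, which is also the default byte order of java.nio.ByteBuffer. A small sketch decoding the eight data bytes quoted above (InstanceData and readInts() are hypothetical names):

```java
import java.nio.ByteBuffer;

class InstanceData {
    // Decodes consecutive 4-byte big-endian integers.
    static int[] readInts(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        int[] values = new int[data.length / 4];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }

    public static void main(String[] args) {
        // 00 00 00 0A 00 00 00 42, as found in Listing 7
        byte[] data = {0x00, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00, 0x42};
        int[] values = readInts(data);
        System.out.println("parentVersion=" + values[0]); // parentVersion=10
        System.out.println("version=" + values[1]);       // version=66
    }
}
```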

The next few bytes are interesting. The algorithm needs to write the information about the contain object, shown in Listing 8.

Listing 8. The contain object

contain con = new contain();

 

 

Remember, the serialization algorithm hasn't written the class description for the contain class yet. This is the opportunity to write this description.

  • 0x73: TC_OBJECT, designating a new object.
  • 0x72: TC_CLASSDESC.
  • 00 07: Length of the class name.
  • 63 6F 6E 74 61 69 6E: contain, the name of the class.
  • FC BB E6 0E FB CB 60 C7: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag indicates that this class supports serialization.
  • 00 01: Number of fields in this class.

Next, the algorithm must write the description for contain's only field, int containVersion = 11;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0E: Length of the field name.
  • 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA.

Next, the serialization algorithm checks to see if contain has any parent classes. If it did, the algorithm would start writing that class; but in this case there is no superclass for contain, so the algorithm writes TC_NULL.

  • 0x70: TC_NULL.

Finally, the algorithm writes the actual data associated with contain.

  • 00 00 00 0B: 11, the value of containVersion.
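To see that all three values travel through the stream, the whole object graph can be written and read back in memory. In this sketch the fields are changed before serialization, so the values printed at the end can only have come from the stream and not from the field initializers. GraphRoundTrip and copyOf() are hypothetical names, and the classes are capitalized, nested stand-ins for the article's parent, contain, and SerialTest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class GraphRoundTrip {
    // Stand-ins for the article's parent, contain and SerialTest classes.
    static class Parent implements Serializable { int parentVersion = 10; }
    static class Contain implements Serializable { int containVersion = 11; }
    static class SerialTest extends Parent {
        int version = 66;
        Contain con = new Contain();
    }

    // Serializes the object graph to a byte array and rebuilds it from those bytes.
    static SerialTest copyOf(SerialTest original) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (SerialTest) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerialTest st = new SerialTest();
        st.parentVersion = 20;
        st.version = 77;
        st.con.containVersion = 22;
        SerialTest copy = copyOf(st);
        // Prints: 20 77 22
        System.out.println(copy.parentVersion + " " + copy.version + " " + copy.con.containVersion);
    }
}
```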

Conclusion

In this tip, you have seen how to serialize an object, and learned how the serialization algorithm works in detail. I hope this article gives you more detail on what happens when you actually serialize an object.

About the author

Sathiskumar Palaniappan has more than four years of experience in the IT industry, and has been working with Java-related technologies for more than three years. Currently, he is working as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecom industry.

 

SOURCE URL : http://www.javaworld.com/community/node/2915
