Java Objects Memory Structure

 

Original post: http://www.codeinstructions.com/2008/12/java-objects-memory-structure.html
Update (December 18th, 2008): I've posted here an experimental library that implements Sizeof for Java.

One thing about Java that has always bothered me, given my C/C++ roots, is the lack of a way to figure out how much memory is used by an object. C++ features the sizeof operator, which lets you query the size of primitive types and also the size of objects of a given class. This operator in C and C++ is useful for pointer arithmetic, copying memory around, and IO, for example.

Java doesn't have a corresponding operator. In reality, Java doesn't need one. The size of primitive types in Java is defined in the language specification, whereas in C and C++ it depends on the platform. Java has its own IO infrastructure built around serialization. And neither pointer arithmetic nor bulk memory copy applies, because Java doesn't have pointers.

But every Java developer has at some point wondered how much memory is used by a Java object. The answer, it turns out, is not so simple.

The first distinction to be made is between shallow size and deep size. The shallow size of an object is the space occupied by the object alone, not taking into account the size of other objects that it references. The deep size, on the other hand, is the shallow size of the object plus the deep size of each object it references, recursively. Most of the time you will be interested in knowing the deep size of an object, but, in order to know that, you need to know how to calculate the shallow size first, which is what I'm going to talk about here.

One complication is that the runtime in-memory structure of Java objects is not enforced by the virtual machine specification, which means that virtual machine providers can implement it as they please. The consequence is that you can write a class, and instances of that class in one VM can occupy a different amount of memory than instances of that same class when run in another VM. Most of the world, including myself, uses the Sun HotSpot virtual machine though, which simplifies things a lot. The remainder of the discussion will focus on the 32 bit Sun JVM. I will lay down a few rules that will help explain how the JVM organizes the objects' layout in memory.

Memory layout of classes that have no instance attributes

In the Sun JVM, every object (except arrays) has a 2 words header. The first word contains the object's identity hash code plus some flags like lock state and age, and the second word contains a reference to the object's class. Also, any object is aligned to an 8 bytes granularity. This is the first rule of object memory layout:

Rule 1: every object is aligned to an 8 bytes granularity.

Now we know that if we call new Object(), we will be using 8 bytes of the heap for the two header words and nothing else, since the Object class doesn't have any fields.
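
Rule 1 boils down to rounding a size up to the next multiple of 8. Here is a minimal sketch of that arithmetic, assuming the 8 bytes header of the 32 bit Sun JVM described above. The class and method names (ObjectAlignment, alignUp) are mine, for illustration only, and are not part of any JVM API.

```java
// Illustrative sketch: sizes assume the 32 bit Sun JVM layout described in this post.
class ObjectAlignment {
    static final int HEADER_BYTES = 8;     // two 4-byte header words
    static final int OBJECT_ALIGNMENT = 8; // Rule 1

    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    public static void main(String[] args) {
        // new Object(): header only, already a multiple of 8
        System.out.println(alignUp(HEADER_BYTES, OBJECT_ALIGNMENT));     // prints 8
        // header plus a single 1-byte field: padded up to 16
        System.out.println(alignUp(HEADER_BYTES + 1, OBJECT_ALIGNMENT)); // prints 16
    }
}
```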

Memory layout of classes that extend Object

After the 8 bytes of header, the class attributes follow. Attributes are always aligned in memory to their size. For instance, ints are aligned to a 4 byte granularity, and longs are aligned to an 8 byte granularity. There is a performance reason to do it this way: it is usually much cheaper to read a 4 bytes word from memory into a 4 bytes register of the processor if the word is aligned to a 4 bytes granularity.

In order to save some memory, the Sun VM doesn't lay out object's attributes in the same order they are declared. Instead, the attributes are organized in memory in the following order:
  1. doubles and longs
  2. ints and floats
  3. shorts and chars
  4. booleans and bytes
  5. references

This scheme allows for a good optimization of memory usage. For example, imagine you declared the following class:
class MyClass {
    byte a;
    int c;
    boolean d;
    long e;
    Object f;        
}

If the JVM didn't reorder the attributes, the object memory layout would be like this:
[HEADER:  8 bytes]  8
[a:       1 byte ]  9
[padding: 3 bytes] 12
[c:       4 bytes] 16
[d:       1 byte ] 17
[padding: 7 bytes] 24
[e:       8 bytes] 32
[f:       4 bytes] 36
[padding: 4 bytes] 40

Notice that 14 bytes would have been wasted with padding and the object would use 40 bytes of memory. By reordering the attributes using the order above, the in-memory structure of the object becomes:
[HEADER:  8 bytes]  8
[e:       8 bytes] 16
[c:       4 bytes] 20
[a:       1 byte ] 21
[d:       1 byte ] 22
[padding: 2 bytes] 24
[f:       4 bytes] 28
[padding: 4 bytes] 32

This time, only 6 bytes are used for padding and the object uses only 32 bytes of memory.

So here is rule 2 of object memory layout:

Rule 2: class attributes are ordered like this: first longs and doubles; then ints and floats; then chars and shorts; then bytes and booleans, and last the references. The attributes are aligned to their own granularity.

Now we know how to calculate the memory used by any instance of a class that extends Object directly. One practical example is the java.lang.Boolean class. Here is its memory layout:
[HEADER:  8 bytes]  8 
[value:   1 byte ]  9
[padding: 7 bytes] 16

An instance of the Boolean class takes 16 bytes of memory! Surprised? (Notice the padding at the end to align the object size to an 8 bytes granularity.)
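
The arithmetic behind Rules 1 and 2 can be written down as a short calculation. The sketch below is only a model of the rules as stated in this post for a class that extends Object directly. It does not query the JVM, and the names (ShallowSizeSketch, shallowSize) are made up for the example. It takes the byte sizes of the primitive fields plus the number of reference fields, and assumes an 8 bytes header and 4 bytes references.

```java
import java.util.Arrays;

// Illustrative model of Rules 1 and 2 for a class that extends Object directly.
class ShallowSizeSketch {
    static final int HEADER_BYTES = 8;    // assumed: 32 bit Sun JVM
    static final int REFERENCE_BYTES = 4; // assumed: 32 bit references

    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // primitiveSizes: byte size of each primitive field, in declaration order.
    static int shallowSize(int[] primitiveSizes, int referenceCount) {
        int[] sorted = primitiveSizes.clone();
        Arrays.sort(sorted); // ascending; walked backwards so the largest fields come first
        int offset = HEADER_BYTES;
        for (int i = sorted.length - 1; i >= 0; i--) {
            // Rule 2: each field is aligned to its own size
            offset = alignUp(offset, sorted[i]) + sorted[i];
        }
        if (referenceCount > 0) {
            // references come last, aligned to 4 bytes
            offset = alignUp(offset, REFERENCE_BYTES) + referenceCount * REFERENCE_BYTES;
        }
        return alignUp(offset, 8); // Rule 1
    }

    public static void main(String[] args) {
        // MyClass: byte a, int c, boolean d, long e, Object f
        System.out.println(shallowSize(new int[] {1, 4, 1, 8}, 1)); // prints 32
        // java.lang.Boolean: a single boolean field
        System.out.println(shallowSize(new int[] {1}, 0));          // prints 16
    }
}
```

The two calls reproduce the MyClass and Boolean layouts shown above.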

Memory layout of subclasses of other classes

The next three rules are followed by the JVM to organize the fields of classes that have superclasses. Rule 3 of object memory layout is the following:

Rule 3: Fields that belong to different classes of the hierarchy are NEVER mixed up together. Fields of the superclass come first, obeying rule 2, followed by the fields of the subclass.

Here is an example:
class A {
   long a;
   int b;
   int c;
}

class B extends A {
   long d;
}

An instance of B looks like this in memory:
[HEADER:  8 bytes]  8
[a:       8 bytes] 16
[b:       4 bytes] 20
[c:       4 bytes] 24
[d:       8 bytes] 32

The next rule is used when the fields of the superclass don't end on a 4 bytes boundary. Here is what it says:

Rule 4: Between the last field of the superclass and the first field of the subclass there must be padding to align to a 4 bytes boundary.

Here is an example:
class A {
   byte a;
}

class B extends A {
   byte b;
}

An instance of B looks like this in memory:
[HEADER:  8 bytes]  8
[a:       1 byte ]  9
[padding: 3 bytes] 12
[b:       1 byte ] 13
[padding: 3 bytes] 16

Notice the 3 bytes padding after field a to align b to a 4 bytes granularity. That space is lost and cannot be used by fields of class B.

The final rule is applied to save some space when the first field of the subclass is a long or double and the parent class doesn't end in an 8 bytes boundary.

Rule 5: When the first field of a subclass is a double or long and the superclass doesn't align to an 8 bytes boundary, the JVM will break rule 2 and try to put an int, then shorts, then bytes, and then references at the beginning of the space reserved to the subclass until it fills the gap.

Here is an example:
class A {
  byte a;
}

class B extends A {
  long b;
  short c;  
  byte d;
}

Here is the memory layout:
[HEADER:  8 bytes]  8
[a:       1 byte ]  9
[padding: 3 bytes] 12
[c:       2 bytes] 14
[d:       1 byte ] 15
[padding: 1 byte ] 16
[b:       8 bytes] 24

At byte 12, which is where class A 'ends', the JVM broke rule 2 and stuck a short and a byte before a long, to save 3 out of 4 bytes that would otherwise have been wasted.
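
To check the three examples above by hand, it helps to turn Rules 3 to 5 into a small calculation. The sketch below is my own model of the rules as worded in this post. It is not HotSpot's real field allocator, and the names (HierarchyLayoutSketch, appendClass, instanceSize) are invented for illustration. Each class in the hierarchy is described by the byte sizes of its primitive fields and its number of references, listed from the topmost superclass down.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative model of Rules 3 to 5; assumes an 8 bytes header and 4 bytes references.
class HierarchyLayoutSketch {
    static final int HEADER_BYTES = 8;
    static final int REFERENCE_BYTES = 4;

    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // Lays out one class's fields after 'start'; returns the offset after its last field.
    static int appendClass(int start, int[] primitiveSizes, int referenceCount) {
        List<Integer> fields = new ArrayList<>();
        for (int size : primitiveSizes) fields.add(size);
        fields.sort(Collections.reverseOrder()); // Rule 2 order: 8, 4, 2, 1

        int offset = alignUp(start, 4); // Rule 4: each class starts on a 4 bytes boundary
        int refs = referenceCount;

        // Rule 5: a leading long/double would leave a 4 bytes gap, so fill it first.
        if (!fields.isEmpty() && fields.get(0) == 8 && offset % 8 != 0) {
            int gapEnd = offset + 4;
            for (int size = 4; size >= 1; size /= 2) { // an int, then shorts, then bytes
                for (int i = 0; i < fields.size(); ) {
                    int at = alignUp(offset, size);
                    if (fields.get(i) == size && at + size <= gapEnd) {
                        offset = at + size;
                        fields.remove(i); // removes by index; i now points at the next field
                    } else {
                        i++;
                    }
                }
            }
            if (refs > 0 && alignUp(offset, REFERENCE_BYTES) + REFERENCE_BYTES <= gapEnd) {
                refs--; // the gap was untouched, so one reference fits
            }
            offset = gapEnd; // whatever is left of the gap is padding
        }

        for (int size : fields) offset = alignUp(offset, size) + size;
        if (refs > 0) offset = alignUp(offset, REFERENCE_BYTES) + refs * REFERENCE_BYTES;
        return offset;
    }

    // Rule 3: classes are laid out one after another, superclass first.
    static int instanceSize(int[][] primitivesPerClass, int[] refsPerClass) {
        int offset = HEADER_BYTES;
        for (int i = 0; i < primitivesPerClass.length; i++) {
            offset = appendClass(offset, primitivesPerClass[i], refsPerClass[i]);
        }
        return alignUp(offset, 8); // Rule 1
    }

    public static void main(String[] args) {
        int[] noRefs = {0, 0};
        System.out.println(instanceSize(new int[][] {{8, 4, 4}, {8}}, noRefs)); // Rule 3 example: 32
        System.out.println(instanceSize(new int[][] {{1}, {1}}, noRefs));       // Rule 4 example: 16
        System.out.println(instanceSize(new int[][] {{1}, {8, 2, 1}}, noRefs)); // Rule 5 example: 24
    }
}
```

The three calls in main correspond to the Rule 3, Rule 4 and Rule 5 examples, in that order.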

Memory layout of arrays

Arrays have an extra header field that contains the value of the 'length' variable. The array elements follow, and arrays, like any regular object, are also aligned to an 8 bytes boundary.

Here is the layout of a byte array with 3 elements:
[HEADER:  12 bytes] 12
[[0]:      1 byte ] 13
[[1]:      1 byte ] 14
[[2]:      1 byte ] 15
[padding:  1 byte ] 16

And here is the layout of a long array with 3 elements:
[HEADER:  12 bytes] 12
[padding:  4 bytes] 16
[[0]:      8 bytes] 24
[[1]:      8 bytes] 32
[[2]:      8 bytes] 40
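
Both array layouts follow from the same arithmetic. The sketch below assumes the 12 bytes array header shown above and, judging from the 4 bytes of padding in the long array, that the first element is aligned to the element size. The names (ArraySizeSketch, arraySize) are hypothetical, chosen for the example.

```java
// Illustrative model of array sizes on the 32 bit Sun JVM as described in this post.
class ArraySizeSketch {
    static final int ARRAY_HEADER_BYTES = 12; // 8 bytes object header + 4 bytes length

    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // elementBytes: size of one element (1 for byte, 8 for long, and so on).
    static int arraySize(int elementBytes, int length) {
        int firstElement = alignUp(ARRAY_HEADER_BYTES, elementBytes); // assumed element alignment
        return alignUp(firstElement + elementBytes * length, 8);      // Rule 1
    }

    public static void main(String[] args) {
        System.out.println(arraySize(1, 3)); // byte[3]: prints 16
        System.out.println(arraySize(8, 3)); // long[3]: prints 40
    }
}
```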

Memory layout of inner classes

Non-static inner classes have an extra 'hidden' field that holds a reference to the enclosing instance of the outer class. This field is a regular reference and follows the in-memory layout rules for references. For this reason, instances of inner classes cost an extra 4 bytes.
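
The hidden field can be observed with reflection, because the compiler marks it as synthetic. The following is a small sketch with made-up class names (OuterDemo, Inner, Nested). Inner deliberately reads a field of the outer instance, since as far as I know some newer compilers may omit the hidden reference when the inner class never uses it.

```java
import java.lang.reflect.Field;

// Illustrative only: counts compiler-generated fields that point to the outer class.
class OuterDemo {
    int counter;

    // Non-static inner class: carries a hidden reference to its OuterDemo instance.
    class Inner {
        int read() { return counter; } // uses the enclosing instance
    }

    // Static nested class: no enclosing instance, so no hidden reference.
    static class Nested { }

    static int hiddenOuterReferences(Class<?> type) {
        int count = 0;
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() && field.getType() == OuterDemo.class) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(hiddenOuterReferences(Inner.class));
        System.out.println(hiddenOuterReferences(Nested.class));
    }
}
```

Declaring a nested class static when it doesn't need the enclosing instance avoids the extra reference altogether.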

Final thoughts

We have learned how to calculate the shallow size of any Java object in the 32 bit Sun JVM. Knowing how memory is structured can help you understand how much memory is used by instances of your classes.

In the next post I will show code that puts it all together and uses reflection to calculate the deep size of an object.

18 COMMENTS:

Anonymous said...

cool. thank you very much.

yukuku said...

Interesting! 
Reveals the mystery.
How about object locking? Where is the lock stored?

Domingos Neto said...

yukuku: the lock information is stored in 3 bits of the first work of the header. It shares that work with the identity hash code and the age of the object, which is used for memory management purposes.

Hope it helps!

Domingos Neto said...

I meant "it shares that *word*" above :)

Eric Yung said...

Great Post.

Anonymous said...

Thanks for the post. Very informative.

Anonymous said...

can you suggest books or links to documentation, where you found this information. would like to read more about this, but most java books only talk about the language itself, and not specifics like this.

JAlexoid said...

You have a problem with the mem layout in long array, it's not 1 byte per element it's 8 bytes per element.

Domingos Neto said...

JAlexoid: thanks for the heads up, I fixed the typo :)

Domingos Neto said...

Anonymous: I found most of this information by studying the OpenJDK source code and reading the documentation provided at http://openjdk.java.net/

Anonymous said...

Hello Domingos
Excellent article. I was searching these info for very long time. Thanks for sharing.
Another question:
What changes signifies that a particular thread has acquired the lock on an object and how does that relate to 3 bits of header word.

Does the acquiring thread changes the bit order of those 3 bits ? Please let me know, I would like to understand.

Thanks
//Kannan

Anonymous said...

Hi,

I have one question after reading this article. In following example :
[HEADER: 8 bytes] 8
[a: 1 byte ] 9
[padding: 3 bytes] 12
[c: 4 bytes] 16
[d: 1 byte ] 17
[padding: 7 bytes] 24
[e: 8 bytes] 32
[f: 4 bytes] 36
[padding: 4 bytes] 40

Why padding after "a" is 3 bytes, not 7 bytes? Why padding after "d" is 7 bytes, not 3 bytes?

Thanks

Domingos Neto said...

Hi anonymous, this is because a field should be aligned to a granularity that is equivalent to its size. This means that field c has to be aligned to a 4 bytes boundary, hence the 3 bytes padding preceding it. Field e, because it has 8 bytes, has to be aligned to an 8 bytes boundary and therefore has a 7 bytes padding before it.

Fuad said...

Great post Domingos. I was wondering, what's your source for this information? Is there any particular Sun JVM architecture reference or something?

Cheers,
/fuad

laxmi said...

hi,
I was searching these info for very long time. Thanks for sharing.

thanks again once.

Anonymous said...

I am confused. Who is the original author of this article. Look at the link below:

http://razanpokhrel.blogspot.com/2010/04/java-developer-on-their-programming.html#links

Rakesh said...

thanks a ton for the info !!! :)

Anonymous said...

Check out the following blog for simple explanation of JVM memory structure. It will not take more than 15 min. to assimilate the whole content.
Follow :

JVM memory allocation

 
