The format of what HotSpot prints under PrintHeapAtGC

RednaxelaFX


When HotSpot is started with -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintHeapAtGC to run a Java program, the GC log contains entries like this:

{Heap before GC invocations=0 (full 0):
 par new generation   total 14784K, used 13175K [0x03ad0000, 0x04ad0000, 0x04ad0000)
  eden space 13184K,  99% used [0x03ad0000, 0x047adcf8, 0x047b0000)
  from space 1600K,   0% used [0x047b0000, 0x047b0000, 0x04940000)
  to   space 1600K,   0% used [0x04940000, 0x04940000, 0x04ad0000)
 concurrent mark-sweep generation total 49152K, used 0K [0x04ad0000, 0x07ad0000, 0x09ad0000)
 concurrent-mark-sweep perm gen total 16384K, used 2068K [0x09ad0000, 0x0aad0000, 0x0dad0000)

Lots of names, lots of numbers. What does each of them mean? Let's look for the answer in the source.

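When PrintHeapAtGC is set, HotSpot writes a summary of the GC heap to the log both before and after each collection.
Searching the HotSpot source for "PrintHeapAtGC" turns up many hits. Those of the form "if (PrintHeapAtGC)" are the places where the flag takes effect. Take genCollectedHeap as an example: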
hotspot/src/share/vm/memory/genCollectedHeap.cpp

void GenCollectedHeap::do_collection(bool  full,
                                     bool   clear_all_soft_refs,
                                     size_t size,
                                     bool   is_tlab,
                                     int    max_level) {
  bool prepared_for_verification = false;
  ResourceMark rm;
  DEBUG_ONLY(Thread* my_thread = Thread::current();)

  assert(SafepointSynchronize::is_at_safepoint(), "should be at safepoint");
  assert(my_thread->is_VM_thread() ||
         my_thread->is_ConcurrentGC_thread(),
         "incorrect thread type capability");
  assert(Heap_lock->is_locked(), "the requesting thread should have the Heap_lock");
  guarantee(!is_gc_active(), "collection is not reentrant");
  assert(max_level < n_gens(), "sanity check");

  // ...

  if (PrintHeapAtGC) {
    Universe::print_heap_before_gc();
    if (Verbose) {
      gclog_or_tty->print_cr("GC Cause: %s", GCCause::to_string(gc_cause()));
    }
  }
  
  // perform GC...

  if (PrintHeapAtGC) {
    Universe::print_heap_after_gc();
  }
  
  // ...
}

// ...

void GenCollectedHeap::print_on(outputStream* st) const {
  for (int i = 0; i < _n_gens; i++) {
    _gens[i]->print_on(st);
  }
  perm_gen()->print_on(st);
}

Similarly, the methods that carry out collection for the other kinds of GC heap also call Universe::print_heap_before_gc() and Universe::print_heap_after_gc() before and after collecting:

PSMarkSweep::invoke_no_policy()
PSScavenge::invoke_no_policy()
PSParallelCompact::pre_compact()
G1CollectedHeap::do_collection()
G1CollectedHeap::do_collection_pause_at_safepoint()

So how are these two heap-printing functions implemented?
hotspot/src/share/vm/memory/universe.hpp

  static void print_heap_before_gc() { print_heap_before_gc(gclog_or_tty); }
  static void print_heap_after_gc()  { print_heap_after_gc(gclog_or_tty); }

hotspot/src/share/vm/memory/universe.cpp

void Universe::print_heap_before_gc(outputStream* st) {
  st->print_cr("{Heap before GC invocations=%u (full %u):",
               heap()->total_collections(),
               heap()->total_full_collections());
  heap()->print_on(st);
}

void Universe::print_heap_after_gc(outputStream* st) {
  st->print_cr("Heap after GC invocations=%u (full %u):",
               heap()->total_collections(),
               heap()->total_full_collections());
  heap()->print_on(st);
  st->print_cr("}");
}

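OK, that shows the overall skeleton. The number after invocations= is the total number of collections so far, and the number after full is how many of those were full GCs. From there it is up to each GC heap implementation to print its own details.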
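To make the header format concrete, here is a minimal sketch that reads the two counters back out of a header line. The names GcHeader and parse_gc_header are made up for this illustration and are not part of HotSpot; the format strings simply mirror the print_cr() calls shown above.

```cpp
#include <cassert>
#include <cstdio>

// Collection counters taken from a PrintHeapAtGC header line.
struct GcHeader {
  bool     before;  // true for "{Heap before GC", false for "Heap after GC"
  unsigned total;   // the number after "invocations="
  unsigned full;    // the number after "full"
};

// Hypothetical helper (not HotSpot code): returns true and fills *out when
// `line` is a "Heap before GC" or "Heap after GC" header, false otherwise.
inline bool parse_gc_header(const char* line, GcHeader* out) {
  assert(line != nullptr && out != nullptr);
  unsigned total = 0;
  unsigned full = 0;
  // The "before" header starts with '{', the "after" header does not.
  if (std::sscanf(line, "{Heap before GC invocations=%u (full %u):",
                  &total, &full) == 2) {
    *out = GcHeader{true, total, full};
    return true;
  }
  if (std::sscanf(line, "Heap after GC invocations=%u (full %u):",
                  &total, &full) == 2) {
    *out = GcHeader{false, total, full};
    return true;
  }
  return false;
}
```

Feeding it the first line of the log at the top of this post should yield before = true with both counters at 0, while the generation lines that follow are rejected.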

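Note that the JVM in this example was started with -XX:+UseParNewGC -XX:+UseConcMarkSweepGC. These two flags select the parallel new collector for the young generation and the concurrent-mark-sweep collector for the old generation. The heap used by this combination is the GenCollectedHeap mentioned earlier, so in this example the call heap()->print_on(st) ends up in GenCollectedHeap::print_on(), whose code is shown above. Each generation in it is organized as an object of the Generation class: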
hotspot/src/share/vm/memory/generation.hpp

// A Generation models a heap area for similarly-aged objects.
// It will contain one ore more spaces holding the actual objects.
//
// The Generation class hierarchy:
//
// Generation                      - abstract base class
// - DefNewGeneration              - allocation area (copy collected)
//   - ParNewGeneration            - a DefNewGeneration that is collected by
//                                   several threads
// - CardGeneration                 - abstract class adding offset array behavior
//   - OneContigSpaceCardGeneration - abstract class holding a single
//                                    contiguous space with card marking
//     - TenuredGeneration         - tenured (old object) space (markSweepCompact)
//     - CompactingPermGenGen      - reflective object area (klasses, methods, symbols, ...)
//   - ConcurrentMarkSweepGeneration - Mostly Concurrent Mark Sweep Generation
//                                       (Detlefs-Printezis refinement of
//                                       Boehm-Demers-Schenker)
//
// The system configurations currently allowed are:
//
//   DefNewGeneration + TenuredGeneration + PermGeneration
//   DefNewGeneration + ConcurrentMarkSweepGeneration + ConcurrentMarkSweepPermGen
//
//   ParNewGeneration + TenuredGeneration + PermGeneration
//   ParNewGeneration + ConcurrentMarkSweepGeneration + ConcurrentMarkSweepPermGen
//

// ...

class Generation: public CHeapObj {
  // ...

  // Memory area reserved for generation
  VirtualSpace _virtual_space;

  // ...
  
 public:
  // The set of possible generation kinds.
  enum Name {
    ASParNew,
    ASConcurrentMarkSweep,
    DefNew,
    ParNew,
    MarkSweepCompact,
    ConcurrentMarkSweep,
    Other
  };
  
  // ...
  
  // Space enquiries (results in bytes)
  virtual size_t capacity() const = 0;  // The maximum number of object bytes the
                                        // generation can currently hold.
  virtual size_t used() const = 0;      // The number of used bytes in the gen.
  virtual size_t free() const = 0;      // The number of free bytes in the gen.

  // ...
};

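(An aside: note which combinations the comment above lists as allowed within this generational GC heap framework.
ParallelGC and G1GC are not among them. Does that mean they are not generational? They are: both are implementations of generational GC (G1 is logically generational), but they do not use HotSpot's original framework and instead go through a separately defined interface. An OpenJDK document describes this: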

Quote:

Collector Styles

There are two styles in which we've built collectors. At first we had a framework into which we could plug generations, each of which would have its own collector. The framework is general enough for us to have built several collectors on it, and does support a limited amount of mix-and-matching of “framework” generations. The framework has some inefficiencies due to the generality at allows. We've worked around some of the inefficiencies. When we built the high-throughput collector we decided not to use the framework, but instead designed an interface that a collector would support, with the high-throughput collector as an instance of that interface. That means that the “interface” collectors can't be mixed and matched, which implies some duplication of code. But it has the advantage that one can work on an “interface” collector without worrying about breaking any of the other collectors.

)
All right, back to the topic. Here is how Generation::print_on() is implemented:
hotspot/src/share/vm/memory/generation.cpp

void Generation::print_on(outputStream* st)  const {
  st->print(" %-20s", name());
  st->print(" total " SIZE_FORMAT "K, used " SIZE_FORMAT "K",
             capacity()/K, used()/K);
  st->print_cr(" [" INTPTR_FORMAT ", " INTPTR_FORMAT ", " INTPTR_FORMAT ")",
              _virtual_space.low_boundary(),
              _virtual_space.high(),
              _virtual_space.high_boundary());
}

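So each line prints:

generation name total <total capacity> used <space in use> [number 1, number 2, number 3)

The total capacity and the space in use are both given in KB.
Hmm, so what are "number 1", "number 2" and "number 3"? They are properties of _virtual_space, which is declared as: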

// VirtualSpace is data structure for committing a previously reserved address range in smaller chunks.

class VirtualSpace VALUE_OBJ_CLASS_SPEC {
  // ...
 private:
  // Reserved area
  char* _low_boundary;
  char* _high_boundary;

  // Committed area
  char* _low;
  char* _high;

  // ...

 public:
  // Committed area
  char* low()  const { return _low; }
  char* high() const { return _high; }

  // Reserved area
  char* low_boundary()  const { return _low_boundary; }
  char* high_boundary() const { return _high_boundary; }
  
  // ...
};

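The "reserved area" is the address range that has been requested from the operating system, not all of which has to be committed yet; the "committed area" is the part that has been requested and also actually committed. "Reserved" and "committed" are concepts that come up when memory is obtained from the operating system in stages. On Windows, see the MSDN documentation for VirtualAlloc().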
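As a rough illustration of how the two areas relate, the sketch below models the boundaries with plain integers and derives the reserved, committed and not-yet-committed sizes. SpaceBounds and make_bounds are invented names, not HotSpot types, and the sketch assumes the committed part starts at the low boundary, which is how the log at the top of this post reads.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the boundaries VirtualSpace keeps. Plain integers
// replace the char* members so the arithmetic is explicit. Assumption: the
// committed part begins at low_boundary.
struct SpaceBounds {
  std::uint64_t low_boundary;   // start of the reserved range
  std::uint64_t high;           // end of the committed part
  std::uint64_t high_boundary;  // end of the reserved range

  std::uint64_t reserved_bytes() const    { return high_boundary - low_boundary; }
  std::uint64_t committed_bytes() const   { return high - low_boundary; }
  std::uint64_t uncommitted_bytes() const { return high_boundary - high; }
};

// Hypothetical helper that rejects out-of-order boundaries.
inline SpaceBounds make_bounds(std::uint64_t low_boundary,
                               std::uint64_t high,
                               std::uint64_t high_boundary) {
  assert(low_boundary <= high && high <= high_boundary);
  return SpaceBounds{low_boundary, high, high_boundary};
}
```

With the old generation's addresses from the log, make_bounds(0x04ad0000, 0x07ad0000, 0x09ad0000) gives 80MB reserved, of which 49152K is committed and 32MB is still only reserved.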
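From this, the format of the later lines in the GC log at the top of this post is:

generation name total <total capacity> used <space in use> [low boundary of the reserved virtual space, high end of the committed virtual space, high boundary of the reserved virtual space)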

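The left square bracket and right parenthesis are standard interval notation for a half-open interval, closed on the left and open on the right.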
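Putting the format together, a generation line can be split back into its fields roughly as follows. This is only a sketch: GenerationLine and parse_generation_line are hypothetical names, and the parser relies on the literal " total " separator seen in the log rather than on anything HotSpot guarantees.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Fields of one generation line, in the order they are printed.
struct GenerationLine {
  std::string        name;
  unsigned long long total_kb;
  unsigned long long used_kb;
  unsigned long long low_boundary;   // reserved range, inclusive start
  unsigned long long high;           // end of the committed part
  unsigned long long high_boundary;  // reserved range, exclusive end
};

// Hypothetical helper (not HotSpot code). Returns false for lines that do not
// look like "<name> total <n>K, used <n>K [a, b, c)".
inline bool parse_generation_line(const std::string& line, GenerationLine* out) {
  assert(out != nullptr);
  const std::string::size_type pos = line.find(" total ");
  if (pos == std::string::npos) return false;

  GenerationLine g;
  // %llx accepts the 0x prefix that INTPTR_FORMAT produces.
  if (std::sscanf(line.c_str() + pos,
                  " total %lluK, used %lluK [%llx, %llx, %llx)",
                  &g.total_kb, &g.used_kb,
                  &g.low_boundary, &g.high, &g.high_boundary) != 5) {
    return false;
  }

  // The name is whatever precedes " total ", minus the padding spaces.
  const std::string::size_type first = line.find_first_not_of(' ');
  const std::string::size_type last = line.find_last_not_of(' ', pos);
  if (last != std::string::npos && first <= last) {
    g.name = line.substr(first, last - first + 1);
  }
  *out = g;
  return true;
}
```

The eden, from and to lines have no " total " field, so this helper deliberately rejects them.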
The eden, from and to lines follow much the same pattern, so I won't dig into them here...
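Although the post does not trace these lines through the source, the numbers in the log are consistent with reading the three addresses as the start of the space, the current allocation top, and the end of the space. The sketch below only checks that reading against the log; SpaceLine is an invented name, and the interpretation is an assumption inferred from the figures.

```cpp
#include <cassert>
#include <cstdint>

// Assumed meaning of the three addresses on an eden/from/to line:
// [start of the space, current allocation top, end of the space).
struct SpaceLine {
  std::uint64_t bottom;
  std::uint64_t top;
  std::uint64_t end;

  std::uint64_t capacity_kb() const { return (end - bottom) / 1024; }
  std::uint64_t used_kb() const     { return (top - bottom) / 1024; }

  // Integer percentage, truncated, which matches the "99%" in the log.
  std::uint64_t percent_used() const {
    assert(bottom <= top && top <= end);
    if (end == bottom) return 0;
    return (top - bottom) * 100 / (end - bottom);
  }
};
```

For the eden line, SpaceLine{0x03ad0000, 0x047adcf8, 0x047b0000} gives a capacity of 13184K, 13175K in use and 99% used, which agrees with both the eden line and the "used 13175K" on the par new generation line.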

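From this analysis it is clear that the heap in the opening GC log reserves 16MB : 80MB : 64MB for the young generation : old generation : permanent generation, a ratio of 1:5:4. Drawn as a diagram, they turn out to be allocated right next to each other in virtual memory (see the attached image).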
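The arithmetic behind those figures can be spelled out as a small sketch. ReservedRange, adjacent and gcd_u64 are names made up for this example; the addresses are the first and third numbers of each generation line in the log.

```cpp
#include <cassert>
#include <cstdint>

// Reserved address range of one generation: [low_boundary, high_boundary).
struct ReservedRange {
  std::uint64_t low_boundary;
  std::uint64_t high_boundary;

  std::uint64_t size_mb() const {
    assert(low_boundary <= high_boundary);
    return (high_boundary - low_boundary) / (1024 * 1024);
  }
};

// True when range b starts exactly where range a ends.
inline bool adjacent(const ReservedRange& a, const ReservedRange& b) {
  return a.high_boundary == b.low_boundary;
}

// Euclid's algorithm, used to reduce the three sizes to a ratio.
inline std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b) {
  while (b != 0) {
    const std::uint64_t r = a % b;
    a = b;
    b = r;
  }
  return a;
}
```

With young = {0x03ad0000, 0x04ad0000}, old = {0x04ad0000, 0x09ad0000} and perm = {0x09ad0000, 0x0dad0000}, the sizes come out as 16, 80 and 64 MB, dividing by their greatest common divisor 16 gives 1:5:4, and adjacent() holds for both neighbouring pairs.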

The data and code above are based on JDK 1.6.0 update 18, running on 32-bit Windows XP SP3.

[Attached image: the young, old and permanent generations laid out contiguously in virtual memory]


2010-02-23 22:56

#4 RednaxelaFX 2011-11-04

hittyt wrote:

{Heap before GC invocations=0 (full 0):
 par new generation   total 14784K, used 13175K [0x03ad0000, 0x04ad0000, 0x04ad0000)
  eden space 13184K,  99% used [0x03ad0000, 0x047adcf8, 0x047b0000)
  from space 1600K,   0% used [0x047b0000, 0x047b0000, 0x04940000)
  to   space 1600K,   0% used [0x04940000, 0x04940000, 0x04ad0000)
 concurrent mark-sweep generation total 49152K, used 0K [0x04ad0000, 0x07ad0000, 0x09ad0000)
 concurrent-mark-sweep perm gen total 16384K, used 2068K [0x09ad0000, 0x0aad0000, 0x0dad0000)

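Could you explain explicitly how you get from this log to the diagram at the end of the post?
Adding eden and the two survivor spaces gives (13184+1600+1600)/1024 = 16M, which I can follow. But then what does "par new generation total 14784K" in the log mean? It does not match 16M.
And how are the capacities of the old generation (80M) and the permanent generation (64M) in the diagram derived from the log? The 49152K and 16384K in the log do not match those numbers either.

Sure. The sizes of these generations are all determined from the addresses that follow.
The young generation spans [0x03ad0000, 0x04ad0000), which is 16M; likewise, the old generation spans [0x04ad0000, 0x09ad0000), which is 80M, and PermGen follows the same pattern.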

par new generation   total 14784K

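The figure after total is the "current capacity", which is, simply put, the amount of memory that is currently committed. The current capacity can float between the min capacity and the max capacity; the max is the overall size of the generation, i.e. the 16M, 80M and so on mentioned above.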
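One further observation about the quoted log, offered as a reading of the figures rather than something stated in this reply: for the young generation, 14784K equals eden plus one survivor space, and the remaining 1600K up to the 16M address range matches the size of the to space. The sketch below just checks that arithmetic; YoungSpacesKb and range_kb are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Sizes of the three young-generation spaces, in KB, as printed in the log.
struct YoungSpacesKb {
  std::uint64_t eden;
  std::uint64_t from;
  std::uint64_t to;

  std::uint64_t eden_plus_from() const { return eden + from; }
  std::uint64_t all_spaces() const     { return eden + from + to; }
};

// Size in KB of the address range [low, high).
inline std::uint64_t range_kb(std::uint64_t low, std::uint64_t high) {
  assert(low <= high);
  return (high - low) / 1024;
}
```

With YoungSpacesKb{13184, 1600, 1600}, eden_plus_from() is 14784, the value printed after total, while all_spaces() is 16384, the same as range_kb(0x03ad0000, 0x04ad0000).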

#3 hittyt 2011-11-04

{Heap before GC invocations=0 (full 0):
 par new generation   total 14784K, used 13175K [0x03ad0000, 0x04ad0000, 0x04ad0000)
  eden space 13184K,  99% used [0x03ad0000, 0x047adcf8, 0x047b0000)
  from space 1600K,   0% used [0x047b0000, 0x047b0000, 0x04940000)
  to   space 1600K,   0% used [0x04940000, 0x04940000, 0x04ad0000)
 concurrent mark-sweep generation total 49152K, used 0K [0x04ad0000, 0x07ad0000, 0x09ad0000)
 concurrent-mark-sweep perm gen total 16384K, used 2068K [0x09ad0000, 0x0aad0000, 0x0dad0000)

#2 RednaxelaFX 2011-02-13

IcyFenix wrote:

撒迦, could you write an article on setting up a HotSpot tracing and debugging environment with Visual C++ on Windows?

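If you build OpenJDK 7 on Windows with Visual Studio 2010, it should basically just work, shouldn't it?
The development machine at my current company does not have VS2010 installed, and my own laptop only has Ubuntu... sorry, I don't have an environment for this at the moment.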

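#1 IcyFenix 2011-02-13

Applause.

The OpenJDK website has an article on setting up a HotSpot development environment in NetBeans on Linux.

撒迦, could you write an article on setting up a HotSpot tracing and debugging environment with Visual C++ on Windows?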
