Java 虚拟机(三)G1
G1
The Garbage-First (G1) collector is a server-style garbage collector, targeted for multi-processor machines with large memories. It meets garbage collection (GC) pause time goals with a high probability, while achieving high throughput.
为了代替cms,能够满足gc停顿时间的同时保持高可用、高吞吐。
When performing garbage collections, G1 operates in a manner similar to the CMS collector. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the mark phase completes, G1 knows which regions are mostly empty. It collects in these regions first, which usually yields a large amount of free space. This is why this method of garbage collection is called Garbage-First. As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage. G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on the specified pause time target.
G1是的名字是garbage first,使用了pause prediction model,把需要回收的region进行个优先级的排队,然后根据模型和用户设置的时间(默认200ms)进行部分收集。
Compact free space without lengthy GC induced pause times.
G1是能够在不延长GC时间前提下进行内存整理,与CMS比,非常厉害的一点,CMS如果想整理内存需要去刻意一次去做,无疑会延长GC时间。
The regions identified by G1 as ripe for reclamation are garbage collected using evacuation. G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. This evacuation is performed in parallel on multi-processors, to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation, working within the user defined pause times. This is beyond the capability of both the previous methods. CMS (Concurrent Mark Sweep ) garbage collector does not do compaction. ParallelOld garbage collection performs only whole-heap compaction, which results in considerable pause times.
G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. 这难道是region设计的最大原因?
由于有多个region,这样可以使用多个线程并发的整理,而且整理的时候是从多个region向一个region进行整理,这样做能做到并发互不影响,一边整理,一边释放内存。
It is important to note that G1 is not a real-time collector. It meets the set pause time target with high probability but not absolute certainty. Based on data from previous collections, G1 does an estimate of how many regions can be collected within the user specified target time. Thus, the collector has a reasonably accurate model of the cost of collecting the regions, and it uses this model to determine which and how many regions to collect while staying within the pause time target.
通过模型预测来决定回收那些regions。
Note: G1 has both concurrent (runs along with application threads, e.g., refinement, marking, cleanup) and parallel (multi-threaded, e.g., stop the world) phases. Full garbage collections are still single threaded, but if tuned properly your applications should avoid full GCs.
应该avoid full GCs。
G1的主要特点总结下:
- 通过多个Region的分区与单region比,更能并发的处理
- region回收的过程能完成整理,不用刻意去整理内存
- 通过更频繁、回收部分的,回收效率最高的老年代region方式,避免一次回收老年代造成的长时间停顿
- 缺点么,可能有部分内存浪费,更消耗cpu
Young GC
In summary, the following can be said about the young generation in G1:
- The heap is a single memory space split into regions.
- Young generation memory is composed of a set of non-contiguous regions. This makes it easy to resize when needed.
- Young generation garbage collections, or young GCs, are stop the world events. All application threads are stopped for the operation.
- The young GC is done in parallel using multiple threads.
- Live objects are copied to new survivor or old generation regions.
heap是一个大整块的分成很多region,不是预先划分好的。而且由于是不连续的regions,还能动态调整大小?什么操作。其他好像没啥特殊,多线程,stw,复制算法。
Old Generation Collection with G1(mixed GC,Old GC都有young GC过程)
Initial Marking Phase
Initial marking of live object is piggybacked on a young generation garbage collection.
Concurrent Marking Phase
If empty regions are found (as denoted by the “X”), they are removed immediately in the Remark phase. Also, “accounting” information that determines liveness is calculated.
这里,并发标记的时候就会标记empty region。同时会计算一些表示region是否需要回收的信息。
Remark Phase
Empty regions are removed and reclaimed. Region liveness is now calculated for all regions.
并发标记未完成的,全部完成。
Copying/Cleanup Phase
G1 selects the regions with the lowest “liveness”, those regions which can be collected the fastest. Then those regions are collected at the same time as a young GC. This is denoted in the logs as [GC pause (mixed)]. So both young and old generations are collected at the same time.
与yong GC一起mixed回收,合并一次操作。类似迭代时的,一次回收一点老年代的空间。
老年代回收总结
In summary, there are a few key points we can make about the G1 garbage collection on the old generation.
Concurrent Marking Phase
- Liveness information is calculated concurrently while the application is running.
- This liveness information identifies which regions will be best to reclaim during an evacuation pause.
- There is no sweeping phase like in CMS.
Remark Phase - Uses the Snapshot-at-the-Beginning (SATB) algorithm which is much faster then what was used with CMS.
- Completely empty regions are reclaimed.
Copying/Cleanup Phase - Young generation and old generation are reclaimed at the same time.
- Old generation regions are selected based on their liveness.
日志分析
通过上面的分析,G1也是会对内存进行分代管理,这样能提高收集的效率,区分出来回收概率低的老年代。G1收集过程主要是对Young GC、Old Generation Collection的收集。
下面日志是一个线上服务的日志,没有设置堆的大小,只配置了收集时间50ms。
2020-07-09T16:56:30.404+0800: 355.997: [GC pause (G1 Evacuation Pause) (young), 0.0151629 secs]
[Parallel Time: 10.4 ms, GC Workers: 18]
[GC Worker Start (ms): Min: 355999.1, Avg: 355999.2, Max: 355999.3, Diff: 0.2]
[Ext Root Scanning (ms): Min: 2.7, Avg: 3.0, Max: 4.4, Diff: 1.6, Sum: 53.5]
[Update RS (ms): Min: 3.0, Avg: 4.3, Max: 4.8, Diff: 1.8, Sum: 77.8]
[Processed Buffers: Min: 36, Avg: 65.7, Max: 85, Diff: 49, Sum: 1183]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.9]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
[Object Copy (ms): Min: 2.4, Avg: 2.7, Max: 2.8, Diff: 0.4, Sum: 48.9]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Termination Attempts: Min: 1, Avg: 1.1, Max: 2, Diff: 1, Sum: 19]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.5]
[GC Worker Total (ms): Min: 10.0, Avg: 10.2, Max: 10.3, Diff: 0.3, Sum: 182.8]
[GC Worker End (ms): Min: 356009.3, Avg: 356009.3, Max: 356009.4, Diff: 0.1]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 4.5 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.5 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.3 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.3 ms]
[Eden: 904.0M(904.0M)->0.0B(908.0M) Survivors: 20.0M->16.0M Heap: 1706.7M(2016.0M)->801.6M(2016.0M)]
[Times: user=0.16 sys=0.00, real=0.02 secs]
这是一个young GC,这里Heap的大小为Heap: 1706.7M(2016.0M) 2016M,但只用了1706.7M。共花费real=0.02 secs这么长时间。
2020-07-09T16:57:03.018+0800: 388.612: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0289543 secs]
[Parallel Time: 23.7 ms, GC Workers: 18]
[GC Worker Start (ms): Min: 388614.1, Avg: 388614.2, Max: 388614.3, Diff: 0.2]
[Ext Root Scanning (ms): Min: 5.5, Avg: 5.6, Max: 6.0, Diff: 0.4, Sum: 101.5]
[Update RS (ms): Min: 6.3, Avg: 6.4, Max: 6.7, Diff: 0.3, Sum: 115.9]
[Processed Buffers: Min: 38, Avg: 62.7, Max: 85, Diff: 47, Sum: 1129]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.7]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 10.4, Avg: 10.7, Max: 10.8, Diff: 0.4, Sum: 192.9]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.3]
[Termination Attempts: Min: 1, Avg: 7.3, Max: 14, Diff: 13, Sum: 132]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.8]
[GC Worker Total (ms): Min: 22.8, Avg: 23.0, Max: 23.1, Diff: 0.3, Sum: 413.3]
[GC Worker End (ms): Min: 388637.1, Avg: 388637.2, Max: 388637.3, Diff: 0.2]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 4.9 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.6 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.3 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.3 ms]
[Eden: 760.0M(760.0M)->0.0B(760.0M) Survivors: 72.0M->72.0M Heap: 1642.3M(2016.0M)->885.4M(2016.0M)]
[Times: user=0.36 sys=0.00, real=0.03 secs]
(1) Initial Mark (Stop the World Event)
This is a stop the world event. With G1, it is piggybacked on a normal young GC. Mark survivor regions (root regions) which may have references to objects in old generation.
初始标记,为了充分利用停顿的时间,在一次young GC后,没有新生代,从survivor区开始标记。
2020-07-09T16:57:03.048+0800: 388.642: [GC concurrent-root-region-scan-start]
2020-07-09T16:57:03.067+0800: 388.661: [GC concurrent-root-region-scan-end, 0.0197043 secs]
(2) Root Region Scanning
Scan survivor regions for references into the old generation. This happens while the application continues to run. The phase must be completed before a young GC can occur.
与应用程序一起执行,找到survivor中引用老年代的引用。
2020-07-09T16:57:03.068+0800: 388.661: [GC concurrent-mark-start]
2020-07-09T16:57:03.233+0800: 388.827: [GC concurrent-mark-end, 0.1659108 secs]
(3) Concurrent Marking
Find live objects over the entire heap. This happens while the application is running. This phase can be interrupted by young generation garbage collections.
2020-07-09T16:57:03.240+0800: 388.834: [GC remark 2020-07-09T16:57:03.240+0800: 388.834: [Finalize Marking, 0.0014125 secs] 2020-07-09T16:57:03.242+0800: 388.835: [GC ref-proc, 0.0299033 secs] 2020-07-09T16:57:03.272+0800: 388.865: [Unloading, 0.0284857 secs], 0.0613861 secs]
###(4) Remark (Stop the World Event) Completes the marking of live object in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB) which is much faster than what was used in the CMS collector. Empty regions are removed and reclaimed. Region liveness is now calculated for all regions.
这里最终计算除了所有region的存活系数。
[Times: user=0.43 sys=0.01, real=0.07 secs]
2020-07-09T16:57:03.309+0800: 388.903: [GC cleanup 953M->881M(2016M), 0.0026504 secs]
[Times: user=0.02 sys=0.00, real=0.01 secs]
2020-07-09T16:57:03.312+0800: 388.906: [GC concurrent-cleanup-start]
2020-07-09T16:57:03.312+0800: 388.906: [GC concurrent-cleanup-end, 0.0000895 secs]
(5) Cleanup (Stop the World Event and Concurrent)
- Performs accounting on live objects and completely free regions. (Stop the world)
- Scrubs the Remembered Sets. (Stop the world)
- Reset the empty regions and return them to the free list. (Concurrent)
2020-07-09T16:57:05.418+0800: 391.012: [GC pause (G1 Evacuation Pause) (mixed), 0.0242442 secs]
[Parallel Time: 20.4 ms, GC Workers: 18]
use (G1 Evacuation Pause) (young), 0.0233693 secs]
[Parallel Time: 18.7 ms, GC Workers: 18]
[GC Worker Start (ms): Min: 390942.4, Avg: 390942.6, Max: 390942.7, Diff: 0.3]
[Ext Root Scanning (ms): Min: 3.4, Avg: 3.8, Max: 6.2, Diff: 2.8, Sum: 67.7]
[Update RS (ms): Min: 1.0, Avg: 3.3, Max: 3.9, Diff: 2.9, Sum: 59.7]
[Processed Buffers: Min: 26, Avg: 58.9, Max: 94, Diff: 68, Sum: 1061]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.7]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 9.7, Avg: 10.2, Max: 10.6, Diff: 0.9, Sum: 183.8]
[Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 5.9]
[Termination Attempts: Min: 1, Avg: 3.8, Max: 6, Diff: 5, Sum: 68]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.9]
[GC Worker Total (ms): Min: 17.5, Avg: 17.7, Max: 17.9, Diff: 0.3, Sum: 318.7]
[GC Worker End (ms): Min: 390960.3, Avg: 390960.3, Max: 390960.3, Diff: 0.1]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 4.4 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.6 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.2 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[GC Worker Start (ms): Min: 391012.6, Avg: 391012.7, Max: 391012.8, Diff: 0.2]
[Ext Root Scanning (ms): Min: 3.0, Avg: 3.3, Max: 5.2, Diff: 2.2, Sum: 59.4]
[Update RS (ms): Min: 0.8, Avg: 2.3, Max: 2.8, Diff: 2.1, Sum: 42.1]
[Processed Buffers: Min: 1, Avg: 12.4, Max: 34, Diff: 33, Sum: 224]
[Scan RS (ms): Min: 0.4, Avg: 0.7, Max: 0.8, Diff: 0.4, Sum: 12.2]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 13.5, Avg: 13.7, Max: 13.7, Diff: 0.2, Sum: 245.7]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[Termination Attempts: Min: 1, Avg: 3.3, Max: 6, Diff: 5, Sum: 60]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.7]
[GC Worker Total (ms): Min: 19.9, Avg: 20.0, Max: 20.1, Diff: 0.2, Sum: 360.3]
[GC Worker End (ms): Min: 391032.7, Avg: 391032.7, Max: 391032.8, Diff: 0.1]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 3.5 ms]
[Choose CSet: 0.1 ms]
[Ref Proc: 0.3 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.2 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.4 ms]
[Eden: 28.0M(28.0M)->0.0B(84.0M) Survivors: 72.0M->16.0M Heap: 839.9M(2016.0M)->663.2M(2016.0M)]
[Times: user=0.32 sys=0.01, real=0.03 secs]
(*) Copying (Stop the World Event)
These are the stop the world pauses to evacuate or copy live objects to new unused regions. This can be done with young generation regions which are logged as [GC pause (young)]. Or both young and old generation regions which are logged as [GC Pause (mixed)].
后面又连续出现了多次mix回收,可见在并发标记后,为了满足停顿时间,可能会有多次小的mix GC操作。
对于年轻代的收集的日志是比较容易看的,频率最高的:
- G1收集主要就是copy的过程。
- G1也不是完全保证收集时间的,年轻代收集也是的
- 年轻代收集主要过程是STW的
- 年轻代大小是会调整的,每次收集后可能都会调整
- 分配的堆大小,和实际使用的堆大小不一定是相等的,G1中可能堆分配或占用了20G(PS看是已经占用了的),但实际上使用了10个G
老年代的收集过程比较多,主要老年代收集是在一个叫混合收集的过程中进行收集的,这个过程的日志主要分为两部分,为什么不是向年轻代收集那样直接一起呢?
- 首先,老年代收集过程有5个步骤,其中混合收集只是其中的一个步骤。年轻代可以认为主要一个步骤就完成,虽然年轻代也很多步骤,但老年代这5个步骤职责、是否并发、是否STW都明显不同。
- 每个步骤是独立的,不是一个串行的过程,是否并行、是否STW都是不一样的
- 一些步骤(Concurrent Marking)还是能被年轻代收集给打断的,因为一些步骤不是STW的并且时间比较长,而年轻代是STW的并且频繁的,如果不能被打断,无法进行年轻代回收,可能无法满足用户收集时间。