, , , , , , , , ,

In my previous article on garbage collection , we have seen what happens when the JVM initiates garbage collection and how the memory is structured in heap. Now we look on the various algorithms used for garbage collection.

Java uses the below collectors with various choices in garbage collection
1. Serial Collector
2. Parallel Collector
3. Parallel Compacting Collector
4. Concurrent Mark-Sweep (CMS) Collector

First we see the below terminologies (synonym for each collector)which makes easier to understand the various collectors.

Serial turns for only one thing happens at a time. For example, even when multiple CPUs are available, only one is utilized to perform the collection.

It is opposite to serial. When parallel collection is used, the task of garbage collection is split into parts and those sub-parts are executed simultaneously, on different CPUs. This simultaneous operation makes the collection to be done more quickly. But it costs for fragmentation.

When the memory for garbage objects is freed, the free space may appear in small chunks in various areas such that there might not be enough space in one contiguous area to be used for allocation of large object. Such behavior is known as fragmentation. To eliminate such fragmentation, a method is used compaction.

When stop the world collection is performed, execution of the application is completely suspended during the collection. This is easy compared to concurrent since the heap is frozen and objects are not changing during the collection. The drawback is, the delay it causes due to suspension of application.

It is opposite to stop the world collector. The execution of application is not completely suspended, that is the GC has executed simultaneously with the application. Though it executes concurrently, it has few short stop the world pauses. The advantage is the minimal of pause time here. But the drawback is, the complexity of handing the objects in heap (since the application is not stopped and keep on updating the objects)in run-time.

As I told above, compaction is the option for fragmentation. To combine the small chunks and make a big free space, compact collector is used. Though it takes some time for compaction of space, the next object allocations will be much faster since it need not to search the space based on the object size. Once the garbage collector determined which object is live object and which are garbage, it can compact the memory, moving all live objects together and completely reclaiming the remaining memory. A simple pointer can be utilized to keep track of the next location available for object allocation.

In contrast to compact, it is just releases the space utilized by garbage objects in-place. ie, it does not move all live objects together and create a large reclaimed collection. The advantage is, it does not need time to compact the space but the drawback is potential fragmentation.

Now we see the garbage collection algorithms in detail.

Serial Collector:
It is the default collector used by JVM for client side machines. With the serial collector, the young generation is handled with stop-the-world collector. But the old and permanent generations are collected using “mark-sweep-compact collection algorithm“. In the mark phase, the collector identifies which objects are still alive. The sweep phase, sweeps over the generations (moving) to identify garbage. In compact phase, the collector performs sliding the live objects towards the beginning of the old generation space and leaves any free space in a single continuous chunk at the opposite end.
Where to use: It is a good choice for most applications that are run on client style machines (not server class machines) that do not have a requirement for low pause times.

Parallel Collector:
As defined the meaning of parallel above, it take the advantage of available CPUs rather than leaving them idle which only one does garbage collection.
Like serial, it uses stop-the-world for young generation but in parallel (performs multiple CPUs). For old and permanent generations, it uses the same “mark-sweep-compact algorithm“.
Where to use: Applications that are run on machines with more than one CPU and do not have pause time constraints.

Parallel Compacting Collector:
For young generation – uses the same way it does for parallel collector.
For old and permanent – it slightly differs here with parallel collector. It uses stop-the-world with sliding compaction. First, each generation is logically divided into fixed-size regions. In mark phase, the live objects are marked in parallel with information about the size and location of the object. In summary phase, determines the density of the regions (remember we splitted fixed regions) starting with the leftmost one, until it reaches a point where the space that could be recovered from a region. The right side regions then compacted to eliminate fragmentation.

Concurrent Mark-Sweep (CMS) Collector:
Young generation – It is collected with the same way parallel collector is doing.
Old generation – The initial mark phase, it marks the live objects. In concurrent marking phase, there is no guarantee that all live objects marked since the application also running simultaneously. To handle this, the application stops again for a second pause, called remark which finalizes marking by revisiting any objects that were modified during the concurrent marking phase. In the final sweep phase, reclaims the garbage that has been identified.

One of feature that Java 1.5 introduced is Ergonomics. That is Java gives an option for us to select the type of collector. Previous to this release, we don’t have option to change the default collector. Though it gives an option to change the default values, the default values for garbage collector, heap size, HotSpot virtual machine gives better results in many times.

Below are the JVM arguments to improve the GC performance based on the applications.

-XX:+UseSerialGC –> use serial collector
-XX:+UseParallelGC –> use parallel collector
-XX:+UseParallelOldGC –> use parallel compacting collector
-XX:+UseConcMarkSweepGC –> use concurrent mark sweep collector
-XX:MaxGCPauseMillis –> maximum pause time goal is specified with this command line option
-XX:+PrintGCDetails –> For every collection, this results in the output of information (live objects before and after GC, total available space, length of GC time)
-XX:+PrintGCTimeStamps –> Outputs a timestamp at the start of each collection.

Evaluate JVM Performance:
There are many tools available to analyze the JVM performance. Few are, jmap, jstat, hprof(heap profiler), HAT(Heap Analysis Tool).