Saturday, August 16, 2014

Garbage Collection in Java

Java relies on garbage collection to release memory. Developers coming from a C/C++ background know that many of the toughest bugs in those systems are memory bugs. Java attempts to solve that problem by making garbage collection part of the platform. Consider the following code:
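String[] tempArray = new String[100]; // first array, referenced by tempArray
tempArray = new String[50];           // tempArray now refers to a new array; the first is unreachable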
tempArray initially points to a String array of size 100. On the next line we make it point to an array of size 50. The first array is no longer referenced by any variable. Although it is of no further use to the program, it still occupies memory. One way to solve this is to release the memory programmatically, as is done in C/C++. In Java this is done automatically by the garbage collector. In its simplest form, the collector pauses the application and, starting from a set of roots (local variables on thread stacks, static fields and so on), follows references to find every object that is still reachable. An object that cannot be reached from any root becomes eligible for garbage collection. The same holds for a set of objects that reference only each other: since nothing outside the set refers to them, the whole set is eligible.
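The reachability idea described above can be sketched with a toy object graph. This is only an illustration: the Node class and the mark method are hypothetical names invented here, and a real JVM works on its own heap structures rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of reachability analysis; not how any real JVM is implemented.
class ReachabilitySketch {

    // Hypothetical stand-in for a heap object that holds references to other objects.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Marks every node that can be reached from the roots by following references.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its outgoing references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node live = new Node("live");
        Node child = new Node("child");
        live.refs.add(child);

        // a and b reference each other, but nothing reachable from a root refers to them.
        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(b);
        b.refs.add(a);

        Set<Node> marked = mark(Collections.singletonList(live));
        for (Node n : Arrays.asList(live, child, a, b)) {
            System.out.println(n.name + (marked.contains(n) ? " is reachable" : " is garbage"));
        }
    }
}
```

Anything left unmarked after the walk is garbage in this model, including the a/b pair that only reference each other.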
Early garbage collectors were simple. At a set interval, or when some condition was met, they would run and examine every object. Modern collectors are smarter: they take advantage of the fact that objects have different lifespans. Objects are divided into young generation objects and tenured generation objects, and memory is managed in pools by generation. Whenever the young generation pool fills up, a minor collection takes place, and objects that keep surviving are moved to the tenured generation pool. When the tenured generation pool fills up, a major collection takes place. A third generation, known as the permanent generation, holds data describing classes and methods. (Java 8 removed the permanent generation and keeps class metadata in native memory instead, in an area called Metaspace.)
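To see which pools a running JVM actually has, one option is the standard java.lang.management API. The sketch below simply lists them. The pool names it prints depend on the JVM version and on the collector in use, so no particular output should be assumed; the class name MemoryPoolLister is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MemoryPoolLister {

    // Prints every memory pool the JVM reports and returns how many there are.
    static int printPools() {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();   // may be null if the pool is no longer valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getName() + " [" + pool.getType() + "] used=" + usedKb + " KB");
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        printPools();
    }
}
```

On a generational collector the list usually includes an eden pool, a survivor pool and an old (tenured) pool, alongside non-heap pools.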
Garbage collection is tuned mainly by giving the collector goals, an approach known as ergonomics. The goals may not always be met, but they tell the collector the boundaries within which it should operate.
  • Maximum pause time goal - Tells the GC how long the application may be paused for garbage collection. The goal is specified with -XX:MaxGCPauseMillis=<n>, where n is a time in milliseconds. The garbage collector records an average and a variance of the pause times. If pauses exceed the goal, the GC tries to run more often so that it has less work to do in each cycle, which shortens the pauses. In a generational collector, the average and the variance are tracked separately for the collections of each generation.
  • Throughput - Describes how much of the total time the application spends doing actual work rather than garbage collection. The goal is specified with -XX:GCTimeRatio=<n>, where the ratio of garbage collection time to application time is 1 / (1 + <n>). If the goal is not met, the GC may increase the size of the generations so that collections are needed less often.
  • Footprint - Once the pause time and throughput goals are both met, the GC tries to reduce the footprint, that is, the size of the heap, as far as it can while still meeting those two goals.
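The 1 / (1 + <n>) formula for the throughput goal is easy to misread, so here is a small worked sketch. The class and method names are hypothetical helpers for the arithmetic only; they are not part of any JVM API.

```java
class GcTimeRatioSketch {

    // Fraction of time the throughput goal allows for GC, given -XX:GCTimeRatio=n.
    static double gcTimeFraction(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        return 1.0 / (1.0 + n);
    }

    public static void main(String[] args) {
        // A few sample values of n, chosen only for illustration.
        for (int n : new int[] {4, 19, 99}) {
            System.out.printf("GCTimeRatio=%d allows about %.1f%% of time in GC%n",
                    n, gcTimeFraction(n) * 100.0);
        }
    }
}
```

For instance, n = 19 gives 1/20, so the goal is to spend no more than about 5% of the time in garbage collection. A larger n asks for less time in GC.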

Measuring Garbage Collection
-verbose:gc - Prints a line of information for each garbage collection
-XX:+PrintGCDetails - Prints additional detail about each collection
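Besides these flags, collection counts and times can also be read from inside the application through the standard GarbageCollectorMXBean interface. The following is a minimal sketch. The collector names it prints vary by JVM and by the collector selected, and the class name GcStatsSketch is made up here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStatsSketch {

    // Prints per-collector statistics and returns the total number of collections so far.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if the collector does not report it
            long millis = gc.getCollectionTime();   // approximate accumulated time in ms
            System.out.println(gc.getName() + ": collections=" + count + ", time=" + millis + " ms");
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Allocate short-lived arrays so the JVM has a reason to collect (it may still choose not to).
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.out.println("total collections so far: " + totalCollections());
    }
}
```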
Heap Parameters
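-XX:MinHeapFreeRatio - Minimum percentage of a generation that should be free after a collection. If the free space falls below it, the generation is expanded.
-XX:MaxHeapFreeRatio - Maximum percentage of a generation that should be free after a collection. If the free space rises above it, the generation is shrunk.
-Xms - Initial (minimum) heap size
-Xmx - Maximum heap size
-XX:NewRatio - Ratio of the tenured generation to the young generation. For example, -XX:NewRatio=3 makes the tenured generation three times the size of the young generation.
-XX:SurvivorRatio - Ratio of eden to one survivor space. For example, -XX:SurvivorRatio=6 makes each survivor space one sixth the size of eden.
-XX:+PrintTenuringDistribution - Prints the tenuring threshold and the ages of objects in the young generation.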
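The two ratio flags are simple arithmetic on the heap size. The sketch below works through one example, assuming the usual HotSpot layout of one eden and two survivor spaces in the young generation. The helper names are hypothetical, and a real JVM rounds and aligns these sizes, so actual values will differ slightly.

```java
class GenerationSizingSketch {

    // With -XX:NewRatio=r, young : tenured = 1 : r, so young = heap / (r + 1).
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // With -XX:SurvivorRatio=s, eden : survivor = s : 1 and there are two survivor
    // spaces, so each survivor space = young / (s + 2).
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMb = 1024;   // e.g. -Xmx1024m, chosen only for illustration
        long young = youngSize(heapMb, 3);
        long survivor = survivorSize(young, 6);
        long eden = young - 2 * survivor;
        System.out.println("young=" + young + " MB, tenured=" + (heapMb - young) + " MB");
        System.out.println("eden=" + eden + " MB, each survivor=" + survivor + " MB");
    }
}
```

With a 1024 MB heap, NewRatio=3 and SurvivorRatio=6, this gives a 256 MB young generation split into 192 MB of eden and two 32 MB survivor spaces.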
Types of collectors:
  • Serial Collector - The traditional collector. It does its work on a single thread, and the application is paused while garbage collection takes place.
  • Parallel Garbage Collector - Uses multiple threads at the same time to do garbage collection, so it can take advantage of multiple processors on the machine.
  • Concurrent Collector - Does most of its work while the application is still running, sharing processor resources with it. This results in shorter garbage collection pauses.
