Cassandra: Garbage Collector Introduction
In this blog post we discuss garbage collection in Cassandra 3.0 and later. Tuning garbage collection matters a great deal in Cassandra: done well, it can improve performance by as much as ~5 times and save a significant amount of money.
What is Garbage Collection?
Garbage collection, also known as GC, is an important feature of Java. It is the mechanism Java uses to de-allocate unused memory, that is, to reclaim the space consumed by objects that are no longer in use. To do this, the garbage collector tracks all the objects that are still in use and marks the rest as garbage.
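To see this happening in a running JVM, the standard java.lang.management API can report which collectors are active and how much work they have done. The sketch below is only an illustration: the class name GcReport and the helper describeCollectors are made up for this post, and the allocation loop is just a cheap way to create some garbage.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    // Hypothetical helper: one summary line per collector registered in this JVM.
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: %d collections, %d ms total",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived arrays; each one becomes unreachable (garbage)
        // as soon as its loop iteration ends.
        for (int i = 0; i < 200_000; i++) {
            byte[] scratch = new byte[1024];
        }
        describeCollectors().forEach(System.out::println);
    }
}
```

The collector names printed depend on which GC the JVM was started with, so the output is a quick way to confirm what a node is actually running.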
Choosing the Right Java Garbage Collector
Cassandra 3.0 and later use either the G1 or the CMS (Concurrent Mark Sweep) garbage collector.
We recommend using the G1 garbage collector under the following conditions:
- The heap size is between 16 GB and 64 GB.
- G1 performs better than CMS on larger heaps because it scans the regions of the heap containing the most garbage first and compacts the heap on the go. CMS does not compact during normal operation and falls back to a stop-the-world full collection, pausing the application, when the heap becomes too fragmented.
- G1 is self-tuning and easy to configure.
- The workload is variable, i.e. the Cassandra cluster performs different kinds of work over time.
- CMS was deprecated in Java 9 and removed entirely in JDK 14.
CMS (Concurrent Mark Sweep) should be used under the following conditions:
- The heap size is 16 GB or less.
- The workload is fixed, i.e. the Cassandra cluster performs the same kind of work all the time.
- The application requires the lowest possible latency; G1 incurs some latency due to profiling.
- You have the expertise and time to manually tune and test garbage collection. Be aware that allocating more memory to the heap can result in diminishing performance, as the garbage collection facility increases the amount of Cassandra metadata in heap memory.
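The two checklists above can be condensed into a small decision helper. This is a rough sketch, not an official rule: the class CollectorChoice and its methods are hypothetical names, and it assumes CMS is picked only when every CMS condition holds (small heap, fixed workload, time to tune), with G1 as the default otherwise. The lists give no guidance for heaps above 64 GB, so defaulting to G1 there is also an assumption. The two flags returned are the standard HotSpot switches for each collector.

```java
public class CollectorChoice {
    public enum Collector { G1, CMS }

    // Hypothetical helper: CMS only when all CMS conditions from the list hold,
    // otherwise G1. Heaps above 64 GB are outside the post's guidance; returning
    // G1 for them is an assumption of this sketch.
    public static Collector recommend(int heapGb, boolean fixedWorkload, boolean canTuneManually) {
        if (heapGb <= 0) {
            throw new IllegalArgumentException("heapGb must be positive");
        }
        boolean cmsFits = heapGb <= 16 && fixedWorkload && canTuneManually;
        return cmsFits ? Collector.CMS : Collector.G1;
    }

    // Maps the choice to the HotSpot flag that enables that collector.
    public static String jvmFlag(Collector collector) {
        return collector == Collector.G1 ? "-XX:+UseG1GC" : "-XX:+UseConcMarkSweepGC";
    }

    public static void main(String[] args) {
        System.out.println(jvmFlag(recommend(32, false, false))); // prints -XX:+UseG1GC
        System.out.println(jvmFlag(recommend(8, true, true)));    // prints -XX:+UseConcMarkSweepGC
    }
}
```

In Cassandra 3.0 and later the chosen flag would normally go into conf/jvm.options. Keep in mind that JDK 14 and newer no longer support the CMS flag.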