JavaOne: G1 Garbage Collector
The Sun HotSpot guys have been working on a new garbage collector called G1 to replace CMS. This presentation went over the differences between the old CMS and the new G1 collectors and also included some perspective from a guy at the Chicago Board Options Exchange (CBOE) who has been beta testing it.
CMS divides the world into the young and old generations. This is done to take advantage of the observation that the lifetime of objects is highly uneven – the vast majority of objects die young glorious deaths and a very small number of objects live for a very long time (effectively the life of the app). Also important is that there tend to be very few references from the old generation to the young generation. Because of this, it’s ok to focus our collection attention on the young gen.
In CMS, new objects are created in the young generation, which is further broken up into eden and two survivor spaces. A young gen GC finds the live objects and puts them either in a survivor space or in the old generation, depending on age. Old gen GC is mostly concurrent but uses a stop-the-world pause to finish up, and another for reference marking. The old gen is not compacted, so it fragments over time; the sweep phase finds the holes and manages them in free lists. As a fallback, there is a full stop-the-world collection with compaction.
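To make the "survivor space or old generation, depending on age" part concrete, here is a minimal toy sketch of a young collection. It is not how HotSpot is implemented: the class names, the `live` flag (a real collector discovers liveness by tracing from roots) and the threshold of 3 are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a generational young collection. All names are hypothetical.
class YoungGenSketch {
    // Assumed threshold: an object that survives this many young GCs is promoted.
    static final int TENURING_THRESHOLD = 3;

    static final class Obj {
        final String name;
        final boolean live; // stand-in for "reachable"; a real GC finds this by tracing
        int age = 0;
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>();
    List<Obj> toSurvivor = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();

    // New objects always start out in eden.
    Obj allocate(String name, boolean live) {
        Obj o = new Obj(name, live);
        eden.add(o);
        return o;
    }

    // Copy live objects out of eden and the occupied survivor space.
    void youngCollect() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (Obj o : candidates) {
            if (!o.live) continue;          // dead objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                oldGen.add(o);              // old enough: promote
            } else {
                toSurvivor.add(o);          // still young: keep in a survivor space
            }
        }
        eden.clear();
        fromSurvivor.clear();
        // The two survivor spaces swap roles for the next collection.
        List<Obj> tmp = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        heap.allocate("temp", false);
        heap.allocate("cache", true);
        for (int i = 0; i < 3; i++) heap.youngCollect();
        System.out.println("old gen size: " + heap.oldGen.size());
    }
}
```

The short-lived object never gets copied at all, which is why young collections are cheap when most objects die young.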
G1 (“garbage first”) takes a different approach: all memory (except perm gen) is broken into 1 MB “regions”. Young and old are each made up of a set of non-contiguous regions, and those sets change over time. During a young GC, the survivors in a region are copied either to a new young gen region or to an old gen region, as appropriate.
In G1's old generation GC, there is one stop-the-world pause to mark. If any region is found to contain no live objects, the region is immediately reclaimed (this happens more frequently than you’d expect due to locality). Live data from the remaining old regions is then compacted into a new old region. Old gen collections are piggybacked on young gen collections.
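Here is a rough sketch of the old gen steps just described: reclaim regions with no live data outright, then pick some of the remaining regions and copy their live data into a fresh region. Picking the regions with the most garbage first is my reading of the "garbage first" name rather than something spelled out above, and `Region`, `reclaimEmpty` and `chooseCollectionSet` are hypothetical names, not HotSpot's.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of region-based old gen collection. All names are hypothetical.
class RegionSketch {
    static final int REGION_BYTES = 1 << 20; // 1 MB regions, as in the talk

    static final class Region {
        final int id;
        final int liveBytes; // assumed to be known once marking has finished
        Region(int id, int liveBytes) { this.id = id; this.liveBytes = liveBytes; }
        int garbageBytes() { return REGION_BYTES - liveBytes; }
    }

    // Removes regions with no live data from the list and returns them as freed.
    static List<Region> reclaimEmpty(List<Region> oldRegions) {
        List<Region> freed = new ArrayList<>();
        for (Iterator<Region> it = oldRegions.iterator(); it.hasNext();) {
            Region r = it.next();
            if (r.liveBytes == 0) {
                freed.add(r);
                it.remove();
            }
        }
        return freed;
    }

    // Assumption: prefer the regions with the most garbage (least live data to copy).
    static List<Region> chooseCollectionSet(List<Region> oldRegions, int count) {
        List<Region> sorted = new ArrayList<>(oldRegions);
        sorted.sort((a, b) -> Integer.compare(b.garbageBytes(), a.garbageBytes()));
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }

    public static void main(String[] args) {
        List<Region> old = new ArrayList<>();
        old.add(new Region(0, 0));
        old.add(new Region(1, 900_000));
        old.add(new Region(2, 100_000));
        old.add(new Region(3, 500_000));

        List<Region> freed = reclaimEmpty(old);
        List<Region> cset = chooseCollectionSet(old, 2);
        int evacuated = 0;
        for (Region r : cset) evacuated += r.liveBytes; // data compacted into a new region
        System.out.println("freed outright: " + freed.size());
        System.out.println("live bytes copied into a new old region: " + evacuated);
    }
}
```

In this example two mostly-empty regions are evacuated into one, so the collection frees a whole region in addition to the one that was already empty.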
G1 tracks references into a region with a technique called “remembered sets”. Every region has a small data structure (<5% of total heap) that reduces the work needed to do marking. A region's remembered set contains all external references into that region (references within the region are not included).
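A remembered set can be pictured as a per-region collection of "who points at me from outside". The sketch below is a deliberately simplified model, not the actual HotSpot data structure: it records raw field addresses in a hash map, and `recordWrite` is a hypothetical hook standing in for whatever the VM runs when a reference field is updated.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of per-region remembered sets. All names are hypothetical.
class RememberedSetSketch {
    static final int REGION_BYTES = 1 << 20; // 1 MB regions, as in the talk

    // Maps a (made-up) heap address to the index of the region containing it.
    static int regionOf(long address) {
        return (int) (address / REGION_BYTES);
    }

    // region index -> addresses of fields in OTHER regions that point into it
    private final Map<Integer, Set<Long>> rememberedSets = new HashMap<>();

    // Called (in this model) whenever the field at fieldAddress is set to targetAddress.
    void recordWrite(long fieldAddress, long targetAddress) {
        int from = regionOf(fieldAddress);
        int to = regionOf(targetAddress);
        if (from == to) {
            return; // references within a region are not remembered
        }
        rememberedSets.computeIfAbsent(to, k -> new HashSet<>()).add(fieldAddress);
    }

    // The external references a collector would need to look at to collect this region.
    Set<Long> incomingReferences(int region) {
        return rememberedSets.getOrDefault(region, Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        // A field in region 5 points at an object in region 2: remembered.
        heap.recordWrite(5L * REGION_BYTES + 16, 2L * REGION_BYTES + 64);
        // A field in region 2 points at another object in region 2: ignored.
        heap.recordWrite(2L * REGION_BYTES + 8, 2L * REGION_BYTES + 128);
        System.out.println("incoming refs for region 2: " + heap.incomingReferences(2).size());
    }
}
```

The payoff is that a single region can be collected by consulting only its own remembered set instead of scanning the rest of the heap for pointers into it.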
After this initial layout by Tony Printezis (who was entertaining and explained things well), Paul Ciciora talked about how they test things at CBOE. Probably most important, Paul said it is still a work in progress and not production-ready yet.
One interesting item from the Q&A was that this will definitely be in Java SE 7 (probably committed in the next few weeks) and that it will also be released in a Java 6 update.

Hi! My name is Alex Miller and I live in St. Louis. I write code for a living and currently work for
Thanks for all these updates from JavaOne!
I can’t stand the suspend… please tell me
“Also important is that there tend to be very references from the old generation to the young generation.”
@hung: few
@hung, @radu – sorry about that, fixed. Too much fast typing this week!
Thanks for the summary – very interesting. However, what’s the difference between the G1 approach and train garbage collection? It sounds very similar: Small “train wagons” that are in turn collected and reused, and “remember sets” that keep track of what other objects point to the current wagon. Train GC had really small individual garbage collection time slots, and thus was great for real-time systems (in principle), but was removed from Java because it simply performed too bad overall, as far as I know.
Thanks Alex for the heads up information. Useful for analysis of the new JDK and help easier decision making whether to adapt the new version or not.
Sounds very interesting. Would be nice if we can see some benchmarking details also.