Tuesday, July 7, 2009

How Hotspot Decides to Clear SoftReferences

I got asked about this twice in one day, and I didn't know the answer, so I sat down and puzzled it out a bit.

A SoftReference is a reference that the garbage collector can decide to clear if it is the only reference left to an object, and the GC decides (through some undefined process) that there is enough memory pressure to warrant clearing it. SoftReferences are generally used to implement memory-sensitive caches. A mistake many people make is to confuse it with a WeakReference, which is a reference that, if it is the only reference left to the object, the garbage collector will aggressively clear.

I got asked how Hotspot's GC actually decides whether to clear SoftReferences. Twice. On the same day. (Hotspot is the Sun / OpenJDK JVM). The answer was that I had no idea, so I had to go look it up. Here is what I found out; I hope it is a useful reference for someone (pun intended).

First, there is a global clock variable that is set with the current time (in millis) every time a garbage collection occurs. Every SoftReference has a timestamp field that is set to the current value of clock when it is accessed (when it is constructed or the get()) method is called. This gives a very coarse ordering over the SoftReferences; the timestamp indicates the last GC before they were accessed.

When a garbage collection occurs, the decision to clear a SoftReference is based on two factors:
  1. How old the reference's timestamp is, and
  2. How much free space there is in memory.
The calculation is pretty simple. If:
  • free_heap is the amount of free heap space in MB,
  • interval is the time between the last GC's clock and the timestamp of the ref we are currently examining, and
  • ms_per_mb is a constant number of milliseconds to keep around a SoftReference for each free megabyte in the heap
Then the decision is made by:
interval <= free_heap * ms_per_mb
To take an example, let's say that we have a SoftReference with a timestamp of 2000ms, the last GC's clock time was 5000ms, the ms_per_mb is 1000 and the free space is 1MB. We then test whether:
5000 - 2000 <= 1 * 1000
This is false (3000 > 1000), so we clear the reference.

Now let's say there is more free space — say, 4MB — and so less reason to clear the SoftReferences. The calculation is now
5000 - 2000 <= 4 * 1000
This is true (3000 <= 4000), so we don't clear the reference.

One thing to notice about this is that it implies that SoftReferences will always be kept for at least one GC after their last access. Why is that? Well, for the interval, we are using the clock value of the last garbage collection, not the current one. As a result, if a SoftReference has been accessed since the last garbage collection, it will have the same timestamp as that garbage collection, and the interval will be 0. 0 <= free_heap * 1000 for any amount of free_heap, so any SoftReference accessed since the last garbage collection is guaranteed to be kept. This is actually how this question came up; some of my colleagues notices their SoftReferences weren't being cleared during a garbage collection, and they didn't know why.

ETA: The above paragraph originally said "one Full GC", not "one GC". In our case, it was one full GC, because the objects being allocated were too big to be allocated in the young generation. In most cases, it will be one GC of the generation in which the object was allocated, which will usually be the new generation / TLAB.

Another thing to notice is that the value of 1000 for ms_per_mb is fairly arbitrary. It can be adjusted with the JVM flag -XX:SoftRefLRUPolicyMSPerMB=n. If you adjust it down, then free_heap * ms_per_mb will be smaller, and so SoftReferences are more likely to be cleared. If you adjust it up, you get the opposite effect.

22 comments:

mindas said...

You said "SoftReferences are generally used to implement memory-sensitive caches". I was actually thinking about this myself - why caches don't get implemented using Weak/Soft references, then a good chunk of algorithm becomes so easy: you don't have to worry about item removal from the cache (map generally).

So I asked this question to Gil Tene (Azul Systems) at Devoxx last year. His answer was short - "you don't want your algorithm to be heavily dependent on GC behavior, do you?". This is actually very interesting point as GC can indeed change from one JVM to another.

Could you elaborate on this?

Markus Kohler said...

Very good post!
IMHO using Softreferences in an application server is evil, because of this very coarse grained way to configure their behaviour.
Note also that the mechanism implies that Softreferences will (in low memory situations) only be reclaimed after more than one full GC.

Jeremy Manson said...

@mindas - This is precisely the problem that my colleagues encountered. They really want an LRU cache for what they are trying to do, and they tried to achieve it cheaply using SoftReferences.

Sadly, there is no easy way to do what they do - clear memory in response to memory pressure. You could, for example, do it by checking the available memory every so often in a thread, and having that thread clear part of your cache, but then you have to worry about timing and concurrency issues.

Speaking of timing and concurrency issues, another problem with using them for caching is that it is not immediately obvious how to make a Map with them that is thread-safe. Fortunately, my fellow Googlers Bob Lee and Kevin Bourrillion have come up with a very handy class for writing all sorts of caches, including Weak/SoftReference ones. I think it should probably end up in the JDK in one form or another.

@Markus - agreed about them being evil. It is sadly true that there is no other really good way of doing what they do in Java.

Sam said...

You actually can measure memory pressure directly with the

http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryMXBean.html

Using that you can watch the heap sizes post-GC and take actions based on the data you collect.

Brian White said...
This comment has been removed by the author.
Mark said...

Just to clarify the issue regarding SoftReferences always being kept for at least one GC after their last access, rather than one full GC. Does this mean that the global clock variable is set on every GC rather than just on a full GC?

Jeremy Manson said...

@mark - Any GC. Sorry about that!

Paul Baclace said...

Here is a related question that I is resistant to easy answers:

GC interacts with Reference and ReferenceQueue. When a SoftReference referent is cleared by GC, is there a guarantee that no thread can see the old reference, except for PhantomReference?

Does GC have any happens-before guarantees?

The memory model docs and Java Concurrency in Practice (Brian Goetz, et al) are silent on this. Bill Pugh once wrote that Reference.referent should be volatile to prevent a data race, but jdk1.6 source shows it is not.

Jeremy Manson said...

@paul - GC itself doesn't create any happens-before guarantees.

When a SoftReference referent is cleared by GC, it is guaranteed that only SoftReferences refer to it (if you look around for the definition of "softly reachable", you will find this). The SoftReferences to an object are all cleared atomically. Therefore, no other thread will have a reference to the object when it is cleared.

Bill is correct about the fact that the referent should be volatile; this is mostly to share it between threads before it is cleared, though. There is no danger of seeing a reference to a cleared object (this would violate several other guarantees that the VM provides)

Paul Baclace said...

"referent should be volatile [...]"

Looking again at the Bill Pugh statement (from java memory model list, 2004):
``
The only cases that we really need to worry about are:

thread 1 calls clear, thread 2 calls get

GC clears reference, thread 2 calls get

which are handled by making the referent field volatile.
''

The first case is about clearing the reference before GC does it. In the absence of a volatile referent, a subclass of SoftReference can wrap get() and clear() using synchronized(this) to prevent data races.

The second case caused my concern about lack of explicit GC happens-before guarantees. I think the atomicity requirements of Reference implies happens-before, but it is not part of the list of happens-before cases in the memory model doc.

Another concern about Reference.referent not being volatile: This appears to be an example of unsafe publication in "Java Concurrency in Practice" sec. 3.5.1

But in practice, if thread 1 constructs a SoftReference, and thread 2 uses that Reference, as long as there is a happens-before between threads 1 and 2, the memory caches should be synchronized w.r.t. the referent. For example, if the SoftReference is passed from thread 1 to thread 2 via a ConcurrentHashMap as a stored value, the referent should be intact.

Jeremy Manson said...

@Paul - it is only unsafe publication if multiple threads touch the SoftReference. In the post you mention, Bill was talking about making Reference and friends thread-safe. This was never done. There is nothing in the JavaDoc that guarantees its thread safety.

Now, arguably, someone should take up the flag for this. I'll talk to some members of the core libraries cabal.

maninder said...

Apologies as this post does not directly relate to the subject. I was wondering if there are any good open source tools for analyzing the output from gc details emitted –verbose GC -XX:+PrintGCDetails ?

Jeremy Manson said...

@maninder - Nope, sorry. Google has its own tool to do that, but we're unlikely to have the time / energy needed to make it open-source. Having said that, it isn't very complicated - it is just a python script to parse the output.

Paul Baclace said...

For GC visualization, look at gcviewer:

www.tagtraum.com/gcviewer.html

I used it a lot it 2006-2007. The zoomable diagrams it makes are very cool looking and worth studying.

(When I used it in 2006-2007; it would complain about some log lines produced by newer GC modes, but was still useful.)

Golovach Ivan said...

Hello, Jeremy.
When you say "The SoftReferences to an object are all cleared atomically. Therefore, no other thread will have a reference to the object when it is cleared."
You mean No-References-At-All or only No-References-and-SoftReferences?
Why i ask - because i read article = Jonathan Amsterdam, Adjunct Associate Professor of Computer Science at New York University and president of Astrel - "Working with the garbage collection"[http://www.ddj.com/showArticle.jhtml?articleID=184404000&queryText=finalise] and read so "This problem occurs because the clearing of a soft reference does not imply that the object is completely unreachable. When a weak reference is cleared, there is truly no way to reach the object (not even through another weak reference -- the specification requires that all weak references to an object are cleared atomically)."

Jeremy Manson said...

@Ivan - I should have been more precise - I wasn't really thinking about WeakReferences when I wrote that. It is true that WeakReferences can also refer to softly reachable objects when the VM decides to clear them. The WeakReferences will also get cleared if the VM decides to clear SoftReferences.

This all happens atomically, though, so the atomicity property doesn't really enter into it.

Anurag Saxena said...

I suppose instead of 5000 - 2000 <= 1 * 1000 correct expression would be 5000 - 3000 <= 1 * 1000

Jeremy Manson said...

@Anurag - yes, thanks. Fixed.

InJeNiErO said...

Hi, I think it is worth to take a look at this related bug to SoftReference and GarbageCollection: SoftReferences cause worst-case garbage collection

Luca Garulli said...

I'd like to catch when the GC passes and clear soft references. Can I hook the GC? I tried to create a dummy SoftReference to intercept when it's enqueued (by extending the enqueue() method) but I got OutOfMemory and my enqueue() method is never called. Do you have a way to be called when the GC needs memory?

Jeremy Manson said...

@Luca: You can't run code when garbage collection is actually happening. You can use finalizers to simulate some of this, but you have to do it carefully.

Arend Rensink said...

Very useful information, but the parameter "how much free space there is in memory" is more ambiguous than one might suspect. This point has also been remarked upon by InJeNiErO: the "free space in memory" is calculated on the basis of what will be free after clearing all non-soft references. If that is a large amount (it may well run into the 100s of MB or even GBs) then the soft references will only be cleared after minutes, in spite of many full GCs. -XX:SoftRefLRUPolicyMSPerMB is your friend!