CopyOnWriteArrayList concurrency fun

9

I mentioned CopyOnWriteArrayList in the prior concurrency bugs post. Coincidentally, I happened to need to look at the source for it today and it relies on some nice but not particularly obvious concurrency code.

There are a couple of interesting patterns to ensure within the code that both reads and writes are safe. The key to both is that the data for the array is stored internally in a volatile array and that the values in the array are immutable. Both the volatile and the immutable aspects are important.

Reads against the list do not use synchronization. Instead they rely on always seeing the most recently modified version of the internal array due to the volatile. As you may or may not know, volatile arrays do not have volatile elements (probably a good topic for a concurrency bug post itself). So, a write to an index in the array would not necessarily be seen. However, the array is always immutable so this is never an issue. So that covers reads.

Writes are different. To protect against simultaneous writes, any mutation methods are synchronized (similar to Vector). Internally, the modification is done by first copying the original list, performing the modification, then swapping the volatile array with the new array. So the synchronization protected writes against writes. And the volatile array reference guarantees that reads and writes see a consistent array.

Iterators are supposed to see an old snapshot, so how does that happen? Well, when the iterator is created, it just gets a reference to the current array. Since it’s not going back to the variable in the original list, if that list replaces the array with a new instance, the iterator is no longer seeing the live array, just the array at the time it was created.

Good stuff.

Comments

9 Responses to “CopyOnWriteArrayList concurrency fun”
  1. Shay Banon says:

    This is exactly the same trick for implementing a copy on write hash map as well.

    By the way, does terracotta support this type of implementation? Does it support volatile semantics?

  2. Steve says:

    Might be interesting to talk about clustered CopyOnWriteArrayList. Their were some interesting discussions around it.

  3. Alex says:

    Well, you might infer that I’m looking at it for a reason… :)

    Terracotta does not (yet) support it but I am looking at adding it. Because all access to the immutable array goes through one method, it is fairly easy to write lock every write method and read lock the one read method to get a valid distributed version. Due to Terracotta greedy locks, reads should continue to generally be fast (no network overhead) but writes will be an even greater cost than before due to obtaining a cluster-wide write lock.

    Terracotta has no way to support volatile the way the JVM supports it (since the changes are occurring in the cluster, not at the hardware). So, clustered volatiles need to be changed inside synchronization just like any other clustered state. We actually have internal support that can do this automatically for you (wrap a field-level write lock around a volatile) but no one has gotten around to exposing it in our config yet.

  4. Tim Eck says:

    This is a complete nitpick but the array in there isn’t really immutable…it is perfectly mutable (all arrays are) but rather it is not mutated. Immutable has a very specific meaning to me obviously :-)

  5. Alex says:

    @Tim: Absolutely right as usual.

  6. Alex says:

    @Taylor: The Clojure data structures differ in an important regard from the Java CopyOnWriteArrayList. On a write, COWAL copies the entire data structure, which makes writes suck. The Clojure data structures are “persistent” and use structural sharing. Instead of copying the entire data structure, they reuse almost all of the existing data structure, which means that writes are actually very efficient. The key thing of course is that the Clojure structures are always immutable which allows them to safely be reused in whole and part.

  7. Taylor Gautier says:

    @Alex – oh, I know, I said “related” :) The Clojure system reminds me of the NetApp filesystem, actually. And also some of the experimental ideas for a COWTreeMap Geert and I dreamt up so long ago…

  8. Jed Wesley-Smith says:

    sorry, previous comment got stuffed up…

    Our package has both CopyOnWriteMap and CopyOnWriteSortedMap. These implementations can use any required implementation as the internal storage, and so support HashMap, LinkedHashMap, TreeMap etc. – as long as the “non-mutative” methods do not actually modify the internal structure, such as a LinkedHashMap with access ordering turned on.