Hello Terracotta
I’m getting spooled up at Terracotta this week, and I thought it would help others if I blogged a simple HelloTerracotta program that illustrates some useful points about Terracotta and how it works.
Below is a simple class that illustrates a couple of interesting things about Terracotta:
package test;

import java.util.HashSet;
import java.util.Set;

public class HelloTerracotta {
    private final Set root;

    public HelloTerracotta() {
        if (root != null) { // Item #1
            System.err.println("map is already non-null, size " + root.size());
        }
        Set newSet = new HashSet();
        root = newSet; // Item #2
        if (root != newSet) { // Item #3
            System.err.println("root assignment was ignored");
        }
    }

    private void go() throws InterruptedException {
        for (int i = 0; i < 30; i++) {
            synchronized (root) { // Item #4
                root.add(root.size());
                System.err.println("root is now size " + root.size());
                Thread.sleep(500);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        new HelloTerracotta().go();
    }
}
When you run this as a normal unclustered program without Terracotta, it creates a set and adds a bunch of numbers to it:
root is now size 1
root is now size 2
root is now size 3
…etc
Now let's make the root field shared between VMs. Terracotta does this by using a configuration file in combination with a JVM wrapper (not by modifying the source code). So, we'll create a configuration file named tc-config.xml to set up sharing and locking:
<?xml version="1.0" encoding="UTF-8"?>
<con:tc-config xmlns:con="http://www.terracotta.org/config">
  <servers>
    <server host="%i" name="localhost">
      <dso-port>9510</dso-port>
      <jmx-port>9520</jmx-port>
      <data>terracotta/server-data</data>
      <logs>terracotta/server-logs</logs>
    </server>
  </servers>
  <clients>
    <logs>terracotta/client-logs</logs>
  </clients>
  <application>
    <dso>
      <instrumented-classes>
        <include>
          <class-expression>test.HelloTerracotta</class-expression>
        </include>
      </instrumented-classes>
      <roots>
        <root>
          <field-name>test.HelloTerracotta.root</field-name>
        </root>
      </roots>
      <locks>
        <autolock>
          <method-expression>* test.*.*(..)</method-expression>
          <lock-level>write</lock-level>
        </autolock>
      </locks>
    </dso>
  </application>
</con:tc-config>
We then need to start up a Terracotta server using a Terracotta distribution (assumed to be in the path):
start-tc-server.sh
and start up a client using the Terracotta client wrapper which works just like your favorite Java executable. Here we'll assume our project is set up with src, bin, and config directories and we are executing from the root of the project. The one extra parameter we need is to indicate the location of the Terracotta config file.
dso-java.sh -Dtc.config=config/tc-config.xml \
    -classpath bin \
    test.HelloTerracotta
When we do this, we'll see the same output as before, but there will be an important difference. The root we are adding to is now shared and managed by the server process. Because of this, the objects are managed across clients. So, we can actually run the client again and we'll see something much different:
map is already non-null, size 30
root assignment was ignored
root is now size 31
root is now size 32
...etc
So that was different! Let's look back at the code to see what happened. If you look at "Item #1" in the code, you'll notice that in the second run of the program, root is non-null here before anything has been done to set it. That's because we declared the test.HelloTerracotta.root field to be a root field in the tc-config.xml configuration file. This makes this field "super-static", which means that it is managed by the server and changes are shared across all clients and across restarts. So, here we see that not only is the root non-null, but it still contains all the data from our first run of the program!
Similarly, in "Item #2", note that only the first non-null assignment to a root field takes effect. So, when the program is run the first time, assigning root actually has an effect, but on subsequent executions this line does nothing. The root has already been set, so it is not replaced with a new empty set. This also means that in "Item #3" we can detect that the assignment was ignored (note the line in the output). These can be strange things to see if you are not familiar with what Terracotta is doing.
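To make the "first assignment wins" behavior concrete, here is a rough plain-Java analogy. It is only a sketch and not how Terracotta implements roots: the class name FirstAssignmentWins, the SHARED_ROOT field, and the assignRoot helper are all made up for illustration, and an AtomicReference inside one JVM stands in for the server-managed root.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Plain-Java analogy only: SHARED_ROOT stands in for the server-managed root field.
public class FirstAssignmentWins {
    private static final AtomicReference<Set<Integer>> SHARED_ROOT = new AtomicReference<>();

    // Tries to install candidate as the root and returns the set that is actually in effect.
    public static Set<Integer> assignRoot(Set<Integer> candidate) {
        SHARED_ROOT.compareAndSet(null, candidate); // succeeds only while the root is still null
        return SHARED_ROOT.get();
    }

    public static void main(String[] args) {
        Set<Integer> first = new HashSet<>();
        Set<Integer> second = new HashSet<>();
        // The first assignment takes effect, so the root is the set we passed in.
        System.out.println(assignRoot(first) == first);   // prints true
        // The later assignment is ignored, which mirrors the check in Item #3.
        System.out.println(assignRoot(second) == second); // prints false
    }
}
```

The identity comparison after the assignment plays the same role as the `root != newSet` check in the constructor: it tells the caller whether its own set became the root.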
In Terracotta, objects are shared across the cluster without using Java serialization. Instead, the bytecode instructions that modify fields are intercepted so that only field level changes are transmitted across the wire. This has a number of benefits: classes don't have to be Serializable/Externalizable, object identity is maintained, and performance is greatly enhanced by sending far less data over the wire. Right now, the important thing to know is that classes need to be instrumented for Terracotta to intercept the bytecodes and take the appropriate actions. Due to this, you will see in the tc-config.xml file above that the HelloTerracotta class must be instrumented.
The next interesting thing to look at is what we need to do to modify data in the root. The root is shared across all clients, and because Terracotta preserves Java semantics with respect to the Java Memory Model, modifying the root requires proper Java synchronization. So we must synchronize on the root object before modifying it. At execution time, locking on the root object synchronizes access to root across all threads in the cluster. That means that if you run two clients at the same time, they will compete for access to root's monitor, just like two threads in a single VM would. Thanks to the sleep() in the code, you can watch the locking occur if you run two clients: they will ping-pong back and forth. (Obviously, you don't normally want to put sleep() calls in your synchronized blocks!)
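For comparison, here is the single-VM version of that contention: two threads taking turns on one set's monitor. This is an ordinary Java sketch with no Terracotta involved. The class name MonitorContention and the thread names are invented for illustration, and the two threads merely play the part that two clustered clients would play.

```java
import java.util.HashSet;
import java.util.Set;

public class MonitorContention {
    // Runs two writer threads against one set and returns the final size.
    public static int run(int addsPerThread) {
        final Set<Integer> root = new HashSet<>();
        Runnable writer = () -> {
            for (int i = 0; i < addsPerThread; i++) {
                synchronized (root) {        // both threads compete for the same monitor
                    root.add(root.size());   // read-then-write is safe while holding the lock
                }
            }
        };
        Thread a = new Thread(writer, "client-A");
        Thread b = new Thread(writer, "client-B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return root.size();
    }

    public static void main(String[] args) {
        System.out.println("final size " + run(30)); // prints "final size 60"
    }
}
```

Because every add happens under the lock, each call inserts a value that is not yet in the set, so no updates are lost and the final size is always twice the per-thread count.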
We should also look back at tc-config.xml to see how the locking behavior was established. To make the synchronized block on root behave like a normal Java lock, but cluster-wide, we specified a method expression inside an autolock element. Synchronization on shared objects within any method matching that expression automatically becomes a cluster-wide "write" lock. There are a variety of other lock levels that give higher performance or different behavior in other scenarios.
I hope that was a useful first introduction to illustrate a few key points about Terracotta - shared root fields, shared objects without serialization, and cluster-wide locking. Of course, I'm just scratching the surface of all of these topics (and learning them myself). As I learn more, I'll share it with you as well!
If you have questions or want more info, please post to the comments and I'll do my best to answer or find someone who can.
Also, many thanks to Tim Eck for writing the first version of this simple program and walking me through the details!!

Hi! My name is Alex Miller and I live in St. Louis. I write code for a living and currently work for
Nice introduction to Terracotta. A couple of questions: if multiple clients are started simultaneously, how is the shared root initialized? Will the behavior be predictable, or will there be contention and data races?
You mentioned there are more granular ways of locking, but for the example you gave, with test.*.*, will it synchronize the whole go() method, or is just the synchronized block made cluster-wide? I’m guessing it is not the go() method, as that is an instance method and the object it belongs to is not a shared object, so Terracotta cannot possibly synchronize cluster-wide on one instance of HelloTerracotta.
Great questions, Kishore!
When Terracotta instruments the byte code of this class, cluster-wide locking will be inserted around the get and set of the root field. Essentially the first one that reaches the set will be the winner, although due to the semantics of the super-static root field, callers can check whether they were the first or not by doing the equality check seen in the code. This would allow you to decide whether to do some one-time initialization at that point.
The locking expression will apply autolocks only to code that is already synchronized (methods or blocks) on shared objects. So, previously unsynchronized code matched by the expression is not made synchronized as a result. Your analysis is correct that the HelloTerracotta instance itself is not shared and cannot be used to lock cluster-wide.
Alex,
Your answer is correct, but strictly speaking, this is how I would describe what Terracotta does when told to lock the go method:
It will add instrumentation around every instance of synchronizable code, so your description that the method and any blocks will be instrumented is correct.
The instrumentation is always there, so what happens if you had used the expression “synchronized (this)” instead of “synchronized (root)”?
The “synchronized (this)” instrumented version would acquire a local monitor but not a cluster lock. Then the root.add() method would execute and Terracotta would discover that a write was being made to an object that is shared, but there is no active Terracotta lock, and would tell you so by emitting a RuntimeException – UnlockedSharedObjectException.
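The two variants discussed in this comment can be put side by side in a small sketch. The class WrongMonitor and its method names are hypothetical, and in a plain JVM both methods run without error. The difference described above (an UnlockedSharedObjectException for the first variant) would only show up when the field is configured as a Terracotta root, which this sketch does not exercise.

```java
import java.util.HashSet;
import java.util.Set;

public class WrongMonitor {
    // Imagine this field is declared as a Terracotta root in tc-config.xml.
    private final Set<Integer> root = new HashSet<>();

    // Locks the instance, which is not a shared object, rather than the root.
    // Per the comment above, under Terracotta the write below would be rejected
    // because no cluster lock is held; in a plain JVM it simply succeeds.
    public int addWhileHoldingThis() {
        synchronized (this) {
            root.add(root.size());
            return root.size();
        }
    }

    // Locks the shared object itself, as the original go() method does.
    public int addWhileHoldingRoot() {
        synchronized (root) {
            root.add(root.size());
            return root.size();
        }
    }

    public static void main(String[] args) {
        WrongMonitor demo = new WrongMonitor();
        System.out.println(demo.addWhileHoldingThis()); // prints 1 in a plain JVM
        System.out.println(demo.addWhileHoldingRoot()); // prints 2
    }
}
```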
Hi all,
I don’t think using a final Set would work; that will definitely be reported as an error by the compiler.
Best regards,
@Mouad: When Terracotta instruments the bytecode, it will remove the “final” modifier for roots, so this is not an issue.
If I used private static final Set, what’s the behaviour difference, if any?
thanks!
None in this case – since that field is a Terracotta root, it is effectively “static” in that the field value is shared by all instances already. The main difference is that you’re still bound by Java language conventions on how and where that field is accessed in static vs non-static context so sometimes it’s more convenient to do one vs the other.