Actors in Groovy

I was reading this morning about GParallelizer, an actor library in Groovy, and while it has a pretty face, I think it misses the point of using an actor model.

To me, the heart of the actor model is to break free from a per-thread model and instead use larger numbers of lightweight processes (lighter weight than threads) and pass data between them. Some crucial aspects include:

  • message-passing – the key thing with message-passing is to make it thread-safe and efficient. One way to do that is with immutable and structurally shared data structures that minimize copying and guarantee thread-safety. Other ways are either slower (due to copying) or trickier (due to needing to understand synchronization).
  • multiple-producer, single-consumer queue – the actor mailbox needs to be optimized internally for concurrency in this scenario
  • pattern-matching – pattern-matching needs to work in tandem with the actor mailbox and possibly the scheduler to avoid blocking things that can do useful work or spending excessive time trying to decide whether the actor has a message to process
  • scheduler – efficient user-space process scheduling of a large number of processes over a smaller number of kernel threads
  • task preemption (basically continuations) – this is necessary so that only a subset of actors needs to be running at any given time; the scheduler must be able to preempt one process and let another run
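
To make the mailbox and scheduler bullets concrete, here is a minimal sketch in Java. All names in it (ActorSketch, MiniActor, runDemo, BATCH) are hypothetical and are not the API of GParallelizer or any other library. It assumes a ConcurrentLinkedQueue is good enough as the multiple-producer, single-consumer mailbox, and it fakes preemption by having each actor give up its thread after a bounded batch of messages. That is cooperative yielding, not the real continuations discussed above, and pattern-matching and immutable messages are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch: many actors multiplexed over a few kernel threads.
final class ActorSketch {

    static final class MiniActor<M> implements Runnable {
        private static final int BATCH = 16; // assumed batch size before yielding the thread

        // Many senders may add concurrently; only this actor ever polls.
        private final ConcurrentLinkedQueue<M> mailbox = new ConcurrentLinkedQueue<>();
        // True while the actor is queued on, or running in, the pool.
        private final AtomicBoolean scheduled = new AtomicBoolean(false);
        private final ExecutorService pool;
        private final Consumer<M> behavior;

        MiniActor(ExecutorService pool, Consumer<M> behavior) {
            this.pool = pool;
            this.behavior = behavior;
        }

        void send(M message) {
            mailbox.add(message);
            trySchedule();
        }

        // Submit the actor to the pool unless it is already scheduled.
        private void trySchedule() {
            if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
                pool.execute(this);
            }
        }

        @Override
        public void run() {
            try {
                for (int i = 0; i < BATCH; i++) {
                    M message = mailbox.poll();
                    if (message == null) {
                        break;
                    }
                    behavior.accept(message);
                }
            } finally {
                scheduled.set(false);
                trySchedule(); // pick up messages that arrived while we were running
            }
        }
    }

    // Sends messagesPerActor messages of value 1 to each actor and returns the grand total.
    static int runDemo(int actorCount, int messagesPerActor) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int[] counts = new int[actorCount]; // slot i is only ever touched by actor i
        CountDownLatch done = new CountDownLatch(actorCount * messagesPerActor);
        List<MiniActor<Integer>> actors = new ArrayList<>();
        for (int i = 0; i < actorCount; i++) {
            final int id = i;
            MiniActor<Integer> actor = new MiniActor<>(pool, n -> {
                counts[id] += n;
                done.countDown();
            });
            actors.add(actor);
        }
        for (int m = 0; m < messagesPerActor; m++) {
            for (MiniActor<Integer> actor : actors) {
                actor.send(1);
            }
        }
        try {
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for actors");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(1000, 10)); // 1000 actors share 4 threads
    }
}
```

The point of the scheduled flag is that an actor occupies a thread only while it has mail, so the number of actors is decoupled from the number of threads. What the sketch cannot do is interrupt a behavior that blocks or runs long, which is exactly where continuations come in.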

The reason that Erlang works so well for efficient actor-based programming is that its developers have been working on all of these aspects for 20 years. Getting some of them to work well on the JVM is challenging because the JVM lacks support for continuations. I think Scala has done an excellent job of leveraging the extensibility of the language to implement these ideas. In Java, you need to resort to bytecode modification at either compile time or run time to achieve effective continuations (both of which have been done in Java actor frameworks).

The merge-sort example given with GParallelizer is an interesting one: I think it’s a perfect example of a computation-heavy divide-and-conquer algorithm that could be solved more cleanly with something like the fork/join framework, which will likely be included in Java 7 as part of JSR 166y.
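
For comparison, here is roughly what a fork/join merge sort looks like, written against the java.util.concurrent classes as they eventually shipped (RecursiveAction and ForkJoinPool arrived in Java 7; ForkJoinPool.commonPool() needs Java 8). This is a sketch of the general technique, not the GParallelizer example rewritten. The class name ForkJoinMergeSort and the THRESHOLD cutoff are my own assumptions.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical sketch: sorts data[lo, hi) by splitting, sorting halves in parallel, then merging.
final class ForkJoinMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 1_024; // assumed cutoff below which we sort sequentially

    private final int[] data;
    private final int[] scratch; // shared buffer; each task only touches its own [lo, hi) slice
    private final int lo;
    private final int hi;

    ForkJoinMergeSort(int[] data, int[] scratch, int lo, int hi) {
        this.data = data;
        this.scratch = scratch;
        this.lo = lo;
        this.hi = hi;
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(
                new ForkJoinMergeSort(data, new int[data.length], 0, data.length));
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(data, lo, hi);
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Fork both halves and wait for them; idle workers steal the pending half.
        invokeAll(new ForkJoinMergeSort(data, scratch, lo, mid),
                  new ForkJoinMergeSort(data, scratch, mid, hi));
        merge(mid);
    }

    private void merge(int mid) {
        System.arraycopy(data, lo, scratch, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            data[k++] = scratch[i] <= scratch[j] ? scratch[i++] : scratch[j++];
        }
        while (i < mid) {
            data[k++] = scratch[i++];
        }
        while (j < hi) {
            data[k++] = scratch[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 4, 7, 8, 9]
    }
}
```

There are no mailboxes and no messages here: the tasks share one array and communicate only by joining, which is why this shape of problem fits fork/join more naturally than actors.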

To me, the actor model is more appropriate for architectures that have many independent actors (i.e. far more than the number of threads), driven by messages, effectively implementing state machines. You could map these in an OO world to objects receiving method calls, and there are a lot of similarities. Those kinds of setups require the qualities listed above to work efficiently. The benefit is that because there is no hard mapping of actor to thread, the system can achieve greater scale by just increasing the number of cores (and therefore the number of truly parallel kernel threads) and having the scheduler manage the mapping, without changing the program. It’s useful to note that at a low level, the actor ultimately still needs something like fork/join, and in fact I believe the Scala actors are built on top of the fork/join framework.
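
As a small illustration of "driven by messages, effectively implementing state machines", here is a hypothetical turnstile actor in Java. TurnstileActor, its states, and its messages are invented for this example. To keep the focus on the state machine, the mailbox is a plain single-threaded ArrayDeque drained by hand, which is an assumption that a real actor runtime would replace with a concurrent mailbox and a scheduler.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: an actor whose behavior is a two-state machine.
final class TurnstileActor {
    enum State { LOCKED, UNLOCKED }
    enum Message { COIN, PUSH }

    private State state = State.LOCKED;
    private int admitted = 0;
    // Single-threaded stand-in for a real actor mailbox (assumption for brevity).
    private final Queue<Message> mailbox = new ArrayDeque<>();

    void send(Message message) {
        mailbox.add(message);
    }

    // Process every queued message in arrival order.
    void drain() {
        Message message;
        while ((message = mailbox.poll()) != null) {
            receive(message);
        }
    }

    // The next state depends only on the current state and the message.
    private void receive(Message message) {
        switch (state) {
            case LOCKED:
                if (message == Message.COIN) {
                    state = State.UNLOCKED;
                }
                break; // PUSH while locked is ignored
            case UNLOCKED:
                if (message == Message.PUSH) {
                    admitted++;
                    state = State.LOCKED;
                }
                break; // extra COIN while unlocked is ignored
        }
    }

    State state() {
        return state;
    }

    int admitted() {
        return admitted;
    }

    public static void main(String[] args) {
        TurnstileActor turnstile = new TurnstileActor();
        turnstile.send(Message.COIN);
        turnstile.send(Message.PUSH);
        turnstile.send(Message.PUSH); // ignored: already locked again
        turnstile.drain();
        System.out.println(turnstile.state() + " " + turnstile.admitted()); // LOCKED 1
    }
}
```

The similarity to an object receiving method calls is easy to see: send plays the role of the call, except that the caller never waits and the actor decides when to handle it.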

Comments

4 Responses to “Actors in Groovy”
  1. James Iry says:

    Regarding continuations and byte code manipulation:
    Scala’s actors are just an ordinary library in Scala. The tricky bit is that they use exceptions as one shot upwards continuations in order to trampoline away from unbounded stack growth. A Java library could do the same; there’s just no way you could make it as pretty in Java as it is in Scala.

    And you’re right, the underlying thread-pool mechanism for Scala’s actors is fork/join style work stealing queues.

  2. Kirk Wylie says:

    Something else to bear in mind here w.r.t. the JVM is that Green Threads (a big part of the Erlang VM) are something that the JVM had big-time back in the day, and backed off of. Reintroduction I think would be key to getting more efficient support for actor-based concurrency into the JVM.

  3. @Kirk

    If the actor doesn’t take ages to complete, the actor ‘scheduler’ can run multiple actors in a single context switch.

    It depends a lot on the internals of the actor framework if user space threads are required.

  4. Vaclav Pech says:

    Thanks for uncovering a bit of Scala’s implementation of event-driven actors. I hope I’ll be able to mimic that in Groovy.
    In my opinion, having green threads on the JVM would make my life with GParallelizer much easier, since I would get loose coupling between actors and system threads for free.