Introduction to Terracotta
1/5/2009

I've been struggling to explain how much of a leap forward Terracotta is for Java programming. Terracotta is a cluster in the fullest sense of the word, but "cluster" brings to mind painful years of parallel development: systems that required extensive setup and sweeping code changes. Terracotta is different.
Terracotta works at the JVM level, silently modifying the behavior of the classes you choose to share. Suddenly you can use a ConcurrentHashMap across the cluster, without code changes, in an efficient and transactional way. You get all of the benefits of a clustered environment, yet you can run your own code unchanged -- or even a vendor's product. Terracotta supports many common Open Source projects, and you can run containers from Tomcat to WebLogic on your cluster. Or all at once, if you prefer.
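To make that concrete, here's a minimal sketch of the kind of plain Java class Terracotta can cluster. The class and field names are hypothetical and nothing in the code refers to the cluster; under Terracotta the `registry` field would be declared a shared root in the configuration, and the same code runs unchanged:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Plain Java, no Terracotta API. With clustering configured, the
// 'registry' field would hold one logical map for the whole cluster;
// without Terracotta it is just an ordinary in-process map.
public class SharedRegistry {
    private static final ConcurrentMap<String, String> registry =
            new ConcurrentHashMap<>();

    public static void put(String key, String value) {
        registry.put(key, value);
    }

    public static String get(String key) {
        return registry.get(key);
    }
}
```

The point is the absence of anything cluster-specific: the sharing decision lives in configuration, not in the source.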
Terracotta is also sometimes described as "network attached memory." That's apt, but it doesn't, you know, sound cool enough. For many, it takes a bit of experimentation with Terracotta before the realization hits: share memory and _do anything_. A program that doesn't use memory isn't very interesting; most programs exist to do some computation on memory or data.
The pain starts when you need to share that memory with another process or system. There have been all sorts of methods. We had CORBA, but that was a bummer because of the IDL files. We've had RMI, Mule, big enterprisey XML, and 10,000 other ways to simply move a piece of data. Many programmers even threw up their hands and used the database as an Inter-Process Communication (IPC) method, storing memory that had no business in the database.
Those systems can often be replaced by a simple LinkedBlockingQueue. If every node on your cluster starts a thread that waits on the queue, then you can simply put data in the queue for other nodes to retrieve. Check out the tim-messaging (TIM: Terracotta Integration Module) project on the Forge for a great library that adds all of the goodies needed for a reliable solution.
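Here's a sketch of that queue-as-IPC pattern. The names are mine and everything below runs in a single JVM; under Terracotta the queue field would be declared a shared root, so the same consumer loop would run on every node:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Each cluster node starts a consumer thread that blocks on take();
// any node can publish work with put(). In one JVM this is ordinary
// producer/consumer code -- which is exactly the point.
public class QueueBus {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public void startConsumer(Consumer<String> handler) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    handler.accept(queue.take()); // blocks until work arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit cleanly on shutdown
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    public void publish(String message) throws InterruptedException {
        queue.put(message); // some consumer, on some node, will take it
    }
}
```

Libraries like tim-messaging layer reliability concerns (redelivery, ordering guarantees) on top of this basic shape.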
Even better, Terracotta lets you program as if you're on a single JVM and deploy to the cluster. Sharing memory in your cluster is no more difficult than sharing in a single process. That means setting up and testing a message bus or writing for the cluster can easily be done on a single developer's machine.
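Deployment then comes down to configuration rather than code. As a rough, from-memory sketch of what a Terracotta DSO configuration might look like (the exact schema varies by Terracotta version, and `com.example.SharedRegistry` is a hypothetical class standing in for your own):

```xml
<tc:tc-config xmlns:tc="http://www.terracotta.org/config">
  <servers>
    <server host="localhost" name="tc-server"/>
  </servers>
  <application>
    <dso>
      <!-- Declare which fields become cluster-wide shared roots -->
      <roots>
        <root>
          <field-name>com.example.SharedRegistry.registry</field-name>
        </root>
      </roots>
      <!-- Tell Terracotta which classes it may instrument -->
      <instrumented-classes>
        <include>
          <class-expression>com.example..*</class-expression>
        </include>
      </instrumented-classes>
    </dso>
  </application>
</tc:tc-config>
```

Because none of this touches the source, you can develop and test against an ordinary single JVM and add the configuration when you deploy.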
If that doesn't sound like a huge leap forward, consider that I once spent several weeks trying to get my personal OpenMOSIX cluster to migrate processes. I never did figure out what black magic they used, but finally, after a week or two of hacking, I started seeing my jobs move across the cluster. And for all my effort... it took twice as long as the single process version.
Of course, that was long before I'd even used Java, back when I still thought C was the only language I'd ever need. Never did I imagine simply declaring a LinkedBlockingQueue and having it shared across an entire cluster.
Caching
The Achilles heel of parallel and cluster programming is usually the data. If a job to be executed across the cluster requires every node to run a SQL statement, then any benefits to the cluster model are quickly lost. The database will quickly become the bottleneck that no amount of cluster optimizations will solve.
That's why I recommend using Terracotta first as a distributed cache. After all, once your data is in the cluster, acting on it from several nodes becomes a real possibility. Instead of shipping data around (and probably spending far too much time deciding which node gets what chunk and sending it there), the program can simply act on the cache. Terracotta has already spent the time making that as efficient as possible.
Terracotta makes for a great distributed cache solution, too. It ensures that all of your transactions are ACID (Atomicity, Consistency, Isolation, Durability) -- the same standard databases are held to -- while letting programmers use familiar Java language constructs to manage memory. This makes it possible to rely less on the database, or even consider killing it altogether.
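One example of those familiar constructs: under Terracotta DSO, an ordinary `synchronized` block can be promoted (via lock configuration in tc-config.xml) to a cluster-wide lock, so cache updates stay atomic across JVMs. A minimal sketch with hypothetical names -- the code is plain Java and works unchanged in a single process:

```java
import java.util.HashMap;
import java.util.Map;

// A tiny cache of counters. With Terracotta, a lock rule covering these
// methods would turn the object's monitor into a distributed lock, making
// increment() atomic across the whole cluster -- with no code changes.
public class CachedCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    public synchronized int increment(String key) {
        int next = counts.getOrDefault(key, 0) + 1;
        counts.put(key, next);
        return next;
    }

    public synchronized int get(String key) {
        return counts.getOrDefault(key, 0);
    }
}
```

The design choice worth noticing: the concurrency idiom you already know from single-JVM Java is the same idiom you use for the cluster.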
Unlike many solutions, Terracotta doesn't depend on the slow Java serialization process to manage objects in the cache. It's tightly integrated with the JVM, so it can make far smarter optimizations. For example, when a cluster node requests a map entry from the cache, Terracotta doesn't have to ship the entire object tree; it makes entries available only as they're needed. That alone saves metric tons of network traffic.
Terracotta has many other great features as a cache. The documentation is quite fond of calling the cluster nodes the L1 layer and the cluster server the L2, similar to how a CPU cache works, and for good reason. As memory fills up on a cluster node, the client can flush its less frequently used memory. The server is responsible for the consistency of the data, so the cluster client is never relied on for data integrity. The server process also keeps track of its hot memory; as the server JVM's memory fills up, it can persist its own data to disk and remove old data from its heap. You can even configure checkpoints on the disk storage for backup purposes.
All this means a node can easily access a 2G Map instance with a much smaller client heap. Performance won't match purely local access, since there is overhead that depends on how the memory is used, but the most frequently requested data on a JVM stays in main memory. Unlike memcached and similar solutions, getting a value is nearly as fast as using the Java heap -- because it often is on the heap.
Terracotta's redundancy also allows programmers to depend on their cache. After all, if the program is running, then the cache is working. This is a far cry from many other solutions that discourage design patterns like asynchronous database writes or even enumerating the cache's keys because the cache is not persistent. With the reliability of Terracotta backing the cache, such a design is easy to envision. There's even an integration module aimed at just such a use.
High Availability
All this effort adding caching didn't go to waste. While you weren't looking, your web applications also gained some great High Availability (HA) features. Terracotta's engineers were probably bored after they solved concurrent programming.
Terracotta Sessions will store and manage session data, so moving from one server to another doesn't force users to log into the application again. Session state can be preserved and trusted. With the addition of a nice load balancer, scaling is simply a matter of adding another node to the cluster. Not only does the distributed cache ease the load on the database layer, but scaling the application layer is completely transparent to the end user.
Also, since session data is managed by Terracotta, server admins can now bring down cluster nodes for maintenance without disrupting the application. In fact, you can even patch and upgrade Terracotta itself without stopping the whole cluster.
For the cluster itself, Terracotta can be configured to withstand the loss of a server. Data integrity is ensured at all levels.
Open Source
As if creating a complete end-to-end solution for some of the most difficult scaling problems in Java applications wasn't enough, Terracotta is also Open Source, has great documentation, and their developers respond quickly to forum questions.
It is a much easier sell to management and other developers when there's no up-front cost to experimenting. Of course, before moving to production most organizations will require some kind of training and support, which Terracotta also provides.
I encourage you to download and play with the example applications. They'll give you a good taste of what's possible. Terracotta is truly a leap forward. With a little experimentation, you'll quickly discover simple solutions for whole classes of difficult problems.
Watch this space for upcoming articles on Terracotta. The first is tentatively titled, "Terracotta: a tale of two projects."