JIE: Solving the Robustness Problem
Posted by rbpasker on April 18, 2008
In his latest polemic, Herr Hildebrand interprets my robustness problems as a dichotomy between the Thread and the Process. (Just joking, Hal!) Like earlier authors, I will also refrain from providing precise definitions, except to bastardize Justice Potter Stewart’s old saw, “I know a [process|thread] when I see one.”
Well, Hal is exactly right that the process model has huge robustness benefits over threads due to the memory and scheduler isolation enforced between processes by the OS and MMU. He is also correct that much robustness is lost when running many applications in a multi-threaded environment (like the JVM), creating operational behavioral dependency among applications that didn’t exist at design time.
But rather than bowing to the false architectural dichotomy between the Process Model and the Thread Model, I propose an architecture that uses each of them to their best advantage, using processes to separate applications, and threads to separate transactions within applications, and also proposing a number of improvements to the JVM that would make this hybrid model as cheap to run as the monolithic JVM model we currently had.
Before continuing, let’s imagine for a minute a JVM that has some special properties:
- it has no transactional state, which most people would read as “share nothing”
- the transactional state is cached locally, and the ACI (but not D) cache is shared across all processes via shared memory
- a fully operational JVM can be launched from scratch, ready to process transactions, in, say, under a second
With such a JVM, it is not hard to imagine the proposed solution: the runtime architecture consist of one jar file (i.e., one application) per JVM, and that each JVM handles multiple simultaneous transactions in separate threads. An errant application could then be recycled by recycling the entire JVM around it. For greater robustness within a single application, multiple JVMs could run the same jar.
What you now have is a mini-cluster that provides the robustness of the Process Model with the multiprocessing and “forking” speed of the Thread Model.
In order to achieve this, the JVM would:
- have to be “pickle-able,” by which I mean that you can run a VM, get it to a steady state ready to process transactions, and then pickle it to disk, just like a VMWare image.
- support sharable read-only data, for all code and constant data
- the ability to pass TCP endpoints around among processes, so that a dispatcher process can funnel transactions to the right JVM without a copy.
I know some of these things have been worked on in the past, but I wonder if some of the specialized JVM vendors who have a stake in enterprise software (Oracle/WebLogic/JRockit, Azul Systems, Sun, IBM) shouldn’t start looking at this problem again.