A week ago Oracle massively exceeded expectations, completely crushing the hopes of the doomsday crowd. Part of those results stems from the growth in Fusion Middleware, which includes BEA WebLogic. This past quarter I saw a big jump in the number of accounts switching from WebSphere back to WebLogic on the high end, and (quite surprisingly, considering the economic climate) from JBoss to WebLogic. I could chalk it up solely to a great sales organization, but there are some really interesting things at work that have made a huge difference in our ability to win business with WebLogic:
WebLogic is getting a higher level of technology investment as part of Oracle than it has seen in a long time, and those investments are starting to pay real rewards for our customers.
We showed a plan for WebLogic and the other BEA products. We have been steadily delivering on the plan. Customers appreciate the “ability to execute” combined with actual execution.
Several of the technologies in the Fusion Middleware stack just can’t be found elsewhere. Start with JRockit, the real-time JVM that dominates all of the performance records (see here, here, here and here, for example). Add reliable cross-platform, multi-portal session clustering with single sign-on support via Coherence*Web, and the ability to globally load-balance a single application across multiple data centers. Top it off with end-to-end operational monitoring. Wow!
Now that sonic.net has added self-service subdomains, you can read this blog at: http://blog.pasker.net . I would have added theabstracttruth.pasker.net as well, but WordPress wants $10 for each separate domain, and it didn’t seem worth it.
I recently gave a presentation to the technical management and architects at a Very Large Web Property on messaging systems, AKA Message-Oriented Middleware, AKA Enterprise Services Bus, etc.
This was not meant to be an introduction, because they already have messaging systems in use. Rather, I wrote this presentation (in about 15 minutes) as a springboard to remind everyone what messaging systems are capable of, so we could talk further about how to capitalize on the products they were already using.
Wow, 3.5 million tweets per week, not counting locked and direct tweets. That’s about 350 tweets PER MINUTE (rules of thumb: ~10K minutes in a week; ~PI*10^7 seconds in a year). Even if the number of private tweets and direct messages drives the volume 5x to 1750 tweets per minute, it’s still not much. A 1750-person company delivers an average of 1 email message/minute/person. Even Exchange can handle that.
The data feed produced by the Options Price Reporting Authority carries the quotes and trades from the U.S. options exchanges. The current projections for OPRA are here, but let me republish the table for you:
Yes, that’s 701,000 messages PER SECOND, or about 424 BILLION messages per week.
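To make the comparison concrete, here is a small back-of-the-envelope sketch. The class and method names are mine, purely for illustration; the only inputs are the two figures quoted above (3.5 million tweets per week and 701,000 OPRA messages per second), and the weekly OPRA number assumes the peak rate is sustained around the clock, which a real market feed would not do.

```java
// Back-of-the-envelope rate conversion (hypothetical helper, not from any library).
class MessageRates {
    static final long MINUTES_PER_WEEK = 7L * 24 * 60;      // 10,080
    static final long SECONDS_PER_WEEK = 7L * 24 * 60 * 60; // 604,800

    // Converts a weekly message count to whole messages per minute (rounded down).
    static long perMinute(long perWeek) {
        return perWeek / MINUTES_PER_WEEK;
    }

    // Converts a per-second rate to a weekly total, assuming the rate never drops.
    static long perWeek(long perSecond) {
        return perSecond * SECONDS_PER_WEEK;
    }

    public static void main(String[] args) {
        long tweetsPerWeek = 3_500_000L; // figure quoted for Twitter
        long opraPerSecond = 701_000L;   // figure quoted for OPRA

        System.out.println("Tweets per minute: " + perMinute(tweetsPerWeek));
        System.out.println("OPRA messages per week: " + perWeek(opraPerSecond));
        System.out.println("OPRA week / Twitter week: " + perWeek(opraPerSecond) / tweetsPerWeek);
    }
}
```

With exact constants instead of the 10K rule of thumb, the Twitter figure comes out at 347 per minute and the OPRA figure at 423,964,800,000 per week.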
The net is atwitter with solutions to the “Twitter Problem,” which is that Twitter keeps going down, and it is annoying the bloggers themselves, who use it as a broadcast platform.
But there are really a bunch of different problems, many of which will be solved independently:
Decentralization, a la DNS, of the network of servers/services providing Twitter-like functionality, for redundancy.
Scalability, using a server-side implementation that can handle the volume of messages on the busiest nodes efficiently. I’ve proposed Tervela’s product as the hub.
Inverting the protocol from “pull” to “push,” to reduce the amount of unproductive polling that takes place.
A universal addressing scheme, based on existing standards, like RFC 822’s mailbox addressing (this is what Jabber uses). A URL-based system would work, too.
Proper language mappings (such as Actors) that are designed to handle large numbers of messages with minimal overhead.
A common API/wire format, hopefully something much simpler than XMPP, to make it simple to write applications that can participate in this network.
Gateways to other messaging systems, such as email and SMS.
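As a rough illustration of the “push” inversion combined with mailbox-style addressing, here is a minimal in-process sketch. Everything in it (the PushHub class, its methods, the example.org addresses) is hypothetical and invented for this example; a real system would push over the network and would not deliver synchronously on the publisher’s thread.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory hub: followers register once and are called back on
// each new message, instead of polling the author's timeline.
class PushHub {
    // Followers keyed by the author's mailbox-style address, e.g. "alice@example.org".
    private final Map<String, List<Consumer<String>>> followers = new ConcurrentHashMap<>();

    // Registers a callback that receives every message the author publishes.
    void follow(String author, Consumer<String> follower) {
        followers.computeIfAbsent(author, k -> new CopyOnWriteArrayList<>()).add(follower);
    }

    // Pushes the message to every registered follower and returns how many received it.
    int publish(String author, String message) {
        List<Consumer<String>> targets =
                followers.getOrDefault(author, Collections.emptyList());
        for (Consumer<String> target : targets) {
            target.accept(message);
        }
        return targets.size();
    }

    public static void main(String[] args) {
        PushHub hub = new PushHub();
        hub.follow("alice@example.org", m -> System.out.println("bob received: " + m));
        hub.publish("alice@example.org", "hello, decentralized world");
    }
}
```

The point of the sketch is only the shape of the protocol: work is done when a message is published, in proportion to the number of followers, rather than on every poll from every client.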
The good thing is, lots of people are finally starting to think about this problem. The bad thing is, lots of people are finally starting to think about this problem (which means it will take a LONG time to solve).
Well, Hal is exactly right that the process model has huge robustness benefits over threads due to the memory and scheduler isolation enforced between processes by the OS and MMU. He is also correct that much robustness is lost when running many applications in a multi-threaded environment (like the JVM), creating operational behavioral dependency among applications that didn’t exist at design time.
But rather than bowing to the false architectural dichotomy between the Process Model and the Thread Model, I propose an architecture that uses each to its best advantage: processes to separate applications, and threads to separate transactions within applications. I also propose a number of improvements to the JVM that would make this hybrid model as cheap to run as the monolithic JVM model we currently have.
Before continuing, let’s imagine for a minute a JVM that has some special properties:
it has no transactional state, which most people would read as “share nothing”
the transactional state is cached locally, and the ACI (but not D) cache is shared across all processes via shared memory
a fully operational JVM can be launched from scratch, ready to process transactions, in, say, under a second
With such a JVM, it is not hard to imagine the proposed solution: the runtime architecture consists of one jar file (i.e., one application) per JVM, and each JVM handles multiple simultaneous transactions in separate threads. An errant application could then be recycled by recycling the entire JVM around it. For greater robustness within a single application, multiple JVMs could run the same jar.
What you now have is a mini-cluster that provides the robustness of the Process Model with the multiprocessing and “forking” speed of the Thread Model.
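The “recycle the JVM around the application” idea can be approximated today with a plain supervisor process, minus the fast startup and shared memory that the imagined JVM would provide. The sketch below is an assumption-laden illustration: JvmSupervisor, commandFor and supervise are names I made up, and the restart policy (relaunch on any non-zero exit, up to a fixed limit) is just one plausible choice.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical supervisor: one child JVM per jar, relaunched when it dies.
class JvmSupervisor {
    // Builds the command line for one application JVM, using the same Java
    // installation that is running the supervisor.
    static List<String> commandFor(String jarPath) {
        String javaBin = System.getProperty("java.home") + "/bin/java";
        return Arrays.asList(javaBin, "-jar", jarPath);
    }

    // Launches the jar in a child JVM and relaunches it after each abnormal
    // exit, allowing at most maxRestarts relaunches. Returns the launch count.
    static int supervise(String jarPath, int maxRestarts)
            throws IOException, InterruptedException {
        int launches = 0;
        while (launches <= maxRestarts) {
            Process child = new ProcessBuilder(commandFor(jarPath))
                    .inheritIO() // child shares the supervisor's stdout/stderr
                    .start();
            launches++;
            int exitCode = child.waitFor();
            if (exitCode == 0) {
                break; // clean shutdown: nothing to recycle
            }
            System.err.println(jarPath + " exited with " + exitCode + "; recycling");
        }
        return launches;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java JvmSupervisor.java <path-to-jar> <max-restarts>
        supervise(args[0], Integer.parseInt(args[1]));
    }
}
```

Running several supervisors against the same jar gives the mini-cluster described above, at the cost of a full JVM cold start on every recycle, which is exactly the cost the proposed JVM changes are meant to remove.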
In order to achieve this, the JVM would:
have to be “pickle-able,” by which I mean that you can run a VM, get it to a steady state ready to process transactions, and then pickle it to disk, just like a VMware image.
support sharable read-only data, for all code and constant data
be able to pass TCP endpoints around among processes, so that a dispatcher process can funnel transactions to the right JVM without a copy.
I know some of these things have been worked on in the past, but I wonder if some of the specialized JVM vendors who have a stake in enterprise software (Oracle/WebLogic/JRockit, Azul Systems, Sun, IBM) shouldn’t start looking at this problem again.
What I would like to see is an automatic solution to the problem, one that detects such a deadly embrace and chooses a victim to kill. Detecting the deadlock would have to be heuristic, in the sense of watching the locks to see how long they usually take, and considering only those locks which exceed the normal holding time.
In a distributed locking case, such as with Terracotta, the deadlock detector could System.exit(), let the other VM continue along, and the management system would automagically restart the victim VM. It wouldn’t prevent the problem from happening again in 10 seconds, but it might at least ring lots of bells so someone can come look at the problem, rather than having the whole cluster deadlocked. In the single VM case, we’d have to wait for a proper solution to Thread.stop(), which I also talked about yesterday.
The other question I have about deadlock detection is whether some of it can be done via static analysis, but this is not my area of expertise. An alternative would be to use AOP to instrument the locks. I’m sure someone has already done this.
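For the single-VM case, part of the detection half already exists in the standard management API: ThreadMXBean.findDeadlockedThreads() reports threads stuck in a lock cycle. Note that this is exact cycle detection inside one JVM, not the hold-time heuristic described above, and it knows nothing about distributed locks. The sketch below is a minimal watchdog built on it; the DeadlockWatchdog class, the polling period, and the choice to exit the whole VM as the “victim” are my assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical watchdog that polls the JVM for deadlocked threads.
class DeadlockWatchdog {
    // Returns the ids of threads in a lock cycle, or an empty array if none.
    static long[] findDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null means no deadlock found
        return ids == null ? new long[0] : ids;
    }

    // Builds a one-line-per-thread report of who is waiting on what.
    static String describe(long[] ids) {
        if (ids.length == 0) {
            return "";
        }
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder report = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread ended between detection and reporting
            }
            report.append(info.getThreadName())
                  .append(" waits for ").append(info.getLockName())
                  .append(" held by ").append(info.getLockOwnerName())
                  .append('\n');
        }
        return report.toString();
    }

    // Polls until a deadlock appears, then rings the bells and kills this VM
    // so that an external management system can restart it.
    static void watch(long periodMillis) throws InterruptedException {
        while (true) {
            long[] ids = findDeadlocked();
            if (ids.length > 0) {
                System.err.print("DEADLOCK DETECTED\n" + describe(ids));
                System.exit(1);
            }
            Thread.sleep(periodMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        watch(5_000);
    }
}
```

In a server, watch would run on a daemon thread rather than in main, and killing the whole VM remains a blunt instrument compared with killing a single victim thread.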
One of the characteristics that makes application infrastructure so unique is that it is a container for other people’s random code. Java generally does a pretty good job of dealing with loadable code by providing such features as class loaders, a code security model, a component security model, and a threading architecture.
There are, however, still a number of robustness issues with Java as it exists today:
hot code reloading – this is an age-old robustness problem, especially for operational issues like rolling upgrades, and I think the Zero Turnaround guys may have a splendid general purpose solution in their Java Rebel product.
runaway thread healing – These are threads that go into an endless loop or permanent I/O block. We used to have the ability to set an ExecuteThread timeout in WLS, whereby a watchdog timer would kill any thread that didn’t complete within a configurable time period. But then Sun deprecated Thread.stop(), and suggested instead cooperative thread death using a state variable. This abdication of responsibility for robustness from the VM to user code is similar to the cooperative transaction manager timeout, about which my colleague Pete Holditch says:
There is no easy answer – there isn’t really a facility in J2SE or J2EE as they stand today to allow a thread to be safely and asynchronously terminated.
I’d like to see a permanent solution to this problem, even if it means implementing transactional memory in the JVM.
memory quotas – Another great way to test an application server’s robustness is to leak memory. Providing a quota system that limits (hopefully, heuristically) the ability for a component to allocate memory would prevent bad code from killing the whole server with an OOME.
deadlock management – Before you go hitting the “comment” button, note that I went through Distributed Lock Manager hell in VMS 25 years ago, so I know the pitfalls here. Nevertheless, Azul has done some great stuff (pdf) in this area, and I think it’s ripe for attention.
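To show what the cooperative approach to runaway threads looks like in practice, and why it falls short, here is a minimal sketch of a watchdog timeout built on java.util.concurrent. The ThreadWatchdog class and its method are hypothetical names; this is not how the WLS ExecuteThread timeout was implemented. The key limitation is in the comments: cancel(true) only interrupts the worker, so a task that never checks its interrupt status keeps running.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical watchdog: gives a task a time budget, then asks it to stop.
class ThreadWatchdog {
    // Runs the task on a worker thread and waits up to timeoutMillis.
    // Returns true if the task finished (normally or by throwing) in time,
    // false if it timed out or the caller was interrupted while waiting.
    static boolean runWithTimeout(Runnable task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread worker = new Thread(r, "watched-worker");
            worker.setDaemon(true); // a runaway task must not keep the JVM alive
            return worker;
        });
        Future<?> result = pool.submit(task);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // This only sets the worker's interrupt flag. The task stops
            // only if it cooperates by checking that flag.
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return true; // the task ended within the budget, by throwing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt
            result.cancel(true);
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A well-behaved loop that polls its interrupt flag (the "state variable").
        Runnable cooperative = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                // simulated endless work
            }
        };
        System.out.println(runWithTimeout(cooperative, 200)); // prints "false"
        System.out.println(runWithTimeout(() -> { }, 200));   // prints "true"
    }
}
```

Replace the cooperative loop with `while (true) { }` and the watchdog still reports the timeout, but the worker spins forever, which is the robustness gap this list item is complaining about.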
So rather than just complaining, here are some real-life problems that Java Platform, Infrastructure Edition could solve.