The Abstract Truth

JIE: Solving the Distributed Lock Management Problem

Posted by rbpasker on April 18, 2008

Yesterday I wrote about the pitfalls of trying to do Distributed Lock Management. Well, it turns out my friends at Terracotta have been hard at work on an awesome management tool that lets users detect the problem in a console. Nice job! I wonder, however, how easy it will be for a developer to spot the problem when more than two locks are involved in the deadlock.

What I would like to see is an automatic solution to the problem: one that detects such a deadly embrace and chooses a victim to kill. Detection would have to be heuristic: watch the locks to learn how long they are usually held, and consider only those locks that have been held longer than their normal holding time.
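A minimal sketch of that hold-time heuristic, under some loud assumptions: `HoldTimeMonitor` is a hypothetical helper (not part of Terracotta or the JDK), locks are exclusive and identified by a string id, something else calls `onAcquire`/`onRelease`, and "normal" is approximated by a moving average with an arbitrary multiplier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not a Terracotta or JDK API.
final class HoldTimeMonitor {
    // When each currently held lock was acquired (one holder per lock id assumed).
    private final Map<String, Long> acquiredAt = new ConcurrentHashMap<>();
    // Moving average of past hold times per lock id, in nanoseconds.
    private final Map<String, Long> typicalNanos = new ConcurrentHashMap<>();
    // How many times the typical hold time counts as "suspicious" (assumed tunable).
    private final double factor;

    HoldTimeMonitor(double factor) {
        this.factor = factor;
    }

    void onAcquire(String lockId, long nowNanos) {
        acquiredAt.put(lockId, nowNanos);
    }

    void onRelease(String lockId, long nowNanos) {
        Long start = acquiredAt.remove(lockId);
        if (start == null) {
            return; // release without a recorded acquire; nothing to learn
        }
        long held = nowNanos - start;
        // First sample becomes the baseline; later samples are blended in.
        typicalNanos.merge(lockId, held, (old, h) -> (long) (0.8 * old + 0.2 * h));
    }

    // Locks held right now for longer than factor * their usual hold time.
    List<String> suspects(long nowNanos) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : acquiredAt.entrySet()) {
            Long typical = typicalNanos.get(e.getKey());
            if (typical == null) {
                continue; // no baseline yet, so no basis for suspicion
            }
            if (nowNanos - e.getValue() > factor * typical) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

A suspect here is only a candidate: a lock held for a long time may just be doing slow work, which is why the result feeds a victim-selection step rather than proving a deadlock.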

In a distributed locking case, such as with Terracotta, the deadlock detector could call System.exit() in one of the deadlocked VMs, letting the other VM continue along, and the management system would automagically restart the victim VM. That wouldn’t prevent the problem from happening again 10 seconds later, but it might at least ring lots of bells so someone can come look at the problem, rather than leaving the whole cluster deadlocked. In the single-VM case, we’d have to wait for a proper solution to Thread.stop(), which I also talked about yesterday.
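The kill-the-victim step might look roughly like the sketch below. `VictimWatchdog` is a hypothetical class, and the probe that decides whether a deadlock is suspected is passed in, since how a cluster-wide detector would work is not specified here. As a stand-in, `localDeadlock()` uses the JDK's `ThreadMXBean.findDeadlockedThreads()`, which only sees deadlocks inside a single JVM, not distributed ones.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.BooleanSupplier;

// Hypothetical watchdog for illustration; not a Terracotta API.
final class VictimWatchdog {
    private final BooleanSupplier deadlockSuspected; // assumed probe, supplied by the caller
    private final Runnable killAction;               // what "kill the victim" means

    VictimWatchdog(BooleanSupplier deadlockSuspected, Runnable killAction) {
        this.deadlockSuspected = deadlockSuspected;
        this.killAction = killAction;
    }

    // Stand-in probe: reports deadlocks among threads of THIS JVM only.
    static boolean localDeadlock() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.findDeadlockedThreads() != null; // null means none found
    }

    // Convenience wiring: exit this VM and rely on an external supervisor to restart it.
    static VictimWatchdog exitingOnLocalDeadlock() {
        return new VictimWatchdog(VictimWatchdog::localDeadlock, () -> System.exit(1));
    }

    // Runs one check; returns true if the kill action was triggered.
    boolean checkOnce() {
        if (!deadlockSuspected.getAsBoolean()) {
            return false;
        }
        // Ring the bells before dying so someone comes to look.
        System.err.println("Deadlock suspected; killing this VM so its peers can proceed");
        killAction.run();
        return true;
    }
}
```

In practice `checkOnce()` would be called periodically, for example from a `ScheduledExecutorService`. Keeping the kill action injectable makes the watchdog testable without actually exiting the VM.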

The other question I have about deadlock detection is whether some of it can be done via static analysis, but this is not my area of expertise. An alternative would be to use AOP to instrument the locks. I’m sure someone has already done this.
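As a rough idea of what such instrumentation could record, here is a sketch of a lock-order graph. It is not AOP itself: `LockOrderGraph`, `beforeAcquire`, and `afterRelease` are hypothetical names for the hooks an aspect (or a hand-written lock wrapper) would call, with locks identified by an assumed string id. It flags an inconsistent acquisition order even when no deadlock has actually happened.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bookkeeping that lock instrumentation could call into.
final class LockOrderGraph {
    // Edge a -> b means "some thread acquired b while already holding a".
    private final Map<String, Set<String>> acquiredAfter = new HashMap<>();
    // Locks held by the current thread, most recent first.
    private final ThreadLocal<Deque<String>> held = ThreadLocal.withInitial(ArrayDeque::new);

    // Call before taking a lock; returns true if this acquisition
    // contradicts an ordering seen earlier (a potential deadlock).
    synchronized boolean beforeAcquire(String lockId) {
        boolean inverted = false;
        for (String h : held.get()) {
            if (h.equals(lockId)) {
                continue; // reentrant acquisition adds no ordering constraint
            }
            // If lockId already leads to h, adding h -> lockId closes a cycle.
            if (reachable(lockId, h, new HashSet<>())) {
                inverted = true;
            }
            acquiredAfter.computeIfAbsent(h, k -> new HashSet<>()).add(lockId);
        }
        held.get().push(lockId);
        return inverted;
    }

    // Call after releasing a lock.
    synchronized void afterRelease(String lockId) {
        held.get().removeFirstOccurrence(lockId);
    }

    // Depth-first search over recorded ordering edges.
    private boolean reachable(String from, String to, Set<String> seen) {
        if (from.equals(to)) {
            return true;
        }
        if (!seen.add(from)) {
            return false; // already explored this node
        }
        for (String next : acquiredAfter.getOrDefault(from, Collections.<String>emptySet())) {
            if (reachable(next, to, seen)) {
                return true;
            }
        }
        return false;
    }
}
```

Because the graph is shared across threads, one thread taking A then B and another later taking B then A is reported at the second acquisition, whether or not the two ever overlap in time.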

One Response to “JIE: Solving the Distributed Lock Management Problem”

  1. ARI ZILKA said

    GREAT IDEA! I was planning to alert the developer to a lock ordering problem even if a deadlock doesn’t occur (edit the same object but acquire a different set of locks in a different order to do it). That would be a smoking gun / simultaneous red-herring type heuristic. There are a few more as you suggest.

    but I _love_ the idea of auto-kill. We just started down this path for other reasons but this idea is killer.

    Thanks Bob!

