Somebody tweeted a link to “Distributed transactions in Spring, with and without XA” today, and I read it with interest because of my fascination with distributed transactions.
One pattern the author left out is how to get 2PC-like atomicity for a database and a messaging resource without using XA.
This is called “1.5PC” or “one-and-a-half phase commit.”
In a simple application that takes messages from a queue and updates a database, there are four combinations of behavior when things go wrong, two of which are acceptable and two of which are not:
- The message is still on the queue but the database has not been updated, which means the crash happened before either the database was updated or the message was removed from the queue. Both operations can safely be reissued. This is considered acceptable.
- The message is no longer on the queue, and the database is updated, which means that both operations completed properly before the crash. This is considered acceptable.
- The message is still on the queue, and the database is updated, a bad scenario that occurs when the crash happens after the database is updated but before the message is removed. Reprocessing the message would apply it twice.
- The message is no longer on the queue, and the database hasn’t been updated, a bad scenario that occurs when the message is removed from the queue first and the crash happens before the database is updated. The message is lost.
The third and fourth scenarios are really the same problem (one resource gets updated, but not the other) with the order of the resource operations reversed.
1.5PC requires that the messaging system and the database share a unique identifier. It could be a SequenceID or a TransactionID, but it has to be unique for all time. It does not have to be monotonically increasing (but it helps).
After a crash, the receiver recovers by looking at the head of the message queue and using the unique identifier to determine whether the message was already applied to the database. If the database update has been applied, the message can be safely removed from the queue. If it has not been applied, the database is updated with the contents of the message, and then the message is removed from the queue.
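The recovery check can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine, not a definitive implementation: the queue is an in-memory `deque` standing in for a real message broker, the database is SQLite, and the names `applied`, `ledger`, `apply_message`, and `process_head` are all hypothetical. The sketch also assumes the unique identifier is written to the database in the same local transaction as the business update, so the two can never disagree.

```python
import sqlite3
from collections import deque


def make_db():
    # Hypothetical schema: "ledger" holds the business data,
    # "applied" records the unique ID of every message already applied.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE applied (msg_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE ledger (msg_id TEXT, amount INTEGER)")
    return db


def apply_message(db, msg_id, amount):
    # One local database transaction: the update and the ID record
    # commit together or roll back together.
    with db:
        db.execute("INSERT INTO ledger VALUES (?, ?)", (msg_id, amount))
        db.execute("INSERT INTO applied VALUES (?)", (msg_id,))


def already_applied(db, msg_id):
    row = db.execute(
        "SELECT 1 FROM applied WHERE msg_id = ?", (msg_id,)
    ).fetchone()
    return row is not None


def process_head(db, queue):
    """Handle the message at the head of the queue; return False if empty."""
    if not queue:
        return False
    msg_id, amount = queue[0]  # peek, do not remove yet
    if not already_applied(db, msg_id):
        apply_message(db, msg_id, amount)  # database first
    queue.popleft()  # only then remove the message
    return True


if __name__ == "__main__":
    db = make_db()
    queue = deque([("m1", 10), ("m2", 5)])
    # Simulate a crash after the database update for m1
    # but before m1 was removed from the queue.
    apply_message(db, "m1", 10)
    # Recovery: m1 is recognized as applied and simply removed.
    while process_head(db, queue):
        pass
    print(db.execute("SELECT msg_id, amount FROM ledger").fetchall())
```

Because the database is always updated before the message is removed, a crash can only leave the message still on the queue, and the identifier lookup turns that case into a safe removal instead of a duplicate update.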
If you’re using Mule, you can get the same effect declaratively using its Multi-TX feature (although shame on Mule for the registration wall).
But this is only one half of the equation: it prevents duplicate application of the same message to the database, but it doesn’t prevent the message sender from putting two copies of the same message on the queue (“dupes”) or failing to put a message on the queue (“gaps”). This can be fixed in the sender with similar coordination between the source database and the message queuing system, which is left as an exercise for the reader.
(Thanks to @RossMason of MuleSoft for providing feedback on an earlier version of this post.)