Distributed Systems : Exercise 5
Exercise 5: Monday 29.11. or Thursday 2.12.
Please note that not all of the material for this exercise exists in the 2nd edition of the course book. Look at the third set of lecture slides for definitions on optimistic and pessimistic timestamp-based methods, for example.
- Concurrency control can be lock-based, (pessimistic) timestamp-based, or optimistic (timestamp-) based.
- a) Each method assumes a certain hypothetical serial order of the transactions. Which is this order in each case?
- b) Compare the efficiency of the methods controlling serializability. Aspects to consider:
- is some method more restrictive than the others (a series of events is permitted by the method A but not by the method B)
- does some method allow more parallelism than the others
- are there other performance aspects, visible to the user, which differ?
- Three processes communicate exchanging messages. Each process Pi maintains its own vector clock Vi. All messages have a timestamp t based on the sender's clock; using t the receiver updates its own clock.
- a) Assume that as P2 receives a message from P1 the value of its clock was (5,4,2); the timestamp of the message was (7,3,5). What did P2 learn about the history of its colleagues?
- b) Show that the relation Vj[i] <= Vi[i] always holds.
- c) Show that e->e' => V(e) < V(e').
- This assignment handles the mutual exclusion problem.
- a) The central server model does not solve fulfill the "happened before" relation. Give an example.
- b) Neither does the ring-based model fulfill it. Give an example.
- c) Why does the Ricart-Agrawala model implement the "older first" principle?
- d) Process 0 wants to use resource A and process 1 wants to use resource B. Can Ricart-Agrawala's algorithm lead to deadlocks? Explain.
"Solutions"
1) Optimistic and pessimistic timestamp-based methods are rather poorly described by T&vS 2nd Edition. For supplementary material, see Answer 3 of last year's exercise 5. The presentation linked by the answer is rather good for pessimistic timestamps, whereas the lecture slides are a good source to understand the optimistic method. Of course, an even better source is the Disitributed Systems: Concepts and Design book by Coulouris, Dollimore and Kindberg. In 4th edition, the relevant pages are 545-555.
Briefly, the general idea is that Two-Phase Locking works, but it restricts concurrency / parallellism too much. A single transaction receiving a write lock prohibits a number of read-only transactions from executing. Furthermore, if the lock is acquired at the beginning of a transaction T's execution, that lock will remain at T until no more locks are to be requested, even if T has no further use for the lock. In addition, 2PL requires global state verification to detect deadlocks.
The optimistic and pessimistic methods are more lenient, since they allow operations to be interleaved more freely. The downsides are 1) more per-variable overhead in the pessimistic method and 2) a potentially resource-hungry validation phase for optimistic timestamps.
2) Pretty much straight-forward. A formal proof of c) requires induction. If somebody wants to exchange extra work for extra points, you are welcome to send me a formal proof by e-mail.
3) The idea is that some messages may be sent between processes if they don't concern the guarded critical section. However, these messages will carry the happened-before relationship as a side effect, even if unintentional. Thus:
Process A: ReqA to C; send m to B; receive r from B
Process B: receive m from A; send r to A; ReqB to C
Coordinator C: receive ReqB from B, send OK to B; receive reqA from A
Now there's clearly a happened-before relation such that ReqA => m => r => ReqB. But ReqA gets delayed and C handles ReqB first. Thus, the happened-before relation is not kept for this execution of events.
Does this make any sense? Why are there messages that are not guarded by the critical section if we still have a coordinator in effect? It is a straight-forward optimization that yields better performance, since message m and its response r have nothing to do with the critical section. In a sense, A and B can execute in a multithreaded way while waiting for Coordinator C's response (which may take a while). This can be a relaxation of the "strict" happened-before relation which allows us to break the relation for those messages that don't need ordering.
b) is exactly similar, but instead of Coordinator C think about a receiver later on in the ring and a token between the sender and the receiver. After the first message is delivered, bypassing the token, the token shifts to the receiver who has now become interested.
c) Why => How: Lamport clocks carry information about the older request, updating receivers' clocks to a later timestamp.
d) Without further assumptions no. If process 0 also holds resource B and process 1 holds resouce A, yes.