Distributed Systems : Workshop 4

Form groups of 2-4 people to discuss the following tasks.

1. Discuss clock synchronization in the group. Why do we need clock
synchronization, and what kinds of approaches do you know for achieving this?

2. What are logical scalar and vector clocks, and what is the motivation behind
them? (Why don't we just get our clocks set right and use that for everything?)
Find examples of applications where a logical clock would be a better choice,
and of applications where real time clocks are at their best.

3. Mutual exclusion, mostly within one computer, has been discussed at length in
the local Concurrent Programming course. What is the goal of mutual exclusion?
Name a few example situations where incorrect handling of mutual exclusion (or
not handling it at all) can cause problems.

4. Survival skills: Introduction to proving that an algorithm has some features.

a) Proof by induction as an example technique:

We assume that you have run into this before - show that something works for a
given baseline x0, and that whenever it applies to an x >= x0, it will also
apply for "x + 1" or the equivalent*.

Prove by induction (on the number of nodes) that in a distributed algorithm
where every node reports its clock reading to every other node regularly, the
number of messages exchanged in one such round is n * n (= n^2).

(Once you model the algorithm as a graph, it may be tempting to use
graph theory to argue for the result without going through induction. But
that is essentially leaning on someone else having proved facts about
the underlying graph mathematics before you.)

*) Mathematicians have made a contract that with this combination, we can prove
that a claim holds for all possible x >= x0, when x is enumerable - you cannot
apply induction to real numbers or other non-numerable groups, because it is
a tool for discrete things. But then most algorithms are that.

b) False proof by misuse of imagination:

You are peer reviewing a study that gives the following argument:

"Given that we got the expert Master's students taking the Distributed Systems
course to provide proof (see attached flawless answers) that the Ricart-Agrawala
distributed mutual exclusion algorithm works correctly, it follows that a
straightforward implementation of the algorithm suits perfectly for handling
mutual exclusion of the automated handling of control rods at our next nuclear
power plant."

Identify a few extra assumptions that the authors have made on top of the formal
proof that may lead to a nuclear disaster in a situation where it turns out they
are wrong.

(Also in formal proofs, it is important to know and preferrably explicitly
mention what is assumed. Similar to documentation of code, it makes the proofs
themselves more useful and flaws in them easier to spot.)

5. Survival skills: quick introduction to messaging with netcat

Recall earlier warmup exercises on connecting to Ukko cluster nodes. Set up an
ssh connection to two different nodes (in two different windows).

Pick an arbitrary network port number under 65 000, for example based on the
five last numbers of your student ID (if the first of them is 0 or >= 6, replace
it with something better suited). Replace this number to <yourport> in the
following commands.

On one node, start netcat ('man nc') in server (listening) mode:

nc -l <yourport>

From the other node, connect to this server:

nc <hostname> <yourport>

For example, if the server is on ukko167 and your port number is 45326, this
would be 'nc ukko167 45326'.

Once you've established the connection, type "Hello world!" and other
interesting things there. Stop the connection by pressing Ctrl-C in either the
client or the server; observe how this disconnects the other one as well.

We will return to netcat in later exercises and homework assignments, and it is
likely to come in handy in later life on this specialization line as well. If
you are familiar with telnet, the netcat client mode is not that different, but
it can be more easily used in shell scripts.