Distributed Systems: Homework assignment 2

You can use bash scripting, Java, or Python. C is acceptable, but we recommend one of the three languages above. Include a document that explains how to run your program and describes the execution process. The default submission format is an archive. Submission and more detailed instructions are on Moodle: https://moodle.helsinki.fi/course/view.php?id=8142

Implement a simple coordinated checkpoint algorithm (see 8.6) or,
optionally, the Chandy-Lamport distributed snapshot algorithm (see the
slides). If you use Chandy-Lamport, assume the nodes are organized in
a chain: o-o-o-o.

The implementation should be written from the point of view of a single
node that is asked to take a snapshot: it should wait for a message
telling it to take a snapshot, save its state when that message arrives,
and then either send a reply (coordinated) or tell the next node to take
a snapshot (Chandy-Lamport).
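
The node's behaviour described above could be sketched in Python roughly
as follows. This is only an illustration under assumptions, not a
reference solution: the message texts TAKE_SNAPSHOT and SNAPSHOT_DONE,
the function names, the use of plain TCP sockets, and the port and
next_node parameters are all made up for this example. The
Chandy-Lamport branch is reduced to "save, then pass the request on",
which matches the chain setup but leaves out channel recording.

```python
import socket

SNAPSHOT_REQUEST = "TAKE_SNAPSHOT"  # hypothetical request text
SNAPSHOT_DONE = "SNAPSHOT_DONE"     # hypothetical reply text


def handle_message(message, save_state, mode="coordinated"):
    """React to one incoming message.

    Returns (text, target). target is "sender" for a coordinated reply,
    "next" when the request should be passed down the chain, and
    (None, None) is returned if the message is not a snapshot request.
    """
    if message.strip() != SNAPSHOT_REQUEST:
        return None, None
    save_state()  # take the local checkpoint first
    if mode == "coordinated":
        return SNAPSHOT_DONE, "sender"
    return SNAPSHOT_REQUEST, "next"


def serve_once(port, save_state, mode="coordinated", next_node=None):
    """Wait for one TCP connection, handle its message, then return.

    next_node is a (host, port) tuple, or None for the last node in the chain.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("", port))
        server.listen(1)
        conn, _addr = server.accept()
        with conn:
            message = conn.recv(1024).decode()
            text, target = handle_message(message, save_state, mode)
            if target == "sender":
                conn.sendall((text + "\n").encode())
            elif target == "next" and next_node is not None:
                with socket.create_connection(next_node) as out:
                    out.sendall((text + "\n").encode())
```

Keeping the decision logic in handle_message separate from the socket
code makes it possible to test the node's reaction without a network.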

(For the purpose of this exercise, we can assume incoming messages do
not need to be stored as a part of the checkpoint.)

We will reuse this code in homework assignment 4, so we recommend
implementing it in a scripting language that can call external programs
(such as nc, hostname and uptime).

The state of the node is saved to a file; in this exercise it should
include four variables:
- two integer variables, 'number' and 'result',
- the node's hostname (a string) and
- the current uptime and load of the node (also a string).

You can get the hostname and uptime with the commands 'hostname' and
'uptime' respectively. (Alternatively you can use a configuration file
to specify fixed values for these.)
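
As a rough Python sketch of the above, the two commands can be called
through the standard subprocess module. The helper names command_output
and collect_node_info are invented for this example, and the config
dictionary merely stands in for whatever configuration file you might
use; here its values are used only when a command cannot be run.

```python
import subprocess


def command_output(command, fallback):
    """Run an external command and return its output as a stripped string.

    If the command is missing or fails, return the fallback value instead.
    """
    try:
        result = subprocess.run(
            [command], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return fallback
    return result.stdout.strip()


def collect_node_info(config=None):
    """Collect the hostname and the uptime/load string of this node.

    config is a hypothetical dict of fixed values used as fallbacks.
    """
    config = config or {}
    return {
        "hostname": command_output("hostname", config.get("hostname", "unknown")),
        "time": command_output("uptime", config.get("time", "unknown")),
    }
```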

Test your algorithm on two different Ukko nodes. The same code should
work independent of which computer you run it on, and you should get two
different saved states in two different files as a result.

The state should be saved to a file named <hostname>-<date>.txt (for
example ukko034-01102012.txt) as key-value pairs that you can use to
restore the state later.

cat ukko034-01102012.txt :

number = 4
result = 5
hostname = ukko034
time = "16:48:58 up 1 day,  3:03,  5 users,  load average: 0.66, 0.24, 0.18"
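
One possible way to write and read back a file in this format is
sketched below in Python. The function names are made up for the
example, and two details are assumptions rather than requirements: the
date part of the file name is taken to be DDMMYYYY, and strings are
quoted only when they contain whitespace, to mirror the sample file.

```python
import time


def state_filename(hostname, when=None):
    """Build <hostname>-<date>.txt; the DDMMYYYY date format is an assumption."""
    return "{}-{}.txt".format(hostname, time.strftime("%d%m%Y", when or time.localtime()))


def save_state(path, state):
    """Write the state dict as 'key = value' lines, one per variable."""
    with open(path, "w") as f:
        for key, value in state.items():
            if isinstance(value, str) and any(c.isspace() for c in value):
                f.write('{} = "{}"\n'.format(key, value))  # quote strings with spaces
            else:
                f.write("{} = {}\n".format(key, value))


def load_state(path):
    """Read 'key = value' lines back into a dict, restoring integers."""
    state = {}
    with open(path) as f:
        for line in f:
            key, sep, value = line.partition("=")
            if not sep:
                continue  # skip blank or malformed lines
            key, value = key.strip(), value.strip()
            if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
                state[key] = value[1:-1]
            else:
                try:
                    state[key] = int(value)
                except ValueError:
                    state[key] = value
    return state
```

Note that with this simple parser an unquoted value made up only of
digits is always restored as an integer, so a purely numeric hostname
would need to be quoted or handled separately.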

Clarification 29 Oct 2012: Several students have asked what the two integer variables and the '(up)time' variable are for and what they should contain. The point of this exercise is to show that you can save the state of both string and basic integer variables; the content of the variables can be as above or something else, we don't really care. Using these specific variables makes the script directly applicable in homework assignment 4, where we first run another protocol and then use this script to store its result.

Also, it's ok to set up a script that has to be manually started on each of the different Ukko nodes - in other words, you don't need a top-level script that starts scripts on _different_ computers for this exercise.