Distributed Systems: Exam and grading

The course exam (PDF).

General notes on exam aesthetics:

It's hard to have time for everything in the exam, but it is commonly accepted that when you sit down, you can draw margins on your answer papers before the exam itself starts. The following are preferences that standardize answering style to the Finnish matriculation examination convention; they don't affect grading, but they reduce grader headache and cause positive feelings.

1) Please use the sheets as if each were an individual (preferably western-style) book. That is, leave the fold/crease on the left side, start on the "cover", then use both inside pages (as 2 individual pages), and end with the back cover. Then grab another paper. Rinse and repeat.

2) Please leave a margin of about 3 squares/columns on the outside edge, and on the fold edge leave the square/column that gets messed up by the fold. Use the outside-edge margin to mark which task you're starting. (It also leaves space for grading notes.) You can draw these margins for extra neatness. :)

3) Please leave a space of about 4 rows on the "cover" page of each sheet where you put your name and other such data. On the front sheet, this space is helpful for writing down the grading as well. Particularly in bachelor-level exams, students are even encouraged to draw a little (K+1) x 2 grid there for points per task (K tasks) and their sum (+1). This helps keep grading notes ordered.

4) Writing on every line is ok. Writing a lot more than one line per grid row gets a biiiit tight. Dense handwriting is harder for us to read than large handwriting.

5) If you have time left over after finishing your answers (not applicable in all exams ;)), it pays to spend it reading through your answers to check that the text makes sense to an external reader too.

General notes on grading:

The exam was squeezed pretty tight. It was the first computer science exam I had witnessed in the new 2.5-hour system and the first one I'd designed. While I was prepared for this and checked with a colleague that the whole was still sane, I probably could have gotten away with fewer questions. But in the end it doesn't matter that much if you have to give incomplete answers due to limited time: I adjusted the grading based on reading the answers people had had time to write. Exams are, unfortunately, a flawed measurement device to begin with. After publishing the course results I know the grades looked pretty "usual" at a high level, but the exam is an island in the sense that we honestly have no point of reference.

Due to the limited time, we had no need to punish people for wandering into irrelevant topics in their exam answers: the schedule punishes for it already. The typical grading convention at the department seems to be that if you bring up the right things, you get points for them, and we don't care about the rest. Your mileage may vary, but that was pretty much what we did here too.

Communication errors in exam situations are common, because an exam is an abnormal and reasonably high-stress situation. It is always a problem to read back through the answers to what happened in the mind that produced them.

If I had to name one thing that caused the most "point loss", it would be the infamous missing "why", i.e. the explanation that turns a list of buzzwords into evidence of understanding what they mean. For example, if you say "replication improves performance", just add the couple of words "because/when the load is distributed between more than one node". The latter is the important part; the former is a summary that may not mean anything. As a similar real-world example, one course feedback comment said, "Hope it will be better." (and nothing else). From this, an external reader can see that something was involved with good and bad ("improves performance"/hope for improvement), but not where the good/bad come from or, in fact, what they are. It is hard to tell which words are considered to carry a lot of information and which are so general that they can be understood in many ways; this is one of the reasons why one-sided monological exam-answer writing is such an error-prone measurement technique.

In this case, "replication improves something" was already in the exam question, so the only meaningful word was 'performance', and it's a bit ambiguous what performance means here (does replication speed up the Internet or the processing, improve the number of clients served in a given time, make performance more even (reliable/predictable), or even reduce errors or improve content quality, as in a more abstract sense of performance, etc.). So when in doubt, write it out. When no doubt is present, blame differences in thinking. Communication is hard!

The lists below are checking aids: if some single item went unmentioned but the answer otherwise expressed excellent understanding, for example, common sense was applied. The purpose of publishing this list is to give you some idea of what we were looking for in the answers, but we are generally not able to follow a hard-coded scale with tasks that can take the answerer to many equally valid places.

Some task-specific notes:

1. Goals and challenges of distribution

- 1 point for a sensible definition, 1 point for having some clue about the
  effect, and a third point for overall demonstrating understanding in the
  answer; 2 is reasonable for answering well but only very specifically about
  the effect in a given application area, because I explicitly asked for a
  definition too.

a) heterogeneity: different computers, operating systems, browsers,
network connections
- a web service designer has to take the differences into account
- for example a user interface suiting different screen sizes, or usability
  of email over lower-bandwidth network connections, or whatever

b) openness: adding new nodes/implementations/etc is
possible/straightforward - achieved generally through standards, public
interface definitions, loose coupling etc.
(the difference between 2 and 3 points: thinking mostly about how openness
is achieved and not thinking at all about what the big idea behind the
means, e.g. standards, is - i.e. being "open" to third parties, new nodes
being able to join, etc)
- P2P designer: can new nodes join the network while it's in operation?
  do they all have to have the same client program?
- a lot of people referred to open source here; by itself it's not enough,
  but if you interpret openness to mean public inspectability, we're getting
  somewhere
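
As a concrete illustration of what public interface definitions buy: any
third party that implements the agreed interface can join, and its internals
are its own business. A minimal Python sketch with made-up names (not from
the course material):

    from abc import ABC, abstractmethod

    class StorageNode(ABC):
        """Hypothetical published interface. Openness means a new,
        independently written node only has to honor this contract."""

        @abstractmethod
        def put(self, key: str, value: bytes) -> None: ...

        @abstractmethod
        def get(self, key: str) -> bytes: ...

    class InMemoryNode(StorageNode):
        """One possible third-party implementation among many."""

        def __init__(self):
            self._data = {}

        def put(self, key, value):
            self._data[key] = value

        def get(self, key):
            return self._data[key]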

c) scalability: can handle increase in load / users / pointers / resources
(/ nodes).
- name service: may need to handle a high number of names, queries, users

2. Reliable communication

a) Multicast: one-to-n but not to all; unicast is 1-to-1, broadcast
one-to-all. (1-2 pt) The difference from n unicasts is that the
infrastructure (middleware/OS/whatever) reduces the amount of complexity the
sender has to handle (particularly: it can send one message and the message
"magically splits" into n, like email within an organization). (0-1 pt;
vague coverage of this aspect is acceptable.)

A justified idea of why multicast is either so hellish to achieve at large
scale that it is more an academic ideal than something in use on the
Internet, OR the idea that it is actually supported in the real world in
circles small enough to be usable in special cases; either way, some mention
of whether it is actually in use (1 pt).
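
For the curious, IP multicast really is usable on a small scale, e.g. within
a LAN; here is a minimal Python sketch (the group address and port are
made-up placeholders):

    import socket
    import struct

    GROUP, PORT = "224.1.1.1", 5007   # placeholder multicast group and port

    # Sender: one sendto() reaches every subscribed receiver; the network,
    # not the sender, does the duplication (contrast with n unicast sends).
    out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    out.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    out.sendto(b"hello, group", (GROUP, PORT))

    # Receiver: joining the group is what makes the "magic split" arrive here.
    inc = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    inc.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    inc.bind(("", PORT))
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    inc.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    data, addr = inc.recvfrom(1024)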

- Generalizing multicast to all middleware/infrastructure-based "send to x
  recipients" is a reasonably valid but not compulsory interpretation of the
  course teaching. It varied in practice.

b) main challenges (6 points):
- messages may be lost on the way (or get delayed, corrupted, and other such
  small mishaps)
  - solution: some kind of acknowledgement scheme (arguments about its
    scalability are optional and can cover for missing notes on other parts;
    see the sketch after this list)
- group members may change (particularly crash)
  - solution: group membership maintenance (heartbeat protocols etc.
    have not been a part of the official slides, just mentioned in passing
    in lectures and possibly in exercises); *some* idea that trying to
    detect a group member leaving would be needed
- consistency: finding some (appropriate level of) agreement about what
  messages have been sent/received and in what order
  - solution: some idea about virtual synchrony (i.e. group changes and
    messages are synchronized so that when a group member leaves/joins,
    either all of a stable group receive the pending messages or no one
    considers them received), and some idea about ordering, e.g.
    timestamps. (Extra points for remembering logical clocks. ;))
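
The simplest acknowledgement scheme is stop-and-wait with retransmission; a
sketch of the idea in Python (the b"ACK" convention, retry count, and
timeout are assumptions for illustration):

    import socket

    def reliable_send(sock, payload, addr, retries=5, timeout=1.0):
        """Resend until the receiver acknowledges, or give up. Waiting for
        an ack from every member like this scales poorly to large groups."""
        sock.settimeout(timeout)
        for _ in range(retries):
            sock.sendto(payload, addr)
            try:
                reply, _ = sock.recvfrom(16)
                if reply == b"ACK":
                    return True      # receiver confirmed delivery
            except socket.timeout:
                pass                 # lost message or lost ack: retry
        return False                 # receiver unreachable or crashed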

- The last point was seen rather little in practice.
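
For reference, the logical-clock part can be as small as this Lamport clock
sketch:

    class LamportClock:
        """Minimal Lamport logical clock: orders events consistently with
        'happened-before' without synchronized wall clocks."""

        def __init__(self):
            self.time = 0

        def tick(self):
            """Advance for a local event (including sends); return the stamp."""
            self.time += 1
            return self.time

        def on_receive(self, msg_time):
            """Merge the timestamp carried by an incoming message."""
            self.time = max(self.time, msg_time) + 1
            return self.time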

3. Distributed decision making
a) e.g. distributed mutual exclusion, coordinator/initiator elections,
   distributed commit of a transaction, distributed checkpointing...
   (a point per essentially sane example)
b) (9 points doesn't split very evenly between two algorithms, alas; use 4
   per algorithm and the last point for overall sanity or something)
   - e.g. distributed mutex: Ricart-Agrawala (the name alone is worth just
     1 point, and not remembering the name doesn't matter)
     - a node asks everyone, waits for permission from everyone, and only
       proceeds once it gets it, i.e. none of the others are using or
       waiting for the resource (2 points; the extra point is available for
       also mentioning the anti-starvation measure that requests are
       timestamped and, after a release, handled in order... see the sketch
       below)
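
The decision logic of Ricart-Agrawala fits in a short sketch; message
transport and blocking are hand-waved away, and the names here are mine, not
from any particular library:

    class RicartAgrawalaNode:
        """Sketch of the mutual-exclusion decision rules only."""

        def __init__(self, node_id, clock):
            self.id = node_id
            self.clock = clock        # e.g. the Lamport clock sketched above
            self.requesting = False
            self.my_request = None    # (timestamp, node_id) of our request
            self.deferred = []        # peers we answer only after releasing

        def request_cs(self, send_to_all):
            self.requesting = True
            self.my_request = (self.clock.tick(), self.id)
            send_to_all(("REQUEST", self.my_request))
            # ...block here until a REPLY has arrived from every other node,
            # then enter the critical section...

        def on_request(self, their_request, reply_to):
            # Grant immediately unless our own pending request is older;
            # (timestamp, id) tuples break ties deterministically, and
            # answering deferred requests after release prevents starvation.
            if self.requesting and self.my_request < their_request:
                self.deferred.append(reply_to)
            else:
                reply_to(("REPLY", self.id))

        def release_cs(self):
            self.requesting = False
            for reply_to in self.deferred:   # unblock everyone we deferred
                reply_to(("REPLY", self.id))
            self.deferred.clear()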

- I was surprised that a couple of answers used two-phase commit as an
  example solution to two rather different problems, but the application
  was fun.
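
For reference, the coordinator side of two-phase commit is small enough to
sketch; prepare/commit/abort below are assumed participant stubs, not any
particular library's API:

    def two_phase_commit(participants):
        """Commit only if every participant votes yes in phase 1."""
        # Phase 1 (voting): ask everyone to prepare.
        votes = [p.prepare() for p in participants]   # hypothetical stubs
        # Phase 2 (completion): unanimous yes -> commit, otherwise abort.
        if all(votes):
            for p in participants:
                p.commit()
            return "committed"
        for p in participants:
            p.abort()
        return "aborted"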

- In some cases there was a very practical example, like booking flight
  tickets, but no additional explanation to convince us that the student
  knows what kind of more abstract "problem class" it belongs to; this
  kind of answer was hard to grade, and it probably cost points. Try to
  demonstrate understanding of the course topics in particular.

4. Replication and consistency

a) + scalability (can handle higher loads)
   + performance (can have servers closer to the user -> better throughput,
     lower latency, etc.)
   + fault tolerance (the entire service is not down if one copy is down)
   (weight: 4-5 points)
   - consistency problems: e.g. two users see different versions of a video,
     or the same user may not see the same version from different locations,
     or whatever (weight: 1-2 points, a good general view is the priority)
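
The fault-tolerance and load-spreading points are easy to make concrete;
replicas below is a hypothetical list of client stubs whose .get() raises
ConnectionError when that copy is down:

    import random

    def read_from_replicas(replicas, key):
        """Try the copies in random order: load spreads across them, and
        the service survives as long as any one copy stays up."""
        for replica in random.sample(replicas, len(replicas)):
            try:
                return replica.get(key)   # first healthy copy wins
            except ConnectionError:
                continue                  # this copy is down: try the next
        raise RuntimeError("all replicas unavailable")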

b) sequential: all updates are seen in the same order across nodes
   entry: data is made consistent before entry into a critical section
          (usually before modification in particular)
          - beyond that, it can be in whatever state, we don't care
   weak: data can be in whatever state unless we explicitly ask for it to
         be synchronized
   eventual consistency: data can be in whatever state, but updates will
         eventually make their way through (and the system reaches a
         consistent state if left in peace for long enough, essentially)
   (3-4 points: 3 if only minimal summaries from the lecture slides are
   given)
   - picking a consistency model and a couple of words on why it makes
     sense: 2 points,
     except if this explanation extends the minimal summaries above, in
     which case it can cover the points not gained there.
     - e.g. eventual consistency is enough, because it takes minimal effort
       to implement and a video-sharing service like YouTube doesn't hold
       very critical data
     - or weak consistency is good: you can synchronize when needed, but it
       doesn't impact performance otherwise
     - or sequential consistency is good, because it doesn't take that much
       effort when generally only one user updates any given video... (ok,
       this may be stretching it)
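
To make "updates eventually make their way through" concrete, here is a
tiny last-writer-wins merge sketch, which is one simple way (not the only
one) to get eventual consistency; the data layout is made up:

    def merge_lww(local, remote):
        """Merge two replica states, each a dict of key -> (timestamp,
        value), keeping the newer write per key (last-writer-wins)."""
        merged = dict(local)
        for key, (ts, value) in remote.items():
            if key not in merged or merged[key][0] < ts:
                merged[key] = (ts, value)
        return merged

    # Replicas that saw updates in different orders converge after a merge:
    a = {"video42": (5, "title v2")}
    b = {"video42": (3, "title v1"), "video7": (4, "new upload")}
    assert merge_lww(a, b) == merge_lww(b, a)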