Introduction to Machine Learning: Examinations

Separate examinations

To take this course by a separate examination, you must additionally either

  1. have done the required homework in some previous instance of the lecture course (for example, have at least 50% of the homework points from the Autumn 2015 course), or
  2. do a separate programming project.

If you did the required homework in Autumn 2015, the exam will automatically be treated as a renewal exam. Your homework points will carry over. Your grade will be re-calculated using the separate exam instead of the course exam, and the better score will stand.

If you did the required homework for the course earlier than Autumn 2015, please contact the lecturer (Jyrki Kivinen).

If you have never completed the required homework for the course, please contact the lecturer about the programming assignment required for option 2 before the exam you plan to attend. The deadline for turning in the assignment will be one month after the exam date.

The information below about what material you should read, what to bring to the exam etc. was originally written for the course exam in Autumn 2015, but it is also valid for all separate exams based on the Autumn 2015 course instance, which in practice means all separate exams during calendar year 2016 (unless there are later changes to the exam schedule).

What topics are covered?

The exam will be based on all material presented in lectures, homework exercises and the related parts of the textbook.

In practice, the lecture slides give a very good indication of what topics you should understand when you come to the exam. However, the slides are a bit sketchy and lack illustrations etc., so in order to actually understand the material you should also read the textbook.

The slides and the lecture descriptions contain pointers to the parts of the textbook that correspond to each part of the lectures.

 

Which parts of the textbook are required?

Prologue and Chapter 1: You should read all of this. However, try not to get stuck if some technical points seem difficult, since most of them will be explained in more detail later. The exception is the kernel trick (Example 1.9), which is also mentioned several times in later parts of the book but which we did not discuss at all in our course, so you should feel free to ignore it.

Chapter 2: You should read everything except Section 2.3. From 2.3 we only need the basic techniques explained on page 75 (empirical probabilities, Laplace correction and pseudo-counts).
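
To make that terminology concrete, here is a small sketch of my own (not code from the book): the three estimates differ only in how many pseudo-counts they add to the observed counts before taking a relative frequency. The function names and the example numbers are made up for illustration.

    def empirical_probability(count, total):
        """Plain relative frequency: count / total."""
        return count / total

    def laplace_corrected(count, total, num_classes):
        """Laplace correction: add one pseudo-count for each class."""
        return (count + 1) / (total + num_classes)

    def m_estimate(count, total, prior, m):
        """General pseudo-counts: add m pseudo-counts distributed according to a prior."""
        return (count + m * prior) / (total + m)

    # Example: 3 positives out of 10 examples, two classes.
    print(empirical_probability(3, 10))              # 0.3
    print(laplace_corrected(3, 10, num_classes=2))   # (3 + 1) / (10 + 2) = 0.333...
    print(m_estimate(3, 10, prior=0.5, m=4))         # (3 + 2) / (10 + 4) = 0.357...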

Chapter 3: You need to understand basic multiclass classification and 1-vs-rest and 1-vs-1 techniques (pages 82–83), regression (Section 3.2 except Equation (3.2) and related mathematical details) and clustering (pages 95–99).
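
As an illustration of the two multiclass reductions, here is a sketch of my own (not from the book). The least-squares linear scorer is only a stand-in for whatever binary learner you prefer; y is assumed to be a NumPy array of class labels and classes a list of the distinct labels.

    import numpy as np

    def train_linear_scorer(X, y_pm1):
        """Toy binary learner: least-squares fit of a linear score to +/-1 labels."""
        Xb = np.hstack([X, np.ones((len(X), 1))])                    # add a bias column
        w, *_ = np.linalg.lstsq(Xb, y_pm1, rcond=None)
        return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

    def one_vs_rest(X, y, classes):
        """Train one 'class c against everything else' scorer per class."""
        scorers = {c: train_linear_scorer(X, np.where(y == c, 1.0, -1.0)) for c in classes}
        def predict(Z):
            scores = np.column_stack([scorers[c](Z) for c in classes])
            return np.array(classes)[scores.argmax(axis=1)]          # highest score wins
        return predict

    def one_vs_one(X, y, classes):
        """Train one scorer per pair of classes; predict by majority vote."""
        pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
        scorers = {}
        for a, b in pairs:
            mask = (y == a) | (y == b)
            scorers[(a, b)] = train_linear_scorer(X[mask], np.where(y[mask] == a, 1.0, -1.0))
        def predict(Z):
            votes = {c: np.zeros(len(Z)) for c in classes}
            for (a, b), score in scorers.items():
                prefers_a = score(Z) >= 0
                votes[a] += prefers_a
                votes[b] += ~prefers_a
            tally = np.column_stack([votes[c] for c in classes])
            return np.array(classes)[tally.argmax(axis=1)]           # most votes wins
        return predict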

Chapter 5: Read Sections 5.0 and 5.1. Regarding ranking and probabilistic trees, you only need to understand the basic technique of how to obtain them (bottom of page 141). You should also understand reduced error pruning (Algorithm 5.3, discussed on pages 142–143).

Chapter 6: Read Sections 6.0, 6.1 and 6.2 except for the subsections about ranking and probability estimation. For ranking and probability estimation, it is sufficient to understand the basic trick shown in Example 6.4.

Generally, in Chapters 5 and 6 you can ignore the detailed discussion of ROC curves for trees and rules, which is a somewhat more advanced approach than the one we considered in our course.

Chapter 7: You should read all of Sections 7.0 and 7.1. From Section 7.2, read everything except the dual perceptron (Algorithm 7.2). Section 7.3 is not really part of the course, although we did discuss the basic ideas of SVMs very briefly. The rest of the chapter is not included.

Chapter 8: Read everything except Section 8.6 (kernels again).

Chapter 9: Read Sections 9.0, 9.1 and 9.2 in their entirety. From Section 9.3 you can ignore the mathematical derivations on pages 284–285, but you should read the rest.

From the rest of the textbook, we covered only isolated pieces:

  • matrix decompositions (pages 324–326): we did discuss this during the last week of lectures, but only very briefly, and it will not be in the exam.
  • cross-validation (pages 349–350); a minimal sketch is given after this list
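
For reference, here is a minimal sketch of k-fold cross-validation as I would summarise it; it is not code from the book, and train_fn and error_fn are placeholders for whatever learner and error measure you want to evaluate.

    import numpy as np

    def k_fold_cross_validation(X, y, k, train_fn, error_fn, seed=0):
        """Estimate test error by splitting the data into k folds,
        each of which serves once as the test set."""
        rng = np.random.default_rng(seed)
        indices = rng.permutation(len(X))            # shuffle before splitting
        folds = np.array_split(indices, k)
        errors = []
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            model = train_fn(X[train_idx], y[train_idx])
            errors.append(error_fn(model, X[test_idx], y[test_idx]))
        return np.mean(errors)                       # average error over the k folds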

 

What about the lecture slides?

Everything in the lecture slides is required material, except for the following few slides that we skipped:

  • slides 58–59 (about chess)
  • slides 124–126 about the statistical learning model: this is useful for understanding the notion of generalisation, but if you find this type of mathematics unfamiliar, you don't need to worry about this
  • slides 139–140 about bias-variance for regression (this is related to Equation 3.2 in the textbook): in the context of our course this is an unnecessary technical detail which we skipped in lectures
  • slides 151 and 152 give a very quick sketch of Bayesian model selection and MDL and can be ignored here
  • slides 306–317 about matrix decompositions are extra material that we did discuss briefly in lectures during the last week, but this will not be in the exam

In particular, you should note some parts of the lecture slides that cover topics that are spread over several parts of the textbook and are perhaps treated in less detail there:

  • Bayes optimality (slides 89–101)
  • evaluating model performance (slides 118–152)
  • Perceptron convergence theorem (slides 214–215): the exact formulation is not too important for our purposes, but the notion of margin on which it is based is central also for understanding SVMs and other similar techniques
  • pocket algorithm (slides 218–220) is not in the textbook, but you need something like it to actually use the Perceptron on real-world data, which generally is not linearly separable (a minimal sketch is given after this list)
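
To make the pocket idea concrete, here is a minimal sketch assuming labels in {-1, +1}. It implements one common variant (keep the weight vector with the lowest training error seen so far); the version on the slides may differ in details.

    import numpy as np

    def pocket_perceptron(X, y, epochs=100, seed=0):
        """Perceptron updates, but keep ('pocket') the best weight vector seen so far.
        X: (n, d) array of features, y: array of labels in {-1, +1}."""
        rng = np.random.default_rng(seed)
        Xb = np.hstack([X, np.ones((len(X), 1))])    # absorb the bias term
        w = np.zeros(Xb.shape[1])
        best_w, best_errors = w.copy(), np.inf
        for _ in range(epochs):
            for i in rng.permutation(len(Xb)):
                if y[i] * (Xb[i] @ w) <= 0:          # misclassified: perceptron update
                    w = w + y[i] * Xb[i]
                    errors = np.sum(y * (Xb @ w) <= 0)
                    if errors < best_errors:         # better than anything seen so far?
                        best_w, best_errors = w.copy(), errors
        return best_w                                # return the pocketed weights, not the last w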

 

What to bring to the exam?

As with all the exams at the department, you should bring writing materials (pencil etc. but not your own paper) and some means of identification (student card, passport etc.).

Additionally, for the exams of this course you may bring a "cheat sheet": one hand-written A4 sheet on which you can write whatever information you think might be useful in the exam, using both sides if you wish. Even if you don't think you'll really need a cheat sheet in the exam, you may wish to create one just to help clarify to yourself what you think the important things are. You are not allowed to bring any other written material.

You will not need, and should not bring, a calculator.  Use of any electronic devices, including of course mobile phones, is prohibited.

 

What will be asked in the exam?

In the exam, you may be asked to

  • briefly define and explain key concepts and terms
  • explain algorithms, techniques and other broader topics, possibly answering "what," "why" and "how" questions
  • simulate an algorithm on a (very small) data set
  • make basic mathematical calculations and derivations
  • do something else relevant to the content and learning objectives of the course.

 

Sample exams

Some old exams are linked below to give you an idea of what to expect. Note that the content of the course has changed a bit, and in particular we have a new textbook, so the old questions may not all be directly applicable to the current version of the course.