Supervised Machine Learning: Additional references


Textbooks

There are many general textbooks on machine learning. The one with a point of view closest to this course is

Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (2nd edition). Wiley 2001.

There is a relatively new book on online learning that covers all the basics and much of the more recent research:

Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press 2006.

The following book was our main source for kernel methods and Rademacher complexity, and has a lot more on those topics:

John Shawe-Taylor and Nello Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press 2004.

More traditional approaches to statistical learning theory can be found in

Luc Devroye, László Györfi and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition. Springer 1996.
Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning. Springer 2001.

If you want more background on convex optimisation (one of the main ingredients of Support Vector Machines), the following modern textbook is available freely online:

Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press 2004.

Tutorials

Online learning, including generalisations of the Perceptron algorithm, but not the expert framework:

Jyrki Kivinen. Online learning of linear classifiers. In S. Mendelson and A. J. Smola, editors, Advanced Lectures on Machine Learning, pages 235–257, Springer LNCS 2600, January 2003.

Summary of recent work in statistical learning theory, including Rademacher complexity:

Shahar Mendelson. A few notes on statistical learning theory. In Advanced Lectures on Machine Learning, pages 1–40, Springer LNCS 2600, January 2003.

There is also a nice introductory article on SVMs and related kernel methods:

Bernhard Schölkopf and Alexander J. Smola. A short introduction to learning with kernels. In Advanced Lectures on Machine Learning, pages 41–64, Springer LNCS 2600, January 2003.

Boosting was not covered in the course but is closely related:

Ron Meir and Gunnar Rätsch. An Introduction to Boosting and Leveraging. In Advanced Lectures on Machine Learning, pages 118–183, Springer LNCS 2600, January 2003.

Original research articles

The Weighted Majority algorithm is introduced and analysed in

N. Littlestone and M. K. Warmuth. The weighted majority algorithm, Information and Computation 108(2):212–261, February 1994.

The Aggregating Algorithm is due to Vovk. One of his articles on the topic is

V. Vovk. A game of prediction with expert advice, Journal of Computer and System Sciences 56(2):153–173, April 1998.

The special case of absolute loss is covered in great detail (including tuning with static learning rate and with a very fancy doubling trick) by

Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire and Manfred K. Warmuth. How to use expert advice, Journal of the ACM 44(3):427–485, May 1997.

Self-confident tuning comes from

Peter Auer, Nicolò Cesa-Bianchi and Claudio Gentile. Adaptive and self-confident on-line learning algorithms, Journal of Computer and System Sciences 64(1):48–75, February 2002.

For multiplicative algorithms (not covered in the course) see for example

Nick Littlestone. Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm, Machine Learning 2(4):285–318, April 1988.
Nicolò Cesa-Bianchi. Analysis of two gradient-based algorithms for on-line regression, Journal of Computer and System Sciences 59(3):392–411, 1999.

For converting an online algorithm into a batch algorithm, see

Nicolò Cesa-Bianchi, Alex Conconi and Claudio Gentile. On the generalization ability of on-line learning algorithms, IEEE Transactions on Information Theory 50(9):2050–2057, September 2004.

A nice proof for the connection between VC dimension and Rademacher complexity is given in

Matti Kääriäinen. Relating the Rademacher and VC bounds. University of Helsinki, Department of Computer Science, Report C-2004-57, 2004.

Journals, conferences and web sites

Machine Learning
Journal of Machine Learning Research
Conference on Learning Theory (COLT, formerly Workshop on Computational Learning Theory), organised by the Association for Computational Learning
International Conference on Machine Learning (ICML)
Neural Information Processing Systems (NIPS)
Support Vector Machine homepage


