Course Meeting Times
Lectures: 2 sessions / week, 1.5 hours / session
A list of topics covered in the course is presented in the calendar.
Description
This introductory course gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending up with more recent topics such as boosting, support vector machines, hidden Markov models, and Bayesian networks. The course will give the student the basic ideas and intuition behind modern machine learning methods as well as a bit more formal understanding of how, why, and when they work. The underlying theme in the course is statistical inference as it provides the foundation for most of the methods covered.
Problem Sets
There will be a total of 5 problem sets, due roughly every two weeks. The content of the problem sets will vary from theoretical questions to more applied problems. You are encouraged to collaborate with other students while solving the problems but you will have to turn in your own solutions. Copying will not be tolerated. If you collaborate, you must indicate all of your collaborators.
Each problem set will be graded by a group of students with the guidance of your TAs. Each problem set will be graded in a single grading session, usually on the first Monday after it is due, starting at 5pm. Every student is required to participate in one grading session. You should sign up for grading by contacting a TA, by email or in person; signing up early increases your chances of getting your preferred grading schedule. Students who do not register for grading by the third week of the course will be assigned to a problem set by us.
If you drop the class after signing up for a grading session, please be sure to let us know so we can keep track of students available for grading. If you add the class during the term, please remember to sign up for grading.
Exams
There will be two in-class exams: a midterm midway through the term and a final on the last day of class.
Project
You are required to complete a class project. The choice of topic is up to you so long as it clearly pertains to the course material. To ensure that you are on the right track, you will have to submit a one-paragraph description of your project a month before the project is due. As with the problem sets, you are encouraged to collaborate on the project. We expect a four-page write-up about the project, which should clearly and succinctly describe the project goal, methods, and your results. Each group should submit only one copy of the write-up and include the names of all group members (a two-person group will have 6 pages, a three-person group will have 8 pages, and so on). The projects will be graded on the basis of your understanding of the overall course material (not based on, e.g., how brilliantly your method works). The scope of the project is about 1-2 problem sets.
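The page counts quoted above (4, 6, and 8 pages for one, two, and three people) suggest a simple linear rule. The sketch below assumes that "and so on" means two additional pages per extra group member; the function name `expected_pages` is made up for illustration, and course staff have the final say on length.

```python
def expected_pages(group_size):
    """Write-up length implied by the syllabus examples.

    Assumption: 4 pages for one person, plus 2 pages for each
    additional group member (extrapolated from the 4/6/8 pattern).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return 4 + 2 * (group_size - 1)


# Page counts for groups of one to three people.
for n in (1, 2, 3):
    print(n, expected_pages(n))
```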
The projects are due in Lec #23. Electronic submission is required, but we can accept only PostScript or PDF documents. The short proposal should be turned in on or before Lec #12.
The projects can be literature reviews, theoretical derivations or analyses, applications of machine learning methods to problems you are interested in, or something else (to be discussed with course staff).
Grading
Your overall grade will be determined roughly as follows:
| ACTIVITIES | PERCENTAGES | 
|---|---|
| Midterm | 15% | 
| Problem sets | 30% | 
| Final | 25% | 
| Project | 30% | 
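As a rough illustration of how the weights in the table combine, the sketch below computes a weighted average. It assumes each component is scored on a 0-100 scale, which the syllabus does not specify; the function name and the sample scores are hypothetical, and the actual grade is determined only "roughly" by these weights.

```python
# Weights copied from the grading table above.
WEIGHTS = {
    "midterm": 0.15,
    "problem_sets": 0.30,
    "final": 0.25,
    "project": 0.30,
}


def overall_score(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"midterm": 80, "problem_sets": 90, "final": 70, "project": 100}
print(round(overall_score(example), 1))
```

Because the problem sets and the project each carry 30%, together they outweigh the two exams combined (40%).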
Text
There are a number of useful texts for this course but each covers only some part of the class material.
Bishop, Christopher. Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 9780198538646.
Duda, Richard, Peter Hart, and David Stork. Pattern Classification. 2nd ed. New York, NY: Wiley-Interscience, 2000. ISBN: 9780471056690.
Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York, NY: Springer, 2001. ISBN: 9780387952840.
MacKay, David. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press, 2003. ISBN: 9780521642989. Available online.
Mitchell, Tom. Machine Learning. New York, NY: McGraw-Hill, 1997. ISBN: 9780070428072.
You are responsible for the material covered in lectures (most of which will appear in lecture notes in some form), problem sets, as well as material specifically made available and indicated for this purpose. The weekly recitations/tutorials will be helpful in understanding the material and solving the homework problems.
Recommended Citation
For any use or distribution of these materials, please cite as follows:
Tommi Jaakkola, course materials for 6.867 Machine Learning, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].
Calendar
| LEC # | TOPICS | KEY DATES | 
|---|---|---|
| 1 | Introduction, linear classification, perceptron update rule | |
| 2 | Perceptron convergence, generalization | |
| 3 | Maximum margin classification | |
| 4 | Classification errors, regularization, logistic regression | Problem set 1 out | 
| 5 | Linear regression, estimator bias and variance, active learning | |
| 6 | Active learning (cont.), non-linear predictions, kernels | Problem set 1 due | 
| 7 | Kernel regression, kernels | Problem set 2 out | 
| 8 | Support vector machine (SVM) and kernels, kernel optimization | |
| 9 | Model selection | Problem set 2 due | 
| 10 | Model selection criteria | |
| Midterm | | |
| 11 | Description length, feature selection | Problem set 3 out 3 days before Lec #11 | 
| 12 | Combining classifiers, boosting | |
| 13 | Boosting, margin, and complexity | Problem set 3 due; Problem set 4 out | 
| 14 | Margin and generalization, mixture models | |
| 15 | Mixtures and the expectation maximization (EM) algorithm | |
| 16 | EM, regularization, clustering | Problem set 4 due | 
| 17 | Clustering | |
| 18 | Spectral clustering, Markov models | Problem set 5 out | 
| 19 | Hidden Markov models (HMMs) | |
| 20 | HMMs (cont.) | |
| 21 | Bayesian networks | |
| 22 | Learning Bayesian networks | Problem set 5 due | 
| 23 | Probabilistic inference; guest lecture on collaborative filtering | Projects due | 
| Final | | |
| 24 | Current problems in machine learning, wrap up | Exams back | 
