Course catalog description: This course, which is open to all engineering and non-engineering majors, introduces students to the fundamentals of machine learning through a blend of mathematical and statistical descriptions, hands-on programming exercises, and real-world engineering problems. Additional emphasis is placed on practical aspects of machine learning systems, including ethics and bias.
Credits and contact hours: 3 credits, 160 instructional minutes per week for 14 weeks
Pre-Requisite courses: Undergraduate-level courses in probability theory and linear algebra. For probability, any of the following courses is currently considered sufficient preparation: 14:332:226 - Probability and Random Processes, 14:540:210 - Engineering Probability, 01:960:211 - Statistics I, 01:960:401 - Basic Statistics, 01:640:477 - Math Theory of Probability, or 01:198:206 - Intro to Discrete Structures II. For linear algebra, 01:640:250 - Introductory Linear Algebra is considered appropriate preparation. Because linear algebra is also covered in a variety of other courses across the university, students who acquired this background elsewhere may explain it to the instructor and request permission to take this course.
Co-Requisite courses: none
Topics Covered: Introduction to machine learning, its basic terminology, and the machine learning pipeline; feature engineering and feature/representation learning; principal component analysis; basic building blocks of machine learning algorithms; classification algorithms such as Bayes' classifier, naive Bayes' classifier, linear discriminant analysis, quadratic discriminant analysis, nearest-neighbor classifier, logistic regression, perceptron, and support vector machines; regression algorithms such as least-squares regression, ridge regression, and lasso regression; clustering algorithms such as K-means clustering and Gaussian mixture model clustering; practical aspects of machine learning systems such as underfitting and overfitting, cross-validation for parameter tuning, numerical optimization, and privacy, ethics, and bias.
Textbook(s):
- G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 1st ed., 2013 (corrected 7th printing, 2017)
- T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2nd ed., 2016