Machine Learning for Computer Systems

Course Description

This course covers topics at the intersection of machine learning and systems, with a focus on applications of machine learning to computer systems. Topics include applications of machine learning models to security, performance analysis, and prediction problems in systems; data preparation, feature selection, and feature extraction; design, development, and evaluation of machine learning models and pipelines; fairness, interpretability, and explainability of machine learning models; and testing and debugging of machine learning models.

Machine learning for computer systems is a broad topic. Given the instructor's expertise, many of the examples this term will focus on applications to computer networking. The underlying principles, however, apply broadly across computer systems.

You can and should think of this course as a practical, hands-on introduction to machine learning models and concepts. We’ll focus on examples from networking, but you will walk away from the course with a good understanding of how to apply machine learning models to real-world datasets, how to use machine learning to help computer systems operate better, and the practical challenges of deploying machine learning models.

Syllabus

More details are in the course syllabus.

The agenda for each class meeting is in the agenda.

Schedule

Lecture	Topic	Reading	Other
1	Introduction (Packet Capture)	Ch. 1
Use Cases
2	Security (Scanning)	Ch. 2.1
3	Performance (QoE Inference)	Ch. 2.2
4	Resource Optimization	Ch. 2.3
Data From Computer Systems
5	Data Acquisition (Data Acquisition)	Ch. 3.2–3.3
6	From Data to Analysis (Feature Extraction)	Ch. 3.4
Machine Learning Pipeline
7	Data Preparation and Representation (Data Preparation)	Ch. 4.1
8	Model Training and Evaluation (Model Evaluation)	Ch. 4.2–4.3
Supervised Learning		Ch. 5
9	Non-Parametric and Probabilistic Models (Naive Bayes)
10	Linear and Polynomial Regression (Linear Regression)
11	Logistic Regression and SVMs (Logistic Regression)
12	Trees and Ensembles (Trees and Ensembles)		Midterm/Take-Home
13	Deep Learning (Deep Learning)
Unsupervised Learning		Ch. 6
14	Dimensionality Reduction (Dimensionality Reduction)
15	Clustering (Clustering)
16	Autoencoders (Autoencoders)
Generative Models		Ch. 7
17	Transformers (Large Language Models)
18	Diffusion and State-Space Models (Diffusion Models)
Bonus Topics
19	Reinforcement Learning (Automation)	Ch. 8
20	Timeseries Analysis (Timeseries)
21	Automation (Automation)	Ch. 9
22	Model Performance and Maintenance

Please come to class having done the reading.

Background Videos and Readings

The material below is strictly optional unless otherwise noted, although you may find it useful.