ml-systems
Machine Learning for Computer Systems
noise-lab/ml-systems
Lecture 1
Basics
Slack
GitHub Classroom
Canvas/GitHub
Logistics (October 11)
Coverage of syllabus
Course objectives
Topic outline
Due dates
What’s new this year
Project
Lecture on course material
Introduction to Computer Networks
Packet Capture
Hands-On 1: Packet Capture
Wireshark basics and installation
Getting started with Jupyter, etc.
Lecture 2
Introduction (Slides)
Motivating Applications
Security
Hands-On Activities
Packet Capture
Learning Objectives
Wireshark Setup
Notebook Setup
Packet Capture
Packet Capture to Pandas
Analysis
Security Applications (Slides, Discussion)
Lecture 3
Security Hands-On
More Motivation
Application Quality of Experience
Overview of Assignment 1
Application quality hands-on (?)
Resource Provisioning Motivation (no hands-on)
Project Team Formation Time (if needed)
Lecture 4
Prof. Feamster out of town
Project office hours
Research in Networks/ML (Taveesh and Andrew)
Lecture 5
Resource allocation applications
Video bitrate adaptation (QoE)
QoE/Service Identification Hands-On (not completed in class)
Lecture 6
Active and passive measurement
Advantages and disadvantages of active and passive measurement
Infrastructure considerations
Measurements when you want them
Systems costs considerations
Privacy considerations
Feature extraction from packet captures
What is a flow? (5-tuple)
Hands-On Activity
Packet Statistics Extraction - Flow Statistics (Manual)
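As a rough illustration of the flow definition above: a flow is commonly keyed by the 5-tuple of source IP, destination IP, source port, destination port, and protocol. The sketch below groups a few packet records by that key and computes per-flow packet and byte counts. The field names and sample packets are hypothetical, chosen only for illustration; they are not taken from the course notebooks.

```python
from collections import defaultdict

# Hypothetical packet records; the field names are illustrative assumptions.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5000,
     "dst_port": 443, "proto": "TCP", "length": 100},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5000,
     "dst_port": 443, "proto": "TCP", "length": 200},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "src_port": 6000,
     "dst_port": 53, "proto": "UDP", "length": 60},
]


def flow_key(pkt):
    """Return the 5-tuple that identifies the (unidirectional) flow of a packet."""
    return (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
            pkt["dst_port"], pkt["proto"])


def flow_stats(pkts):
    """Aggregate packet and byte counts per 5-tuple."""
    stats = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in pkts:
        entry = stats[flow_key(pkt)]
        entry["packets"] += 1
        entry["bytes"] += pkt["length"]
    return dict(stats)


stats = flow_stats(packets)
for key, value in stats.items():
    print(key, value)
```

This sketch treats each direction as a separate flow. A bidirectional variant would need to normalize the key, for example by sorting the two endpoints, which is a design choice left open here.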
Lecture 7
Hands-On Activity
Packet Statistics Extraction - Flow Statistics (netML)
Data Preparation and ML Pipelines
Hands-On Activity
Data Preparation
Lecture 8
Hands-On Activity
Data Preparation and Model Training (#6)
ML Pipelines
Training and testing
Train-test split
Cross-validation
Hyperparameter tuning
Evaluation metrics
Hands-On Activity
ML-Pipeline (#7)
Midterm topics stop here (nothing below this point is covered on the midterm)
Lecture 9
Hands-On Activity
ML-Pipeline (#7)
Evaluation Metrics
Accuracy
Precision
Recall
ROC
AUC
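As a small worked example of the first three metrics listed above, the sketch below computes accuracy, precision, and recall from confusion-matrix counts for a binary classifier. The labels and predictions are made up for illustration. ROC and AUC are omitted because they require prediction scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of true positives that the classifier found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, fp, fn, tn))   # 0.625
print("precision:", precision(tp, fp))         # 0.6
print("recall:", recall(tp, fn))               # 0.75
```

Returning 0.0 when a denominator is zero is one possible convention; other conventions (such as raising an error or returning NaN) are equally defensible.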
Supervised Learning Overview
Naive Bayes
Lecture 10
In-Class Midterm
Lecture 11
Linear Regression
Hands-On Activity (#10 Linear Regression)
Lecture 12
Logistic Regression
Hands-On Activity (#11 Logistic Regression)
Lecture 13
Decision Trees and Ensembles
Advantages and disadvantages of decision trees
Random Forests
Bagging / Design
Advantages of Random Forest over Decision Trees
Lecture 14
What is representation learning?
Deep Learning
Neural Networks
Backpropagation
Lecture 15
Dimensionality Reduction
Motivation for Dimensionality Reduction
Visualization
Computation/Training Time
Interpretability
Noise Reduction/Model Robustness
Example Dimensionality Reduction Techniques
PCA
t-SNE
PCA vs. t-SNE - when to use which?
Lecture 16
Clustering
K-means
GMM
Hierarchical Clustering
DBSCAN
Hands-On Activity (#15 Clustering)
Lecture 17
Bit-level representation of network data (nPrint)
Motivation
Applications
Challenges
Generative AI
GANs
Transformers
Stable Diffusion
Reasons and motivation to use generative AI for network data
Data augmentation
Privacy constraints