ml-systems

Machine Learning for Computer Systems

View the Project on GitHub noise-lab/ml-systems

Lecture 1

Basics
- Slack
- GitHub Classroom
- Canvas/GitHub
- Logistics (October 11)
Coverage of syllabus
- Course objectives
- Topic outline
- Due dates
- What’s new this year
- Project
Lecture on course material
- Introduction to Computer Networks
- Packet Capture
Hands-On 1: Packet Capture
- Wireshark basics and installation
- Getting started with Jupyter, etc.

Lecture 2

Introduction (Slides)
- Motivating Applications
- Security
Hands-On Activities
- Packet Capture
  - Learning Objectives
    - Wireshark Setup
    - Notebook Setup
    - Packet Capture
    - Packet Capture to Pandas
    - Analysis
Security Applications (Slides, Discussion)

Lecture 3

Security Hands-On
More Motivation
- Application Quality of Experience
- Overview of Assignment 1
- Application quality hands-on (?)
Resource Provisioning Motivation (no hands-on)
Project Team Formation Time (if needed)

Lecture 4

Prof. Feamster out of town
Project office hours
Research in Networks/ML (Taveesh and Andrew)

Lecture 5

Resource allocation applications
- Video bitrate adaptation (QoE)
QoE/Service Identification Hands-On (not completed in class)

Lecture 6

Active and passive measurement
- Advantages and disadvantages of active and passive measurement
  - Infrastructure considerations
  - Measurements when you want them
  - Systems costs considerations
  - Privacy considerations
- Feature extraction from packet captures
- What is a flow? (5-tuple)
Hands-On Activity
- Packet Statistics Extraction - Flow Statistics (Manual)
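
As a rough illustration of the flow (5-tuple) idea above, the sketch below groups packets by source IP, destination IP, source port, destination port, and protocol, then computes a few per-flow statistics. The packet records, the `flow_statistics` name, and the chosen statistics are assumptions made for this example, not the course notebook's code; flows are treated as unidirectional here.

```python
from collections import defaultdict

# Hypothetical packet records (not from a real capture):
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, length_bytes)
packets = [
    (0.00, "10.0.0.1", "93.184.216.34", 51000, 443, "TCP", 60),
    (0.05, "10.0.0.1", "93.184.216.34", 51000, 443, "TCP", 1500),
    (0.10, "10.0.0.2", "8.8.8.8", 40000, 53, "UDP", 80),
    (0.30, "10.0.0.1", "93.184.216.34", 51000, 443, "TCP", 400),
]


def flow_statistics(packets):
    """Group packets by 5-tuple and summarize each (unidirectional) flow."""
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        # The 5-tuple is the flow key.
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    stats = {}
    for key, pkts in flows.items():
        times = [t for t, _ in pkts]
        sizes = [n for _, n in pkts]
        stats[key] = {
            "packets": len(pkts),
            "bytes": sum(sizes),
            "duration_s": max(times) - min(times),
            "mean_size": sum(sizes) / len(sizes),
        }
    return stats


stats = flow_statistics(packets)
for key, summary in stats.items():
    print(key, summary)
```

Treating the two directions of a connection as one flow would require normalizing the key (for example, sorting the two endpoints); that choice is left out of this sketch.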

Lecture 7

Hands-On Activity
- Packet Statistics Extraction - Flow Statistics (netML)
Data Preparation and ML Pipelines
Hands-On Activity
- Data Preparation

Lecture 8

Hands-On Activity
- Data Preparation and Model Training (#6)
ML Pipelines
- Training and testing
- Train-test split
- Cross-validation
- Hyperparameter tuning
- Evaluation metrics
Hands-On Activity
- ML-Pipeline (#7)
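
The train-test split and cross-validation items above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the helper names `train_test_split` and `k_fold_indices` are defined here for the example (they are not a library API or the hands-on notebook's code), and the folds are built by simple striding rather than by shuffling or stratifying.

```python
import random


def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle indices with a fixed seed and hold out a fraction for testing."""
    rng = random.Random(seed)
    idx = list(range(len(items)))
    rng.shuffle(idx)
    n_test = int(round(len(items) * test_fraction))
    test = [items[i] for i in idx[:n_test]]
    train = [items[i] for i in idx[n_test:]]
    return train, test


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


samples = list(range(20))  # stand-in for 20 labeled examples
train, test = train_test_split(samples)
print(len(train), len(test))

for train_idx, test_idx in k_fold_indices(10, 5):
    print("train:", train_idx, "test:", test_idx)
```

In a hyperparameter search, each candidate setting would typically be scored by averaging a metric over the folds, with the held-out test set touched only once at the end.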
Midterm topics stop here (nothing below this point is on the midterm)

Lecture 9

Hands-On Activity
- ML-Pipeline (#7)
Evaluation Metrics
- Accuracy
- Precision
- Recall
- ROC
- AUC
Supervised Learning Overview
Naive Bayes
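
To make the evaluation metrics listed in this lecture concrete, here is a small worked sketch for a binary classifier. The labels, scores, and the 0.5 threshold are invented for illustration, and the function names are this example's own. AUC is computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half, which equals the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Pairwise AUC: P(score of a positive > score of a negative)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and classifier scores for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print("accuracy ", accuracy(y_true, y_pred))
print("precision", precision(y_true, y_pred))
print("recall   ", recall(y_true, y_pred))
print("AUC      ", roc_auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds.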

Lecture 10

In-Class Midterm

Lecture 11

Linear Regression
Hands-On Activity (#10 Linear Regression)

Lecture 12

Logistic Regression
Hands-On Activity (#11 Logistic Regression)

Lecture 13

Decision Trees and Ensembles
Advantages and disadvantages of decision trees
Random Forests
- Bagging / Design
- Advantages of Random Forest over Decision Trees

Lecture 14

What is representation learning?
- Deep Learning
- Neural Networks
- Backpropagation

Lecture 15

Dimensionality Reduction
Motivation for Dimensionality Reduction
- Visualization
- Computation/Training Time
- Interpretability
- Noise Reduction/Model Robustness
Example Dimensionality Reduction Techniques
- PCA
- t-SNE
- PCA vs. t-SNE - when to use which?

Lecture 16

Clustering
- K-means
- GMM
- Hierarchical Clustering
- DBSCAN
Hands-On Activity (#15 Clustering)
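
As a minimal sketch of the K-means item above, the following plain-Python version alternates between assigning points to the nearest centroid and recomputing centroids. The 2-D points are invented (they could stand for any two per-flow features), the function name `kmeans` is this example's own, and centroids are initialized naively from the first k points, a simplification that a real implementation would usually replace with randomized or k-means++ initialization.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest_centroid(point, centroids):
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def kmeans(points, k, iterations=10):
    # Assumption: naive initialization from the first k points.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Made-up 2-D feature vectors forming two well-separated groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.8, 1.2), (9.2, 8.8)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Unlike DBSCAN or hierarchical clustering, this procedure needs the number of clusters k chosen up front.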

Lecture 17

Bit-level representation of network data (nPrint)
- Motivation
- Applications
- Challenges
Generative AI
- GANs
- Transformers
- Stable Diffusion
Reasons and motivation to use generative AI for network data
- Data augmentation
- Privacy constraints