A Hands-on Workshop series in Machine Learning
Timing: 3-5 pm Pacific Time on Tuesdays and Thursdays from Oct 20th, 2022 to Nov 10th, 2022 (7 sessions in total)
Where: Shanahan 3485 (Grace Hopper Conference room) or remotely via Zoom (link will be shared when you register)
The workshop series focuses on the practical aspects of machine learning, using real-world datasets and tools from the Python ecosystem. It is aimed at complete beginners in machine learning who are already familiar with Python.
You will learn a minimal but highly useful set of tools for exploring datasets with pandas, followed by a gentle introduction to neural networks. You will also learn various architectures, such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and transformer-based models, and apply them to real-world text and image datasets.
Please register using this Google form to save your seat. It is highly recommended to attend the workshop in person as you will be coding in groups and participating in discussions, but there is an option to join remotely via Zoom. The Zoom link and the recordings for each session will be shared with the registered participants. Please have a look at the topics to be covered below. You are free to attend some of the sessions while skipping others if you are already familiar with certain topics.
The learning material and solutions for each session will be made available in this GitHub repository.
Prerequisites:
- Some familiarity with Python
- Basics of Probability and Statistics
- Basics of Calculus
- Basics of Linear Algebra
Here is an optional quiz to brush up on your Python skills before the workshop.
Please download and install Anaconda with Python 3.8 on your laptop ahead of the workshop.
Topics to be covered:
1. Data Manipulation using pandas (Thursday, Oct 20th, 2022)
- Pandas dataframes as a data structure
- Indexing and slicing data frames
- Data exploration
- Basic statistical plots using
- Detecting and filling missing values
- Regular expressions for text mining
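As a taste of the first session, here is a minimal sketch of detecting and filling missing values and of boolean indexing in pandas. The toy data and the column names `city` and `temp_f` are made up for illustration and are not taken from the workshop material.

```python
import pandas as pd

# Hypothetical toy data; the values and column names are illustrative only.
df = pd.DataFrame({
    "city": ["Claremont", "Pomona", "Claremont", "Pomona"],
    "temp_f": [72.0, None, 68.0, 75.0],
})

# Detect missing values: count the NaN entries in each column.
missing_counts = df.isna().sum()

# Fill the missing temperature with the mean of the observed values.
filled = df.assign(temp_f=df["temp_f"].fillna(df["temp_f"].mean()))

# Boolean indexing with .loc: temperatures for one city only.
claremont = filled.loc[filled["city"] == "Claremont", "temp_f"]

print(missing_counts)
print(filled)
print(claremont.tolist())
```

Filling with the column mean is only one possible strategy; the median or a per-group value may suit other datasets better.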
2. More on pandas and Logistic regression (Tuesday, Oct 25th, 2022)
- More on pandas: Groupby operations
- One-hot encoding for categorical features
- Perceptron - the simplest neural network
- An exercise on implementing AND and OR gates using a perceptron by trial and error
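To give a sense of the perceptron exercise, here is a minimal sketch in plain Python. It assumes a step activation that outputs 1 when the weighted sum plus bias is positive. The weights and biases shown are one of many valid choices found by hand, and the names `perceptron`, `AND_PARAMS`, `OR_PARAMS`, and `truth_table` are illustrative rather than part of the workshop material.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


# Hand-picked parameters (assumptions): many other values work equally well.
AND_PARAMS = {"weights": (1.0, 1.0), "bias": -1.5}
OR_PARAMS = {"weights": (1.0, 1.0), "bias": -0.5}


def truth_table(params):
    """Evaluate the perceptron on inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [perceptron((a, b), **params) for a in (0, 1) for b in (0, 1)]


print("AND:", truth_table(AND_PARAMS))
print("OR: ", truth_table(OR_PARAMS))
```

Changing only the bias turns the AND gate into an OR gate here, which is the kind of observation the trial-and-error exercise is likely to surface.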
3. Natural language processing (NLP) concepts (Thursday, Oct 27th, 2022)
- Natural language processing (NLP) concepts: Bag Of Words (BOW) model, TF-IDF vectorizer, using word n-grams, etc.
- Application of Logistic Regression and NLP concepts using scikit-learn on the IMDb dataset to predict the sentiment (positive or negative) of the movie reviews
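The Bag Of Words idea and word n-grams can be sketched from scratch with the standard library. This is not the scikit-learn API that the session uses; the helper names `word_ngrams` and `bag_of_words` and the sample sentence are made up for illustration, and tokenization is simplified to lowercasing and splitting on whitespace.

```python
from collections import Counter


def word_ngrams(text, n=1):
    """Split text on whitespace and return its word n-grams as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, n=1):
    """Count how often each word n-gram occurs, ignoring word order beyond n."""
    return Counter(word_ngrams(text, n))


review = "a great movie with a great cast"  # made-up example sentence
unigrams = bag_of_words(review)
bigrams = bag_of_words(review, n=2)

print(unigrams)
print(bigrams)
```

A TF-IDF vectorizer starts from counts like these and down-weights terms that appear in many documents.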
4. A Gentle Introduction to Neural Networks (Tuesday, Nov 1st, 2022)
- Neural networks: Building the intuition of the architecture and the iterative learning process
- Multi-Layer Perceptron: Forward and Backward propagation
- A primer on
- Training a neural network on the IMDb dataset for sentiment analysis
5. Fine-tuning Neural Networks (Thursday, Nov 3rd, 2022)
- Vanishing gradients and exploding gradients in deep networks
- Activation functions
- Weight Initialization
- Regularization - L1 and L2, Dropout
- Tuning other hyper-parameters such as learning rate, number of epochs, etc.
- Exploring the TensorFlow Playground
- Application of the above concepts on the IMDb dataset for training a neural network for sentiment analysis
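The link between activation functions and vanishing gradients can be illustrated with a few lines of plain Python. This is a simplified sketch: it multiplies only the activation derivatives across a hypothetical 10-layer chain and ignores the weight matrices, which also scale the gradient in a real network.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 10  # hypothetical number of layers

# Best case for the sigmoid: every layer sits at x = 0, where the slope is largest.
sigmoid_chain = sigmoid_grad(0.0) ** depth

# ReLU with positive pre-activations passes the gradient through unchanged.
relu_chain = relu_grad(1.0) ** depth

print(f"sigmoid chain over {depth} layers: {sigmoid_chain:.2e}")
print(f"ReLU chain over {depth} layers:    {relu_chain:.1f}")
```

Even in the sigmoid's best case, the product of derivatives shrinks geometrically with depth, which is one reason the choice of activation function matters in deep networks.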
6. Convolutional Neural Networks (Tuesday, Nov 8th, 2022)
- Image preprocessing for neural networks
- Feature extraction using convolution filters
- Convolutional Neural Network (CNN) architecture
- Training a CNN model on the CIFAR-10 dataset
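Feature extraction with a convolution filter can be sketched in plain Python. The tiny image and the vertical-edge kernel below are made up for illustration, and `convolve2d` is a hypothetical helper, not a library function. As in most deep learning libraries, the kernel is applied without flipping (technically a cross-correlation), with no padding and a stride of 1.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 3x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A 2x2 filter that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is nonzero only in the column where the dark and bright regions meet. A CNN learns the values of such filters from data instead of having them written by hand.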
7. Recurrent Neural Networks and Transformer models (Thursday, Nov 10th, 2022)
- Recurrent Neural Networks (RNN)
- Transformer models
- Mini-project: Building a spam detector using a dataset from Kaggle
This page will be updated frequently with more information.