- Lecturer: Justin Sirignano
Course term: Hilary
Course lecture information: 16 lectures
Course overview:
This course provides an introduction to deep learning, covering topics such as fully-connected networks, convolutional neural networks, residual networks, recurrent neural networks (e.g., LSTMs), generative adversarial networks, and deep reinforcement learning. Optimization methods and distributed training algorithms will also be presented. Students will gain experience in using PyTorch to train deep learning models with GPUs. The lectures will also cover the mathematical analysis of neural networks, reinforcement learning, and stochastic gradient descent algorithms.
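As a flavour of the PyTorch workflow mentioned above, the minimal sketch below trains a small fully-connected network with mini-batch stochastic gradient descent, using a GPU when one is available. The architecture, synthetic data, and hyperparameters are illustrative placeholders, not course material.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Use a GPU if one is available, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # A small fully-connected network; the architecture is an illustrative choice.
    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1)).to(device)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()

    # Synthetic regression data standing in for a real dataset.
    X = torch.randn(1024, 10)
    y = X.sum(dim=1, keepdim=True)
    loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

    for epoch in range(5):
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)  # forward pass
            loss.backward()                # backpropagation via automatic differentiation
            optimizer.step()               # mini-batch SGD parameter update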
Course synopsis:
• Fully-connected networks, convolutional networks, residual networks, and recurrent networks (LSTMs, GRUs)
• Backpropagation algorithm, gradient descent, stochastic gradient descent, mini-batch stochastic gradient descent
• Hyperparameter selection and parameter initialization
• Convex versus non-convex optimization
• Optimization methods in deep learning (RMSprop, Adam)
• PyTorch, automatic differentiation, GPU computing (a minimal autograd sketch follows this synopsis)
• Regularization methods (L2 penalty, dropout, ensembles, data augmentation techniques)
• Batch normalization, layer normalization
• Distributed training of models
• Introduction to HPC (e.g., communication via MPI) for distributed training of models
• Generative adversarial networks
• Deep reinforcement learning (policy gradient methods, Q-learning, actor-critic methods)
• Convergence analysis of gradient descent, stochastic gradient descent, and reinforcement learning algorithms
• Global convergence of neural networks trained with gradient descent (Neural Tangent Kernel theory)
• Universal approximation theory (time permitting, a proof will be presented)
• Second-order optimization methods (time permitting)
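To illustrate the automatic differentiation item in the synopsis above, the following minimal sketch computes a gradient with PyTorch autograd and checks it against the analytic gradient; the function being differentiated is an arbitrary example chosen for the sketch.

    import torch

    # Autograd tracks operations on tensors with requires_grad=True and
    # computes gradients by reverse-mode differentiation (backpropagation).
    w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
    x = torch.tensor([0.5, 1.0, -1.5])

    loss = torch.sum((w * x) ** 2)  # a scalar function of w (arbitrary example)
    loss.backward()                 # populates w.grad with dloss/dw

    print(w.grad)                      # gradient from autograd
    print((2 * (w * x) * x).detach())  # analytic gradient, for comparison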