The lectures are divided into basic courses and advanced courses.
The basic lectures aim to give learners a thorough understanding of the computer system architecture that supports deep learning, and to teach system design across the full deep learning life cycle through practical problems.
The advanced lectures introduce cutting-edge research on systems and artificial intelligence, covering both AI for Systems and Systems for AI, to help learners find and define meaningful research questions.

| Course No. | Lecture Name | Remarks |
| --- | --- | --- |
| 1 | Introduction | Overview and system/AI basics |
| 2 | System perspective of Systems for AI | Systems for AI: a historic view; fundamentals of neural networks; fundamentals of Systems for AI |
| 3 | Computation frameworks for DNN | Backprop and AD, tensor, DAG, execution graph. Papers and systems: PyTorch, TensorFlow |
| 4 | Computer architecture for matrix computation | Matrix computation, CPU/SIMD, GPGPU, ASIC/TPU. Papers and systems: BLAS, TPU |
| 5 | Distributed training algorithms | Data parallelism, model parallelism, distributed SGD. Papers and systems: PipeDream |
| 6 | Distributed training systems | MPI, parameter servers, all-reduce, RDMA. Papers and systems: Horovod |
| 7 | Scheduling and resource management system | Running DNN jobs on a cluster: container, resource allocation, scheduling. Papers and systems: Kubeflow, OpenPAI, Gandiva, HiveD |
| 8 | Inference systems | Efficiency, latency, throughput, and deployment. Papers and systems: TensorRT, TensorFlow Lite, ONNX |

| Course No. | Lecture Name | Remarks |
| --- | --- | --- |
| 9 | Computation graph compilation and optimization | IR, sub-graph pattern matching, matrix multiplication and memory optimization. Papers and systems: XLA, MLIR, TVM, NNFusion |
| 10 | Efficiency via compression and sparsity | Model compression, sparsity, pruning |
| 11 | AutoML systems | Hyperparameter tuning, NAS. Papers and systems: Hyperband, SMAC, ENAS, AutoKeras, NNI |
| 12 | Reinforcement learning systems | Theory of RL, systems for RL. Papers and systems: A3C, RLlib, AlphaZero |
| 13 | Security and privacy | Federated learning, security, privacy. Papers and systems: DeepFake |
| 14 | AI for systems | AI for traditional systems problems, for system algorithms. Papers and systems: Learned Indexes, Learned query path |