Lecture: Tu/Th 1pm-2:30pm
Location: SEC104
Instructor: Panruo Wu (pwu7@uh.edu)
Reference books:
Reference online resources
As modern hardware increasingly depends on explicit parallelism to deliver performance, modern computer scientists and practitioners need to acquire programming and parallelization skills to develop efficient and scalable systems and applications. General-purpose GPUs (GPGPUs) are the most accessible, powerful, and widely used platforms in several performance-critical scenarios, including large neural network training/inference, advanced cryptography, and scientific/engineering simulation.
This course teaches the fundamentals of GPGPU architecture, the principles of parallel computation, and methods for achieving very high performance on GPUs through programming and performance engineering. The main vehicle for teaching GPU programming is CUDA on NVIDIA GPUs. By the end of the course, students should be able to identify problems that can be parallelized effectively, implement parallel computations on massively parallel systems such as GPUs, and apply techniques for achieving very high performance on GPUs.
This will be a very hands-on course, so get ready to get your hands dirty! An NVIDIA GPU machine will be provided for you to use in this course. The main programming tool is CUDA on NVIDIA GPUs.
This schedule is tentative! It will be adjusted frequently. Last modified: Jan 22, 2025
Week | Tuesday | Thursday | Assignment |
---|---|---|---|
Week 1 | Jan 14: Introduction to GPU and CUDA Lec1: Intro | Jan 16: Data Parallelism & CUDA/C Lec2 Data Parallel | |
Week 2 | Jan 21 UH closed due to weather. No class | Jan 23 Lec3: Grid and Data | |
Week 3 | Jan 28 | Jan 30 Lec4: Architecture and Scheduling | |
Week 4 | Feb 4 Lec5: Memory and Data Locality | Feb 6 | |
Week 5 | Feb 11 Lec6: Performance Considerations | Feb 13 | Feb 16 Assignment1 Due: Matrix Transpose |
Week 6 | Feb 18 Lec7: Reductions | Feb 20 Class cancelled today | |
Week 7 | Feb 25 Lec8: Scan | Feb 27 Lec9: Histogram | |
Week 8 | Mar 4 | Mar 6 Lec10: Merge | Mar 9 Assignment2 Due: Matrix Multiplication |
Week 9 | Mar 10-15 Spring Holiday | ||
Week 10 | Mar 18 Lec11: Sorting | Mar 20 Lec12: Graph Traversal (BFS) | |
Week 11 | Mar 25 Lec13: Sparse Matrix Computation | Mar 27 Lec14: CUDA Dynamic Parallelism | Mar 30 |
Week 12 | Apr 1 Lec15: OpenCL | Apr 3 | Apr 6 Assignment3 Due: Circle Renderer |
Week 13 | Apr 8 Lec16: SYCL | Apr 10 | |
Week 14 | Apr 15 Lec17: Deep Neural Networks | Apr 17 | |
Week 15 | Apr 22 | Apr 24 | |
End of April Semester |