Class hours: MF 8-9 AM, Tu 9-10 AM in KD 103
Name | Swarnendu Biswas |
Email | swarnendu AT cse.iitk.ac.in |
Name | Email |
Srinjoy Sarkar | srinjoy@cse.iitk.ac.in |
Puspesh Kumar Srivastava | puspeshk24@iitk.ac.in |
To obtain good performance, one needs to write correct and scalable parallel programs using programming language abstractions like threads. In addition, the developer needs to be aware of and utilize architecture-specific features like vectorization to extract the full performance potential. This course combines programming language abstractions with architecture-aware development to teach how to write scalable parallel programs. This is not a “programming tips and tricks” course.
There will be 4-6 assignments that apply the concepts learned in class and expose the challenges of extracting performance.
Prerequisites |
|
The course will focus on a subset of the following topics.
I am open to feedback about the course content and presentation. Feel free to provide suggestions for improvements.
Assignments | 40% |
Midsem | 30% |
Endsem | 30% |
Date | Topic | Resources | Recommended Reading |
---|---|---|---|
 | | First Course Handout | |
30/07, 02/08 | Compiler Challenges for Parallel Architectures | Slides | AK 1.1-1.6 |
05/08, 06/08, 09/08 | Cache Memory | Slides | HP APP B.1-B.4, 2.1-2.3; CSAPP 6.2-6.4 |
12/08, 13/08 | Write Cache-Friendly Code | Slides | CSAPP 6.5-6.6; DRAG 11.1-11.2 |
16/08 | PAPI Library | Slides | |
19/08, 20/08, 23/08, 27/08, 30/08 | Cache Coherence and False Sharing | Slides | MCM Chapters 2, 6, 8 (IITK has subscribed to the ebook) |
02/09, 03/09, 06/09 | Dependence Testing | Slides | AK Chapters 2, 3; DRAG 11.6 |
07/09, 09/09, 10/09, 13/09 | Loop Transformations | Slides | AK 5.2-5.4, 5.7.2, 5.9, 6.2.1, 6.2.2, 6.2.5, 6.3.1-6.3.4; AP 4.1, 4.2, 4.5, 5.1-5.6; HP 4.5; Compiler Transformations for High-Performance Computing |
23/09, 24/09, 27/09 | Vectorization | Slides | HP 4.1-4.2; Program Optimization Through Loop Vectorization; Topics in Loop Vectorization |
30/09, 01/10, 04/10, 14/10 | OpenMP | Slides | PP Chapter 5 (IITK has subscribed to the ebook); LLNL OpenMP Tutorial; Introduction to OpenMP - Tim Mattson (Intel); OpenMP Application Programming Interface v5.2; OpenMP Application Programming Interface Examples v5.2 |
15/10, 18/10, 21/10, 22/10, 25/10, 26/10, 28/10, 29/10, 01/11 | GPU Architecture and CUDA Programming | Slides | NVIDIA CUDA C Programming Guide; NVIDIA CUDA C Best Practices Guide; KH Chapters 1-5, 13, 20 (IITK has subscribed to the ebook); HP 4.4 |
02/11, 04/11, 05/11, 08/11, 11/11, 12/11 | Concurrent Data Structures | Slides | MP Chapters 3, 9, 10, 13 (IITK has subscribed to the ebook) |