CS 610: Programming for Performance (Semester 2025-26-I)

Class hours: MF 8-9 AM, Tu 9-10 AM in KD 101

Instructor Information

Name Email
Swarnendu Biswas swarnendu@cse.iitk.ac.in

TA Information

Name Email
Srinjoy Sarkar srinjoys23@cse.iitk.ac.in
Sahil Basia sahilbasia24@cse.iitk.ac.in
Nayan Das nayand24@cse.iitk.ac.in
Sangharsh Nagdevte sangharsh@cse.iitk.ac.in

Course Description

Obtaining good performance requires writing correct and scalable parallel programs using programming language abstractions such as threads. In addition, the developer needs to be aware of, and make use of, architecture-specific features such as vectorization to extract the full performance potential of the hardware. This course combines programming language abstractions with architecture-aware development to teach how to write scalable parallel programs. This is not a "programming tips and tricks" course.

We will have 4-6 assignments to apply the concepts learned in class and to appreciate the challenges in extracting performance.

Prerequisites
  • Exposure to the following courses (or equivalent) is desirable: CS220 (Computer Organization) and CS330 (Operating Systems).
  • Programming maturity with popular programming languages like C, C++, and Java.

Syllabus

The course will focus on a subset of the following topics.

We may add new, drop existing, or reorder topics depending on progress and class feedback. The course may also involve reading and critiquing related research papers.

Policies

Evaluation Scheme

The following is a tentative allocation and might change slightly depending on the strength of the class. Grading is relative.

Assignments 40%
Midsem 30%
Endsem 30%

Academic Integrity

Feedback

I am open to feedback about the course content and presentation. Feel free to provide suggestions for improvements.

Resources

Date | Topic | Resources | Recommended Reading
- | First course handout | FCH | -
01/08, 04/08 | Compiler Challenges for Parallel Architectures | Slides | AK 1.1-1.6
05/08, 08/08, 09/08 | Cache Memory | Slides | HP Appendix B.1-B.4, 2.1-2.3; CSAPP 6.2-6.4
11/08, 12/08 | Write Cache-Friendly Code | Slides | CSAPP 6.5-6.6; DRAG 11.1-11.2
12/08 | PAPI Library | Slides | -
18/08, 19/08, 22/08 | Cache Coherence and False Sharing | Slides | MCM Chapters 2, 6, 8 (IITK has subscribed to the ebook)
23/08, 25/08, 26/08 | Shared-Memory Synchronization | Slides | MP 2.3, 2.4, 2.6, 7.1-7.5, 8.3; SMS 4.1, 4.2, 4.3.1, 6.1
- | Dependence Testing | Slides | AK Chapters 2, 3; DRAG 11.6
- | Loop Transformations | - | AK 5.2-5.4, 5.7.2, 5.9, 6.2.1, 6.2.2, 6.2.5, 6.3.1-6.3.4; AP 4.1, 4.2, 4.5, 5.1-5.6; HP 4.5; Compiler Transformations for High-Performance Computing
- | Vectorization | - | HP 4.1-4.2; Program Optimization Through Loop Vectorization; Topics in Loop Vectorization
- | OpenMP | - | PP Chapter 5 (IITK has subscribed to the ebook); LLNL OpenMP Tutorial; Introduction to OpenMP - Tim Mattson (Intel); OpenMP Application Programming Interface v5.2; OpenMP Application Programming Interface Examples v5.2
- | GPU Architecture and CUDA Programming | - | NVIDIA CUDA C Programming Guide; NVIDIA CUDA C Best Practices Guide; KH Chapters 1-5, 13, 20 (IITK has subscribed to the ebook); HP 4.4
- | Concurrent Data Structures | - | MP Chapters 3, 9, 10, 13 (IITK has subscribed to the ebook)

References

The following are a few popular references, listed in no particular order.
[CSAPP] Computer Systems: A Programmer's Perspective, 3rd edition - R. Bryant and D. O'Hallaron
[DRAG] Compilers: Principles, Techniques, and Tools - A. Aho, M. Lam, R. Sethi, and J. Ullman
[HP] Computer Architecture: A Quantitative Approach, 6th edition - J. Hennessy and D. Patterson
[AK] Optimizing Compilers for Modern Architectures - R. Allen and K. Kennedy
[SMS] Shared-Memory Synchronization, 2nd edition - M. Scott and T. Brown
[AP] Automatic Parallelization: An Overview of Fundamental Compiler Techniques - Samuel P. Midkiff
[PP] An Introduction to Parallel Programming - Peter S. Pacheco
[KH] Programming Massively Parallel Processors: A Hands-on Approach, 3rd edition - David B. Kirk and Wen-mei W. Hwu
[MCM] A Primer on Memory Consistency and Cache Coherence, 2nd edition - Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill and David A. Wood
[MP] The Art of Multiprocessor Programming, 1st edition - Maurice Herlihy and Nir Shavit

We may read and discuss related materials and research papers, which we will announce in class.